What is Docker

Why do we care?

Scenario 1: You have a Windows machine, but want to learn the open source toolchains on Linux.

Scenario 2: Your paper gets rejected because the reviewer wants a comparison with an existing method, but the software for that method only runs on Linux.

Scenario 3: You develop a piece of software. You want to debug/test it on different versions of R and on different operating systems (macOS, Linux, Windows).

Scenario 4: Reproducible research. Hardware and software evolve fast. Simulation results in research papers are often hard to reproduce because the computing environment changes. We can use Docker to containerize a simulation experiment (with specific versions of the OS and software) so that the same results can be reproduced at any point in the future.

Learning objectives

Tutorial

We will follow the tutorial Get started with Docker to:

Installation

Download and install the Docker CE (Community Edition) on your computer.

Part 2: containerize a web app

We use Docker to develop an app that serves a webpage.

Essential components

  • requirements.txt lists the Python packages that the app depends on.
cat requirements.txt
## Flask
## Redis
  • app.py is the Python code for serving a webpage.
cat app.py
## from flask import Flask
## from redis import Redis, RedisError
## import os
## import socket
## 
## # Connect to Redis
## redis = Redis(host="redis", db=0, socket_connect_timeout=2, socket_timeout=2)
## 
## app = Flask(__name__)
## 
## @app.route("/")
## def hello():
##     try:
##         visits = redis.incr("counter")
##     except RedisError:
##         visits = "<i>cannot connect to Redis, counter disabled</i>"
## 
##     html = "<h3>Hello {name}!</h3>" \
##            "<b>Hostname:</b> {hostname}<br/>" \
##            "<b>Visits:</b> {visits}"
##     return html.format(name=os.getenv("NAME", "world"), hostname=socket.gethostname(), visits=visits)
## 
## if __name__ == "__main__":
##     app.run(host='0.0.0.0', port=80)
  • Dockerfile instructs Docker how to put things together in a container:
cat Dockerfile
## # Use an official Python runtime as a parent image
## FROM python:2.7-slim
## 
## # Set the working directory to /app
## WORKDIR /app
## 
## # Copy the current directory contents into the container at /app
## ADD . /app
## 
## # Install any needed packages specified in requirements.txt
## RUN pip install --trusted-host pypi.python.org -r requirements.txt
## 
## # Make port 80 available to the world outside this container
## EXPOSE 80
## 
## # Define environment variable
## ENV NAME World
## 
## # Run app.py when the container launches
## CMD ["python", "app.py"]

See python on Docker Hub for details on the python:2.7-slim image.

See Dockerfile reference for commands in Dockerfile.
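The Dockerfile and app.py meet at the NAME variable: the Dockerfile sets it with ENV NAME World, and app.py reads it with os.getenv("NAME", "world"). The following is a minimal sketch of that lookup. The helper name greeting and the idea of passing a plain dict in place of the real environment are assumptions made for illustration; they are not part of the tutorial code.

```python
import os


def greeting(env=None):
    """Build the greeting line the way app.py does, reading NAME with a fallback."""
    # `env` stands in for os.environ so the lookup can be tried without a container.
    env = os.environ if env is None else env
    name = env.get("NAME", "world")
    return "<h3>Hello {name}!</h3>".format(name=name)


# Without NAME set (for example, running app.py outside the image), the fallback is used.
print(greeting({}))
# Inside the image, `ENV NAME World` has already set the variable.
print(greeting({"NAME": "World"}))
```

Because the value comes from the environment, it can be changed when the container starts, without rebuilding the image, by passing -e NAME=... to docker run.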

Build the app

Build the image:

docker build -t friendlyhello .
## Sending build context to Docker daemon  3.088MB


## Step 1/7 : FROM python:2.7-slim
## 2.7-slim: Pulling from library/python
## a5a6f2f73cd8: Pulling fs layer
## 8da2a74f37b1: Pulling fs layer
## 09b6f498cfd0: Pulling fs layer
## f0afb4f0a079: Pulling fs layer
## f0afb4f0a079: Waiting
## 8da2a74f37b1: Retrying in 5 seconds
## 8da2a74f37b1: Retrying in 4 seconds
## 8da2a74f37b1: Retrying in 3 seconds
## 8da2a74f37b1: Retrying in 2 seconds
## 8da2a74f37b1: Retrying in 1 second
## 8da2a74f37b1: Retrying in 10 seconds
## 8da2a74f37b1: Retrying in 9 seconds
## 8da2a74f37b1: Retrying in 8 seconds
## 8da2a74f37b1: Retrying in 7 seconds
## 8da2a74f37b1: Retrying in 6 seconds
## 8da2a74f37b1: Retrying in 5 seconds
## 8da2a74f37b1: Retrying in 4 seconds
## 8da2a74f37b1: Retrying in 3 seconds
## 8da2a74f37b1: Retrying in 2 seconds
## 8da2a74f37b1: Retrying in 1 second
## 8da2a74f37b1: Download complete
## f0afb4f0a079: Verifying Checksum
## f0afb4f0a079: Download complete
## 09b6f498cfd0: Verifying Checksum
## 09b6f498cfd0: Download complete
## a5a6f2f73cd8: Verifying Checksum
## a5a6f2f73cd8: Download complete
## a5a6f2f73cd8: Pull complete
## 8da2a74f37b1: Pull complete
## 09b6f498cfd0: Pull complete
## f0afb4f0a079: Pull complete
## Digest: sha256:f82db224fbc9ff3309b7b62496e19d673738a568891604a12312e237e01ef147
## Status: Downloaded newer image for python:2.7-slim
##  ---> 0dc3d8d47241
## Step 2/7 : WORKDIR /app
##  ---> Running in aeb74fe93006
## Removing intermediate container aeb74fe93006
##  ---> 10d2bf4dab22
## Step 3/7 : ADD . /app
##  ---> 35fea18ae8fa
## Step 4/7 : RUN pip install --trusted-host pypi.python.org -r requirements.txt
##  ---> Running in 4d24a79d46b6
## Collecting Flask (from -r requirements.txt (line 1))
##   Downloading https://files.pythonhosted.org/packages/7f/e7/08578774ed4536d3242b14dacb4696386634607af824ea997202cd0edb4b/Flask-1.0.2-py2.py3-none-any.whl (91kB)
## Collecting Redis (from -r requirements.txt (line 2))
##   Downloading https://files.pythonhosted.org/packages/f5/00/5253aff5e747faf10d8ceb35fb5569b848cde2fdc13685d42fcf63118bbc/redis-3.0.1-py2.py3-none-any.whl (61kB)
## Collecting itsdangerous>=0.24 (from Flask->-r requirements.txt (line 1))
##   Downloading https://files.pythonhosted.org/packages/76/ae/44b03b253d6fade317f32c24d100b3b35c2239807046a4c953c7b89fa49e/itsdangerous-1.1.0-py2.py3-none-any.whl
## Collecting Jinja2>=2.10 (from Flask->-r requirements.txt (line 1))
##   Downloading https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl (126kB)
## Collecting Werkzeug>=0.14 (from Flask->-r requirements.txt (line 1))
##   Downloading https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
## Collecting click>=5.1 (from Flask->-r requirements.txt (line 1))
##   Downloading https://files.pythonhosted.org/packages/fa/37/45185cb5abbc30d7257104c434fe0b07e5a195a6847506c074527aa599ec/Click-7.0-py2.py3-none-any.whl (81kB)
## Collecting MarkupSafe>=0.23 (from Jinja2>=2.10->Flask->-r requirements.txt (line 1))
##   Downloading https://files.pythonhosted.org/packages/bc/3a/6bfd7b4b202fa33bdda8e4e3d3acc719f381fd730f9a0e7c5f34e845bd4d/MarkupSafe-1.1.0-cp27-cp27mu-manylinux1_x86_64.whl
## Installing collected packages: itsdangerous, MarkupSafe, Jinja2, Werkzeug, click, Flask, Redis
## Successfully installed Flask-1.0.2 Jinja2-2.10 MarkupSafe-1.1.0 Redis-3.0.1 Werkzeug-0.14.1 click-7.0 itsdangerous-1.1.0
## Removing intermediate container 4d24a79d46b6
##  ---> 3a636718492a
## Step 5/7 : EXPOSE 80
##  ---> Running in af2e4232eabb
## Removing intermediate container af2e4232eabb
##  ---> 05e465feba3c
## Step 6/7 : ENV NAME World
##  ---> Running in a1d819170876
## Removing intermediate container a1d819170876
##  ---> ae3e86762f01
## Step 7/7 : CMD ["python", "app.py"]
##  ---> Running in 2fd6bcee008a
## Removing intermediate container 2fd6bcee008a
##  ---> ae4258770936
## Successfully built ae4258770936
## Successfully tagged friendlyhello:latest

Display the image:

docker image ls
## REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
## friendlyhello       latest              ae4258770936        1 second ago        134MB
## python              2.7-slim            0dc3d8d47241        10 days ago         120MB

Run the app

Run the app by

docker run -p 4000:80 friendlyhello

or in detached mode

docker run -d -p 4000:80 friendlyhello
## 7e6fbb17575ddcaeabc7797f087dd0988ccb24e1a1291e67a92d25935b73cc48

-p 4000:80 maps port 80 of the container to port 4000 of the host.
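The order of the two numbers is easy to mix up: the host port comes first, the container port second. As a small sketch, the hypothetical helper parse_publish below splits the simple HOST_PORT:CONTAINER_PORT form used here. It deliberately ignores the longer forms that docker run also accepts (such as those with an IP address or a protocol suffix).

```python
def parse_publish(spec):
    """Split a 'HOST_PORT:CONTAINER_PORT' string into two integers."""
    host_port, container_port = spec.split(":")
    return int(host_port), int(container_port)


host_port, container_port = parse_publish("4000:80")
print("browse to port", host_port, "on the host;",
      "the app listens on port", container_port, "inside the container")
```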

Display the container:

docker container ls
## CONTAINER ID        IMAGE               COMMAND             CREATED                  STATUS                  PORTS                  NAMES
## 7e6fbb17575d        friendlyhello       "python app.py"     Less than a second ago   Up Less than a second   0.0.0.0:4000->80/tcp   amazing_jackson

We should now be able to view the webpage by pointing a browser to $HOSTIP:4000.

To stop the container, issue:

docker container stop <CONTAINER_ID>

To kill all running containers

docker container kill $(docker container ls -q)
## 7e6fbb17575d

then remove them

docker container rm $(docker container ls -a -q)
## 7e6fbb17575d

Share the image

To demonstrate the portability of what we just created, let’s upload our built image and run it somewhere else. After all, you need to know how to push to registries when you want to deploy containers to production.

A registry is a collection of repositories, and a repository is a collection of images—sort of like a GitHub repository, except the code is already built. An account on a registry can create many repositories.

We use Docker’s public registry because it’s free and pre-configured (the docker CLI uses it by default).

  • Log in with your Docker ID: Sign up for one at hub.docker.com. Make note of your username.

    docker login

  • Tag the friendlyhello image:

    docker tag friendlyhello $DOCKERID/get-started:part2
  • Upload the tagged image to registry:

    docker push $DOCKERID/get-started:part2
  • Now the image is up on the Docker Hub registry. We can run the image from the registry on any machine with Docker installed:

    docker run -d -p 4000:80 $DOCKERID/get-started:part2
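The names used in the steps above follow the pattern username/repository:tag. The sketch below takes such a name apart so that the three pieces are visible. The function split_image_reference is a hypothetical helper, and it only handles the simple forms that appear in these notes.

```python
def split_image_reference(ref):
    """Split 'namespace/repository:tag' into (namespace, repository, tag)."""
    # A ':' only marks a tag if it comes after the last '/'.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        name, tag = ref[:colon], ref[colon + 1:]
    else:
        name, tag = ref, "latest"  # the tag defaults to 'latest' when none is given
    if "/" in name:
        namespace, repository = name.rsplit("/", 1)
    else:
        namespace, repository = None, name  # no account prefix in the name
    return namespace, repository, tag


print(split_image_reference("rickdeckard/get-started:part2"))
print(split_image_reference("python:2.7-slim"))
print(split_image_reference("friendlyhello"))
```

The last call shows why the build output earlier reported friendlyhello:latest even though no tag was given.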

Part 3: run replicas of a container as a service

Services are really just “containers in production.” A service only runs one image, but it codifies the way that image runs—what ports it should use, how many replicas of the container should run so the service has the capacity it needs, and so on.

The following docker-compose.yml specifies:

  • Pull the image $DOCKERID/get-started:part2.

  • Run 5 instances of that image as a service called web, limiting each one to use, at most, 10% of the CPU (across all cores), and 50MB of RAM.

  • Immediately restart containers if one fails.

  • Map port 4000 on the host to web’s port 80.

  • Instruct web’s containers to share port 80 via a load-balanced network called webnet. (Internally, the containers themselves publish to web’s port 80 at an ephemeral port.)

  • Define the webnet network with the default settings (which is a load-balanced overlay network).

cat docker-compose.yml
## version: "3"
## services:
##   web:
##     # replace username/repo:tag with your name and image details
##     image: rickdeckard/get-started:part2
##     deploy:
##       replicas: 5
##       resources:
##         limits:
##           cpus: "0.1"
##           memory: 50M
##       restart_policy:
##         condition: on-failure
##     ports:
##       - "4000:80"
##     networks:
##       - webnet
## networks:
##   webnet:

See Docker Compose reference for commands in Docker Compose.
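Since the limits are set per container, the ceiling for the whole service is the per-container limit multiplied by the number of replicas. The arithmetic can be sketched as follows. The values are copied by hand from the file above, and the helper name service_ceiling is made up for this example.

```python
from decimal import Decimal

# Values copied from the deploy section of docker-compose.yml.
replicas = 5
cpu_limit = Decimal("0.1")   # the cpus: "0.1" limit, per container
memory_limit_mb = 50         # the memory: 50M limit, per container


def service_ceiling(replicas, cpu_limit, memory_limit_mb):
    """Return the most CPU and memory that all replicas together may use."""
    return replicas * cpu_limit, replicas * memory_limit_mb


total_cpu, total_memory_mb = service_ceiling(replicas, cpu_limit, memory_limit_mb)
print("CPU ceiling (in units of the cpus setting):", total_cpu)
print("memory ceiling (MB):", total_memory_mb)
```

Decimal is used instead of float only to keep the printed CPU figure exact.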

Run a new load-balanced app in swarm mode:

docker swarm init
docker stack deploy -c docker-compose.yml getstartedlab
## Swarm initialized: current node (okb265o5lhmv0wbyl4mc0o6fr) is now a manager.
## 
## To add a worker to this swarm, run the following command:
## 
##     docker swarm join --token SWMTKN-1-4to493f3o0wt22fed12wim5xht0mwi1erau70g8v5enhqh8ifu-d8lou9ohsyf5yx1ttfqy54lka 192.168.65.3:2377
## 
## To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
## 
## Creating network getstartedlab_webnet
## Creating service getstartedlab_web

A swarm is a group of machines that are running Docker and joined into a cluster. After that has happened, you continue to run the Docker commands you’re used to, but now they are executed on a cluster by a swarm manager. The machines in a swarm can be physical or virtual. After joining a swarm, they are referred to as nodes.

In this example, we run swarm on a single local machine.

List the service:

docker service ls
## ID                  NAME                MODE                REPLICAS            IMAGE                           PORTS
## yaei9993msa8        getstartedlab_web   replicated          0/5                 rickdeckard/get-started:part2   *:4000->80/tcp

List the tasks for your service:

docker service ps getstartedlab_web
## ID                  NAME                  IMAGE                           NODE                    DESIRED STATE       CURRENT STATE                      ERROR               PORTS
## 6yqjeymqzufw        getstartedlab_web.1   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Accepted less than a second ago                        
## ho5zq5blykj5        getstartedlab_web.2   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Accepted less than a second ago                        
## eevf6y5lyzl7        getstartedlab_web.3   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Accepted less than a second ago                        
## x4hqws736n7y        getstartedlab_web.4   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing less than a second ago                       
## v1nwoj50l08a        getstartedlab_web.5   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing less than a second ago
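The names in the output above follow a pattern: the stack name prefixes the service name, and each task adds a numbered slot. The sketch below reproduces that pattern as it appears in this output. It is inferred from the listing rather than taken from the Docker documentation, and both helper names are hypothetical.

```python
def service_name(stack, service):
    """Name of a service deployed as part of a stack, e.g. getstartedlab_web."""
    return "{}_{}".format(stack, service)


def task_names(stack, service, replicas):
    """Task names as listed by `docker service ps`, one per replica."""
    base = service_name(stack, service)
    return ["{}.{}".format(base, slot) for slot in range(1, replicas + 1)]


for name in task_names("getstartedlab", "web", 5):
    print(name)
```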

To take down the service and swarm:

docker stack rm getstartedlab
docker swarm leave --force
## Removing service getstartedlab_web
## Removing network getstartedlab_webnet
## Node left the swarm.

Part 5: run interrelated services as a stack

A stack is a group of interrelated services that share dependencies, and can be orchestrated and scaled together. A single stack is capable of defining and coordinating the functionality of an entire application.

Here we want to add two more services: a visualizer for displaying the services and a Redis database for counting webpage visits.

The only thing we need to do is to update the docker-compose.yml file. Let’s name the new file docker-compose-stack.yml to avoid confusion:

cat docker-compose-stack.yml
## version: "3"
## services:
##   web:
##     # replace username/repo:tag with your name and image details
##     image: rickdeckard/get-started:part2
##     deploy:
##       replicas: 5
##       restart_policy:
##         condition: on-failure
##       resources:
##         limits:
##           cpus: "0.1"
##           memory: 50M
##     ports:
##       - "4000:80"
##     networks:
##       - webnet
##   visualizer:
##     image: dockersamples/visualizer:stable
##     ports:
##       - "8080:8080"
##     volumes:
##       - "/var/run/docker.sock:/var/run/docker.sock"
##     deploy:
##       placement:
##         constraints: [node.role == manager]
##     networks:
##       - webnet
## networks:
##   webnet:

Then deploy

docker swarm init
## Swarm initialized: current node (pwe1v4t6gsovme399ab1743x1) is now a manager.
## 
## To add a worker to this swarm, run the following command:
## 
##     docker swarm join --token SWMTKN-1-4up6644ocyp00ib03b1ncojwwhlys96tgfn4wlocheoivhv098-cknnqkg3xvpq1pab4fiba715t 192.168.65.3:2377
## 
## To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
docker stack deploy -c docker-compose-stack.yml getstartedlab2
## Creating network getstartedlab2_webnet
## Creating service getstartedlab2_web
## Creating service getstartedlab2_visualizer

List the services:

docker service ls
## ID                  NAME                        MODE                REPLICAS            IMAGE                             PORTS
## pcldwvfx6cfl        getstartedlab2_visualizer   replicated          0/1                 dockersamples/visualizer:stable   *:8080->8080/tcp
## pbhmp6dh115k        getstartedlab2_web          replicated          0/5                 rickdeckard/get-started:part2     *:4000->80/tcp

List the tasks for your service:

docker service ps getstartedlab2_web
## ID                  NAME                   IMAGE                           NODE                    DESIRED STATE       CURRENT STATE              ERROR               PORTS
## wl93nk1e9ij8        getstartedlab2_web.1   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing 15 seconds ago                       
## ou1ul67qsxph        getstartedlab2_web.2   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing 15 seconds ago                       
## ipm0urflf6po        getstartedlab2_web.3   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing 15 seconds ago                       
## hpbjkogv2iyo        getstartedlab2_web.4   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing 15 seconds ago                       
## crlt51bgv3z6        getstartedlab2_web.5   rickdeckard/get-started:part2   linuxkit-025000000001   Running             Preparing 15 seconds ago

Now we can check the new webpage at $HOSTIP:4000

and the visualizer at $HOSTIP:8080 in browser.

Upgrade your service: persist the data

  1. Modify docker-compose-stack.yml as:
cat docker-compose-stack.yml
## version: "3"
## services:
##   web:
##     # replace username/repo:tag with your name and image details
##     image: rickdeckard/get-started:part2
##     deploy:
##       replicas: 5
##       restart_policy:
##         condition: on-failure
##       resources:
##         limits:
##           cpus: "0.1"
##           memory: 50M
##     ports:
##       - "4000:80"
##     networks:
##       - webnet
##   visualizer:
##     image: dockersamples/visualizer:stable
##     ports:
##       - "8080:8080"
##     volumes:
##       - "/var/run/docker.sock:/var/run/docker.sock"
##     deploy:
##       placement:
##         constraints: [node.role == manager]
##     networks:
##       - webnet
##   redis:
##     image: redis
##     ports:
##       - "6379:6379"
##     volumes:
##       - "./data:/data"
##     deploy:
##       placement:
##         constraints: [node.role == manager]
##     command: redis-server --appendonly yes
##     networks:
##       - webnet
## networks:
##   webnet:
  • Redis has an official image in the Docker library and has been granted the short image name of just redis.
  • redis always runs on the manager, so it always uses the same filesystem.
  • redis accesses a directory in the host’s file system as /data inside the container, which is where Redis stores data.
  • The volume lets the container access ./data (on the host) as /data (inside the Redis container). While containers come and go, the files stored in ./data on the specified host persist, enabling continuity.
  1. Create a ./data subdirectory:
mkdir ./data
  1. Run docker stack deploy one more time.
docker stack deploy -c docker-compose-stack.yml getstartedlab2
## Updating service getstartedlab2_web (id: pbhmp6dh115krteyr28ti4cj2)
## Updating service getstartedlab2_visualizer (id: pcldwvfx6cfl3qiqzi9nmb5wb)
## Creating service getstartedlab2_redis
  1. Verify that the three services are running as expected.
docker service ls
## ID                  NAME                        MODE                REPLICAS            IMAGE                             PORTS
## 9ecs29rq2tv7        getstartedlab2_redis        replicated          0/1                 redis:latest                      *:6379->6379/tcp
## pcldwvfx6cfl        getstartedlab2_visualizer   replicated          0/1                 dockersamples/visualizer:stable   *:8080->8080/tcp
## pbhmp6dh115k        getstartedlab2_web          replicated          0/5                 rickdeckard/get-started:part2     *:4000->80/tcp
  1. Check the web page at one of your nodes, and take a look at the results of the visitor counter, which is now live and storing information on Redis.
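The reason the counter survives a redeploy is that its state lives in ./data on the host, not inside the container. The sketch below is only an analogy in plain Python, not Docker or Redis code: a hypothetical FileCounter class keeps its count in a directory, so a second object pointed at the same directory picks up where the first left off.

```python
import os
import tempfile


class FileCounter:
    """A visit counter whose state lives in a directory, not in the object."""

    def __init__(self, data_dir):
        self.path = os.path.join(data_dir, "counter.txt")

    def incr(self):
        """Read the stored count, add one, write it back, and return it."""
        count = 0
        if os.path.exists(self.path):
            with open(self.path) as f:
                count = int(f.read())
        count += 1
        with open(self.path, "w") as f:
            f.write(str(count))
        return count


# The temporary directory plays the role of ./data on the host.
with tempfile.TemporaryDirectory() as data_dir:
    first = FileCounter(data_dir)    # the first "container"
    first.incr()
    first.incr()
    del first                        # the "container" goes away
    second = FileCounter(data_dir)   # a replacement uses the same directory
    visits_after_restart = second.incr()
    print(visits_after_restart)      # prints 3
```

Redis writes its own file format under /data; the text file here is just the simplest stand-in for state that outlives the process using it.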

Cleanup

To take down the service and swarm:

docker stack rm getstartedlab2
docker swarm leave --force
## Removing service getstartedlab2_redis
## Removing service getstartedlab2_visualizer
## Removing service getstartedlab2_web
## Removing network getstartedlab2_webnet
## Node left the swarm.

Deploy a stack to GCP

Option 1: Create a container-optimized instance in GCP Compute Engine.

Option 2: On any GCP instance, install Docker and run a container.

# install yum-utils (which provides yum-config-manager) and storage dependencies
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# add Docker CE repo for CentOS
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# install Docker CE
sudo yum install docker-ce
  • Run docker:
sudo systemctl start docker
sudo docker run hello-world
  • Run a container:
sudo docker run -d -p 4000:80 rickdeckard/get-started:part2
  • To run the web service, copy docker-compose.yml to the server and run
sudo docker swarm init
sudo docker stack deploy -c docker-compose.yml getstartedlab
  • To run the stack with web+visualizer+redis, copy docker-compose-stack.yml to the server and run
sudo docker swarm init
sudo docker stack deploy -c docker-compose-stack.yml getstartedlab
  • Don’t forget to open the ports used by the containers in the VM firewall. In the GCP console, go to VPC network, then Firewall rules, and create rules for the web server (tcp: 4000), visualizer (tcp: 8080), and Redis (tcp: 6379). Apply those rules to your VM instance.

  • To take down the service and swarm:

sudo docker stack rm getstartedlab
sudo docker swarm leave --force
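While the stack is still running, one way to confirm that the firewall rules took effect is to try a TCP connection to each published port from another machine. The following is a minimal sketch using only the Python standard library. The function name port_is_open is made up for this example, and 127.0.0.1 is a placeholder that should be replaced with the external IP of the VM.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Ports published by the stack: web server, visualizer, and Redis.
HOST = "127.0.0.1"  # placeholder; use the VM's external IP instead
for port in (4000, 8080, 6379):
    state = "open" if port_is_open(HOST, port, timeout=1.0) else "closed or filtered"
    print(port, state)
```

A successful connection only shows that something is listening and the firewall lets traffic through; it does not check that the service behind the port is healthy.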

Multi-container, multi-machine applications

See part 4 of the tutorial.

Stack: heterogeneous containers, multi-machine applications

See part 5 of the tutorial.

Deploy stack to AWS/Azure

See part 6 of the tutorial.

Scenario: Run a Linux container interactively

Run CentOS interactively (as root):

docker run -ti --rm centos:latest

-i means interactive. -t allocates a pseudo-tty. --rm removes the container when it exits.

Run Ubuntu interactively (as root):

docker run -ti --rm ubuntu:latest

Scenario: Run Linux+R on your macOS/Windows laptop

docker run -ti --rm -v ~/Desktop:/Desktop r-base

It downloads (pulls) and runs a Docker image called r-base (Debian + R). -v maps a folder on the host to a folder in the container.

docker run -ti --rm r-base /usr/bin/bash
docker run -ti --rm -v "$PWD":/home/docker -w /home/docker -u docker r-base Rscript autoSim.R

-w specifies the working directory. -u specifies the user.
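For repeated simulation runs it can be convenient to assemble this command in a script. The sketch below builds the same argument list in Python and prints it; it does not start Docker itself. The helper name docker_run_command is hypothetical, and shlex.join assumes Python 3.8 or later.

```python
import os
import shlex


def docker_run_command(image, command, workdir="/home/docker", user="docker"):
    """Assemble the argument list for the `docker run` call shown above."""
    return [
        "docker", "run", "-ti", "--rm",
        "-v", "{}:{}".format(os.getcwd(), workdir),  # mount the current directory
        "-w", workdir,                               # start in the mounted directory
        "-u", user,                                  # run as this user
        image,
    ] + list(command)


argv = docker_run_command("r-base", ["Rscript", "autoSim.R"])
print(shlex.join(argv))
# To launch it (requires Docker to be installed and running):
#   import subprocess; subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the current directory contains spaces. When the script is run without a terminal attached (for example from a scheduler), the -ti flags may need to be dropped.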

Recap: docker survival commands

Part 1:

## List Docker CLI commands
docker
docker container --help

## Display Docker version and info
docker --version
docker version
docker info

## Execute Docker image
docker run hello-world

## List Docker images
docker image ls

## List Docker containers (running, all, all in quiet mode)
docker container ls
docker container ls --all
docker container ls -a -q

Part 2:

docker build -t friendlyhello .  # Create image using this directory's Dockerfile
docker run -p 4000:80 friendlyhello  # Run "friendlyhello" mapping port 4000 to 80
docker run -d -p 4000:80 friendlyhello         # Same thing, but in detached mode
docker container ls                                # List all running containers
docker container ls -a             # List all containers, even those not running
docker container stop <hash>           # Gracefully stop the specified container
docker container kill <hash>         # Force shutdown of the specified container
docker container rm <hash>        # Remove specified container from this machine
docker container rm $(docker container ls -a -q)         # Remove all containers
docker image ls -a                             # List all images on this machine
docker image rm <image id>            # Remove specified image from this machine
docker image rm $(docker image ls -a -q)   # Remove all images from this machine
docker login             # Log in this CLI session using your Docker credentials
docker tag <image> username/repository:tag  # Tag <image> for upload to registry
docker push username/repository:tag            # Upload tagged image to registry
docker run username/repository:tag                   # Run image from a registry

Part 3:

docker stack ls                                            # List stacks or apps
docker stack deploy -c <composefile> <appname>  # Run the specified Compose file
docker service ls                 # List running services associated with an app
docker service ps <service>                  # List tasks associated with an app
docker inspect <task or container>                   # Inspect task or container
docker container ls -q                                      # List container IDs
docker stack rm <appname>                             # Tear down an application
docker swarm leave --force      # Take down a single node swarm from the manager