Basti’s Buggy Blog

Poking at software.

Setting up Docker in Docker (dind/socket) with GitLab Runners

GitLab’s Continuous Integration / Delivery (CI/CD) workflow uses GitLab Runners, which are registered with the GitLab instance to execute individual CI/CD jobs. Depending on the Runner configuration, jobs are executed in different environments: whereas the Shell Runner executes commands directly on the runner host, the Docker Runner spawns a new Docker container for each job. This post describes how to set up a Docker Runner that is able to build and run Docker images with bind-mounts.

TL;DR

This GitLab Docker Runner configuration should work for most projects. It uses a socket passthrough and a bind-mount to the /builds directory.

[[runners]]
  name = "docker-socket"
  url = "https://<your gitlab server>"
  token = "<your runner token>"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "docker:edge-git"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    # bind-mount the /builds directory to the host + pass through host Docker socket
    volumes = ["/builds:/builds", "/cache", "/var/run/docker.sock:/var/run/docker.sock"]
    shm_size = 0

dind vs Socket Passthrough

If you want to interact with Docker containers inside another Docker container, you basically have two options: dind and socket passthrough. If you already know what the difference is, skip ahead to the next section.

Socket Passthrough

To understand the socket passthrough method, it is important to distinguish between two separate applications: the Docker daemon and the Docker CLI. Whenever you use the Docker CLI, your commands are forwarded to the Docker daemon. On your local development machine, this daemon will likely run on the same machine the CLI is called from. Because no network connection needs to be traversed in this local setup, the UNIX socket /var/run/docker.sock is used for communication between the daemon and the CLI.

You might have wondered why it is necessary to set the “context” in the docker build -f Dockerfile ./context command. Since the Docker daemon, which is responsible for building the image, might run on a different machine which does not have access to the local file system, all the files required for building the Docker image have to be sent to the remote host.

The Docker daemon supports a multitude of interfaces (e.g. TCP, SSH, …) for receiving commands via the network. If the DOCKER_HOST environment variable is set, the CLI will forward all commands to the specified remote Docker daemon.
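
As a rough illustration of that rule, here is a small shell sketch that prints the endpoint the CLI would contact. It is a simplification: docker_endpoint is a hypothetical helper, and it assumes that neither the -H flag nor a Docker context overrides the choice.

```shell
# Simplified sketch: print the endpoint the Docker CLI would talk to.
# Assumption: no `-H` flag and no Docker context is in play, so only the
# DOCKER_HOST variable and the default local UNIX socket matter.
docker_endpoint() {
  printf '%s\n' "${DOCKER_HOST:-unix:///var/run/docker.sock}"
}
```

With DOCKER_HOST unset, the function prints the local socket; after export DOCKER_HOST=tcp://docker:2375 it prints the remote address instead.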

The idea of socket passthrough is simple: make the host’s Docker daemon accessible to the Docker CLI inside a container by bind-mounting the docker.sock UNIX socket with -v/var/run/docker.sock:/var/run/docker.sock. The Docker CLI inside the container will send commands to the container-local UNIX socket /var/run/docker.sock, just like it would on the host system. Thanks to the bind mount, this socket inside the container refers to the socket on the host system. Since containers basically™ are isolated processes, the socket can be accessed locally without any network stack being involved.

Bind-mounting the socket forwards all Docker commands to the Docker daemon running on the host. All Docker commands will be relative to the host system, which means that bind-mounts will always refer to locations on the host, not to locations in the container. Sharing a directory in the container file-system, which is not available on the host, is not possible through bind-mounts.

Docker in Docker (dind)

Unlike the passthrough method, dind actually creates a separate Docker daemon inside a Docker container. Since a Docker daemon needs permission to do all the process isolation goodness (cgroups, …), the container with the Docker daemon must be launched with the --privileged flag.

A dind setup usually consists of two containers: the first container contains all the usual Docker CLI utilities (CLI-container) and the second container actually runs the Docker daemon (daemon-container). The Docker images docker:20.10.16 (CLI-container) and docker:20.10.16-dind (daemon-container) fit those requirements exactly (hint: they must have the same version; do not use :latest or :stable). Even better, they are configured to work in conjunction! The docker:20.10.16 image automatically sets the DOCKER_HOST environment variable to tcp://docker:2375:

sh@dsh ~ docker run --rm docker:20.10.16 env | grep -i docker_host
DOCKER_HOST=tcp://docker:2375

This means that any Docker command executed inside the CLI-container will be forwarded to whatever host is known in the network as docker. That’s where the daemon-container comes into play: We start it with the name docker and put both containers in the same network. Any Docker CLI command is forwarded to the daemon-container!

sh@dsh ~ docker network create dind-test
f485ca0c22f0ab2ab8210976caa344ebf7cf30a734baeceea84c0e3e3c75ea87
sh@dsh ~ docker run --privileged -d --rm\
>   --name docker\
>   --network dind-test\
>   docker:20.10.16-dind dockerd --host=tcp://0.0.0.0:2375
1b16b379a2d215791db3bd1029595e50333dddaf4768add11d8dbd15de5c397e
sh@dsh ~ docker run --rm\
>   --network=dind-test\
>   docker:20.10.16 docker run --rm hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
2db29710123e: Pulling fs layer
2db29710123e: Download complete
2db29710123e: Pull complete
Digest: sha256:80f31da1ac7b312ba29d65080fddf797dd76acfb870e677f390d5acba9741b17
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

...

I cheated by skipping TLS certificates: passing the dockerd --host=tcp://0.0.0.0:2375 arguments makes the daemon listen on an unencrypted TCP port (which I think is OK for a local setup). If you are serious about deploying a dind setup, read the documentation for TLS certificates, which allow the CLI-container and the daemon-container to mutually authenticate.

Configuring GitLab Runners

Now that you know about the differences between a socket passthrough and dind, let’s move on to the GitLab Runner configuration. You can set up a GitLab Runner with either method; the former is easier and contains fewer pitfalls.

The Scenario

To illustrate the differences, let’s assume the following scenario: You are in the process of building your next project, which will be abandoned in two months (you know it). Nevertheless, you want to feel good about yourself and set up a CI/CD pipeline to build and test your application. Naturally you resort to using Docker to keep everything nice and contained (heh). /s

Since your app stores all its data at /app/data, integration tests are easy. By mounting /app/data to a directory in your git repository containing test data (e.g. ./test/int-00), you can quickly test your application in different states. To test your application, all you have to do is execute one command:

sh@dsh ~ docker run --rm -v$(pwd)/test/int-00:/app/data my-app npm test

> my-app@1.4.2 test
> jest

 PASS  src/model/Bananas.test.ts
 PASS  src/model/Apples.test.ts

Test Suites: 2 passed, 2 total
Tests:       10 passed, 10 total
Snapshots:   0 total
Time:        1.818 s
Ran all test suites.

The goal is to automate this process using a GitLab CI Docker Runner. Therefore you register a GitLab Docker Runner and connect it to your project with the gitlab-runner register command. After that you create a simple .gitlab-ci.yml file to automatically run the tests and push the Docker image to your registry.

image: docker:20.10.16

variables:
  DOCKER_REGISTRY: my-docker-registry.com
  DOCKER_REGISTRY_USER: gitlab
  DOCKER_IMAGE: ${DOCKER_REGISTRY}/my-app

before_script:
  - docker login -u ${DOCKER_REGISTRY_USER} -p ${CI_BUILD_TOKEN} ${DOCKER_REGISTRY}

build_and_test:
  script:
    - docker build -t ${DOCKER_IMAGE} -f Dockerfile .
    - docker run --rm -v$(pwd)/test/int-00:/app/data ${DOCKER_IMAGE} npm test
    - docker push ${DOCKER_IMAGE}

The only thing left to do is to tweak the GitLab Runner configuration (/etc/gitlab-runner/config.toml) so that it supports running Docker commands.

If you try to run the CI pipeline before tweaking the Runner configuration, you will get the error "Cannot connect to the Docker daemon at tcp://docker:2375. Is the docker daemon running?". The Runner container only contains the Docker CLI tools, not the Docker daemon itself, so it has no way of executing Docker commands.

Socket Passthrough

Let’s fix the GitLab Runner config and pass the Docker socket through into the GitLab Runner container. This allows the Docker CLI tools inside the container to connect to the host’s Docker daemon and execute Docker commands in the context of the host.

[[runners]]
  # ...
  [runners.docker]
    # ...
    tls_verify = false
    image = "docker:edge-git"
    privileged = false
    volumes = ["/var/run/docker.sock:/var/run/docker.sock"]

When we run our CI pipeline with this configuration, the docker build command executes successfully! Yet our tests, which require the test data to be bind-mounted into the container, do not seem to work: the /app/data directory inside our application container appears to be empty. What happened?

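We tried to mount a directory that is only available inside the GitLab Runner container. The Docker daemon, however, runs on the host and resolved the path against the host file system, where that directory does not exist. The application container therefore ended up with an empty directory instead of our test data.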

Diagram of a Docker socket passthrough setup

The GitLab Runner container forwards the commands to the Docker daemon running on the host. The bind-mount does not work as expected since the daemon mounts the directory in the context of the host, not in the context of the container.

How can we solve this issue? By making the build directory available to the host (where the Docker daemon is located)! By making available I mean creating a bind-mount between the host and the GitLab Runner container, so they share the build directory. But where is the build directory? The GitLab documentation has us covered:

"The Docker executor by default stores all builds in /builds/<namespace>/<project-name> and all caches in /cache (inside the container). You can overwrite the /builds and /cache directories by defining the builds_dir and cache_dir options under the [[runners]] section in config.toml. This will modify where the data are stored inside the container." (GitLab Docs, The Docker executor)

Implicitly, GitLab uses a Docker volume for the /builds directory, which is accessible to the GitLab Runner container only.

It would be sufficient to share the project’s build directory between the host and the container, but the static GitLab Runner configuration does not allow for that. Let’s share the whole /builds directory between our GitLab Runner and our host by using a bind mount:

# ...
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/builds:/builds"]

Now even the tests that require the test data to be bind-mounted into the application container execute successfully! As you can see in the following picture, both the GitLab Runner container and the application container access the build data, which resides on the host.

Diagram of a Docker socket passthrough setup and mounting the /builds directory

The GitLab Runner container forwards the commands to the Docker daemon running on the host. The bind-mount works as expected since the daemon mounts the /builds directory in the GitLab Runner and the application container.
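
The configuration above needs no path rewriting because /builds is mounted at the same location on the host and in the container. If the host side lived somewhere else, the path passed to -v would have to be translated first. The following is a minimal sketch of such a translation; to_host_path and the directory /srv/gitlab-builds are hypothetical names used only for illustration.

```shell
# Hypothetical helper: translate a path inside the job container into the
# matching host path, given one "host_dir:container_dir" bind-mount spec.
# Prints the host path, or returns 1 if the path is not below the mount.
# Note: plain POSIX sh has no local variables, so these names are global.
to_host_path() {
  path=$1
  host_dir=${2%%:*}        # part before the colon
  container_dir=${2#*:}    # part after the colon
  case "$path" in
    "$container_dir")
      printf '%s\n' "$host_dir" ;;
    "$container_dir"/*)
      rest=${path#"$container_dir"}
      printf '%s\n' "$host_dir$rest" ;;
    *)
      return 1 ;;
  esac
}
```

Called as to_host_path /builds/group/project/test/int-00 /srv/gitlab-builds:/builds, it prints /srv/gitlab-builds/group/project/test/int-00, which is the path the host’s daemon can actually see. A path outside the shared directory, such as /tmp/data, makes the function fail, which is a useful early warning that the bind-mount would come up empty.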

Whenever possible I suggest sticking with this shared Docker socket setup. It should be applicable to most real-life scenarios, but your mileage may vary. In case you really need a nested Docker daemon that is completely separate from the host’s daemon, read on.

Docker in Docker (dind)

Recall that the dind setup consists of two containers: the CLI-container containing all the Docker command line utilities and the daemon-container which runs the actual Docker daemon. Even though the CLI-container does not need to be privileged, as we have seen before, in GitLab either both or none of the containers are launched privileged. Therefore we have to adjust our GitLab Runner configuration to launch the CLI-container with the privileged flag. As in the local example, we also run without TLS between the CLI-container and the daemon-container. Be aware that tls_verify in the Runner configuration only covers the Runner’s own connection to the Docker daemon; for the dind service itself, the GitLab documentation describes setting the DOCKER_TLS_CERTDIR variable to an empty string in .gitlab-ci.yml to turn TLS off. In case you want to enable it, follow the instructions in the GitLab documentation.

[[runners]]
  # ...
  [runners.docker]
    tls_verify = false
    image = "docker:edge-git"
    privileged = true
    # no volumes required
    volumes = []

This takes care of the CLI-container. But how do we launch the accompanying daemon-container? GitLab CI enables you to define so-called “services” that will be available while your GitLab Runner job is active. Each service that you specify in your .gitlab-ci.yml spawns a new Docker container with the specified image. To make the services accessible to the GitLab Runner container running the CI script, all containers (job container and services) are placed in the same network. Services are reachable under a hostname derived from their image name, so our dind service (image docker:20.10.16-dind) can be reached by the name docker from the job container.

image: docker:20.10.16

services:
  - docker:20.10.16-dind
# ...
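
The daemon inside the service container may need a moment before it accepts connections. A job script can guard against that by polling until the daemon answers. This is a minimal sketch under that assumption; wait_for is a hypothetical helper, not something GitLab or Docker provides.

```shell
# Hypothetical helper: run a command up to N times, one second apart,
# until it succeeds. Returns 0 on the first success, 1 if every try failed.
wait_for() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Sleep only if another attempt will follow.
    if [ "$attempts" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}
```

In a job, wait_for 30 docker info as the first script line would block until the dind daemon responds, or fail after 30 attempts.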

OK, that takes care of starting the dind container and being able to communicate with it. But what about the problem that our test files are not available in other containers? Again, the GitLab documentation has us covered:

"Since version 1.5 GitLab Runner mounts a /builds directory to all shared services." (GitLab Docs, The Docker executor)

To illustrate what that means:

  • The GitLab Runner container stores the cloned repository and build files in /builds
  • By default, a Docker volume is used for this directory, but it is only used in the GitLab Runner container itself
  • When declaring a service, this volume will be mounted in each service container

Diagram of a Docker dind setup

As you might have noticed, there are now two connections to the private Docker registry: one by the host’s Docker daemon and the other by the dind container’s Docker daemon. If you are using a self-signed certificate or client certificate authentication, the Docker daemon inside the dind container will not be able to access your registry, because the certificates configured on the host are not available inside the dind container.

One way to mitigate this is to build a custom docker:dind image which includes your required certificates. For example, if you want to add your self-signed CA certificate to the dind image, a Dockerfile might look like this:

ARG BASE_VERSION
FROM docker:${BASE_VERSION}-dind
ARG DOCKER_REGISTRY=my-docker-registry.com
RUN mkdir -p /etc/docker/certs.d/${DOCKER_REGISTRY}
COPY ca.crt /etc/docker/certs.d/${DOCKER_REGISTRY}/ca.crt
RUN cp /etc/docker/certs.d/${DOCKER_REGISTRY}/ca.crt /usr/local/share/ca-certificates/${DOCKER_REGISTRY}-ca.crt \
    && update-ca-certificates

By using this custom dind image in the services section of the .gitlab-ci.yml file, you’ll be able to connect to your registry. If the name of your custom image is anything other than docker, make sure to set up an alias in the service configuration so that the dind service remains accessible by the name docker:

services:
  - name: my-docker-registry.com/docker-with-certs:20.10.16-dind
    alias: docker
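
To see why the alias is needed, it helps to work out which hostnames a service gets by default. The sketch below follows the naming rules as I understand them from the GitLab documentation: everything after the colon is stripped, then slashes are replaced with double underscores for the primary name and with single dashes for the secondary name. service_hostnames is a hypothetical helper, and it assumes an image reference without a registry port.

```shell
# Hypothetical helper: print the two default hostnames a GitLab CI service
# would get for an image reference (primary name first, secondary second).
# Assumption: the reference has no registry port such as host:5000/image.
service_hostnames() {
  name=${1%%:*}                            # strip the tag after the colon
  printf '%s\n' "$name" | sed 's|/|__|g'   # primary: "/" becomes "__"
  printf '%s\n' "$name" | sed 's|/|-|g'    # secondary: "/" becomes "-"
}
```

For docker:20.10.16-dind both names are simply docker, which is why the stock image works out of the box. For my-docker-registry.com/docker-with-certs:20.10.16-dind the names become my-docker-registry.com__docker-with-certs and my-docker-registry.com-docker-with-certs, neither of which matches the docker host that the CLI-container expects.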

Conclusion

I had trouble setting up a GitLab CI pipeline that is able to build and run Docker containers. My troubles were not rooted in missing or faulty documentation but in my lack of knowledge of Docker dind. I hope this post makes it easier to understand what is going on under the hood in a GitLab Runner dind setup.
