Anatomy of a Multi-Stage Docker Build

Wed 19 July 2017 by Moshe Zadka

Docker, starting with version 17.05, supports multi-stage builds. These make it much easier than before to separate the build environment from the runtime environment.

In order to demonstrate this, we will write a minimal Flask app and run it with Twisted using its WSGI support.

The Flask application itself is the smallest demo app, straight from any number of Flask tutorials:

# src/msbdemo/wsgi.py
from flask import Flask

app = Flask("msbdemo")


@app.route("/")
def hello():
    return "If you are seeing this, the multi-stage build succeeded"
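
The app object is a WSGI callable, which is what will let Twisted serve it later on. As a rough illustration of that calling convention, here is a standard-library-only sketch for Python 3. Note that hello_app and call_wsgi are hypothetical stand-ins written for this example; they are not part of msbdemo or Flask.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical stand-in app; a Flask app is called the same way."""
    body = b"If you are seeing this, the multi-stage build succeeded"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(app, path="/"):
    """Invoke a WSGI callable once and return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would send these; we only record the status line.
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_wsgi(hello_app))
```

With Flask installed, passing the real app from src/msbdemo/wsgi.py to call_wsgi should work the same way, since any WSGI server only needs that one callable.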

The setup.py file, similarly, is the minimal one from any number of Python packaging tutorials:

import setuptools
setuptools.setup(
    name='msbdemo',
    version='0.0.1',
    url='https://github.com/moshez/msbdemo',
    author='Moshe Zadka',
    author_email='zadka.moshe@gmail.com',
    packages=setuptools.find_packages(),
    install_requires=['flask'],
)

The interesting stuff is in the Dockerfile. It is interesting enough that we will go through it line by line:

FROM python:2.7.13

We start from a "fat" Python docker image -- one with the Python headers installed, and the ability to compile extensions.

RUN virtualenv /buildenv

We create a custom virtual environment for the build process.

RUN /buildenv/bin/pip install pex wheel

We install the build tools -- in this case, wheel, which will let us build wheels, and pex, which will let us build single file executables.

RUN mkdir /wheels

We create a custom directory to put all of our wheels. Note that we will not install those wheels in this docker image.

COPY src /src

We copy our minimal Flask-based application's source code into the docker image.

RUN /buildenv/bin/pip wheel --no-binary :all: \
                            twisted /src \
                            --wheel-dir /wheels

We build the wheels. We take care to build them ourselves from source (that is what --no-binary :all: does), since pex, right now, cannot handle manylinux binary wheels.
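
Whether a wheel is a manylinux wheel is visible in its file name: by the wheel naming convention (PEP 427), the last dash-separated field is the platform tag. The following is a minimal sketch, for Python 3, of checking that tag. The helper names and the example file names are hypothetical, chosen only for illustration.

```python
def wheel_platform_tag(filename):
    """Return the platform tag from a wheel file name (PEP 427 naming)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % (filename,))
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: the platform tag is last
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: %r" % (filename,))
    return parts[-1]


def is_manylinux(filename):
    """True if any of the (dot-separated) platform tags is a manylinux tag."""
    tags = wheel_platform_tag(filename).split(".")
    return any(tag.startswith("manylinux") for tag in tags)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ["example-1.0-cp27-cp27mu-manylinux1_x86_64.whl",
                 "example-1.0-cp27-cp27mu-linux_x86_64.whl",
                 "example-1.0-py2.py3-none-any.whl"]:
        print(name, is_manylinux(name))
```

Running something like this over the contents of /wheels would be one way to confirm that no manylinux wheels slipped in.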

RUN /buildenv/bin/pex --find-links /wheels --no-index \
                      twisted msbdemo -o /mnt/src/twist.pex -m twisted

We bundle the twisted and msbdemo wheels, together with any recursive dependencies, into a Pex file -- a single file executable.
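
To get a feel for what "single file executable" means here, the standard library's zipapp module builds a much simpler relative of a Pex file: a zip archive with a __main__.py that the interpreter can run directly. This Python 3 sketch is not how pex works internally, and it bundles no dependencies; the function names and the hello.pyz file name are made up for the example.

```python
import os
import subprocess
import sys
import tempfile
import zipapp


def build_single_file_app(workdir):
    """Create a tiny source tree and pack it into one runnable archive."""
    src = os.path.join(workdir, "app")
    os.makedirs(src)
    with open(os.path.join(src, "__main__.py"), "w") as f:
        f.write('print("hello from a single-file executable")\n')
    target = os.path.join(workdir, "hello.pyz")
    zipapp.create_archive(src, target)  # __main__.py becomes the entry point
    return target


def run_archive(path):
    """Run the archive with the current interpreter and return its output."""
    return subprocess.check_output([sys.executable, path]).decode().strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_archive(build_single_file_app(tmp)))
```

The part that matters for the Dockerfile is the same in both cases: the build produces one file, so one file is all the next stage has to copy.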

FROM python:2.7.13-slim

This is where the magic happens. A second FROM line starts a new docker image build. The previous images are available -- but only inside this Dockerfile -- for copying files from. Luckily, we have a file ready to copy: the output of the Pex build process.

COPY --from=0 /mnt/src/twist.pex /root

The --from=0 indicates copying from a previously built image, rather than the so-called "build context". The stages are numbered from zero in the order of their FROM lines, so 0 is our build image. In theory, any number of builds can take place in one Dockerfile. While only the last one will actually result in a permanent image, the others are all available as sources for --from copying. In practice, two stages are usually enough.
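
The stage numbering can be made concrete with a few lines of Python 3. This is only a rough sketch of the idea, not Docker's parser: it ignores line continuations and other Dockerfile syntax, and split_stages, copy_from_refs and DEMO are names invented for this example. DEMO is an abbreviated version of the Dockerfile above.

```python
DEMO = """\
FROM python:2.7.13
RUN virtualenv /buildenv
FROM python:2.7.13-slim
COPY --from=0 /mnt/src/twist.pex /root
"""


def split_stages(dockerfile_text):
    """Group instructions into stages; each FROM line starts a new one."""
    stages = []
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.upper().startswith("FROM "):
            stages.append({"base": line.split()[1], "instructions": []})
        elif stages:
            stages[-1]["instructions"].append(line)
    return stages


def copy_from_refs(stages):
    """List (stage index, referenced stage) for every COPY --from=..."""
    refs = []
    for index, stage in enumerate(stages):
        for instruction in stage["instructions"]:
            tokens = instruction.split()
            if tokens[0].upper() != "COPY":
                continue
            for token in tokens[1:]:
                if token.startswith("--from="):
                    refs.append((index, token[len("--from="):]))
    return refs


if __name__ == "__main__":
    stages = split_stages(DEMO)
    print([stage["base"] for stage in stages])
    print(copy_from_refs(stages))
```

Here stage 1 (the slim image) refers to stage 0 (the fat image), which matches the COPY line above.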

ENTRYPOINT ["/root/twist.pex", "web", "--wsgi", "msbdemo.wsgi.app", \
            "--port", "tcp:80"]

Finally, we use Twisted as our WSGI container. Since we bound the Pex file to the -m twisted package execution, all we need to do is run the web plugin, ask it to run a WSGI container, and give it the logical (module) path to our WSGI app.
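
A "logical path" such as msbdemo.wsgi.app is just a dotted name: import the module part, then look up the attribute. Here is a minimal Python 3 sketch of that lookup using only importlib. It is an illustration of the idea and makes no claim to be Twisted's own implementation; resolve_dotted is a hypothetical helper.

```python
import importlib


def resolve_dotted(name):
    """Resolve 'package.module.attr' to the object it names."""
    parts = name.split(".")
    # Try the longest importable module prefix first, then walk attributes.
    for cut in range(len(parts), 0, -1):
        module_name = ".".join(parts[:cut])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue
        for attr in parts[cut:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError("cannot resolve %r" % (name,))


if __name__ == "__main__":
    basename = resolve_dotted("os.path.basename")
    print(basename("/root/twist.pex"))
```

Inside the container, the same kind of lookup on msbdemo.wsgi.app would import msbdemo.wsgi from the Pex file and hand back the Flask app object.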

Using Docker multi-stage builds has allowed us to create a Docker container for production with:

  • A smaller footprint (using the "slim" image as base)
  • Few layers (only adding two layers to the base slim image)

The biggest benefit is that it lets us do so with one Dockerfile, with no extra machinery.