Anatomy of a Multi-Stage Docker Build

Wed 19 July 2017 by Moshe Zadka

Docker, in recent versions, has introduced multi-stage build. This allows separating the build environment from the runtime envrionment much more easily than before.

In order to demonstrate this, we will write a minimal Flask app and run it with Twisted using its WSGI support.

The Flask application itself is the smallest demo app, straight from any number of Flask tutorials:

# src/msbdemo/wsgi.py
from flask import Flask
app = Flask("msbdemo")
@app.route("/")
def hello():
    return "If you are seeing this, the multi-stage build succeeded"

The setup.py file, similarly, is the minimal one from any number of Python packaging tutorials:

import setuptools
setuptools.setup(
    name='msbdemo',
    version='0.0.1',
    url='https://github.com/moshez/msbdemo',
    author='Moshe Zadka',
    author_email='zadka.moshe@gmail.com',
    packages=setuptools.find_packages(),
    install_requires=['flask'],
)

The interesting stuff is in the Dockefile. It is interesting enough that we will go through it line by line:

FROM python:2.7.13

We start from a "fat" Python docker image -- one with the Python headers installed, and the ability to compile extensions.

RUN virtualenv /buildenv

We create a custom virtual environment for the build process.

RUN /buildenv/bin/pip install pex wheel

We install the build tools -- in this case, wheel, which will let us build wheels, and pex, which will let us build single file executables.

RUN mkdir /wheels

We create a custom directory to put all of our wheels. Note that we will not install those wheels in this docker image.

COPY src /src

We copy our minimal Flask-based application's source code into the docker image.

RUN /buildenv/bin/pip wheel --no-binary :all: \
                            twisted /src \
                            --wheel-dir /wheels

We build the wheels. We take care to manually build wheels ourselves, since pex, right now, cannot handle manylinux binary wheels.

RUN /buildenv/bin/pex --find-links /wheels --no-index \
                      twisted msbdemo -o /mnt/src/twist.pex -m twisted

We build the twisted and msbdemo wheels, togther with any recursive dependencies, into a Pex file -- a single file executable.

FROM python:2.7.13-slim

This is where the magic happens. A second FROM line starts a new docker image build. The previous images are available -- but only inside this Dockerfile -- for copying files from. Luckily, we have a file ready to copy: the output of the Pex build process.

COPY --from=0 /mnt/src/twist.pex /root

The --from=0 indicates copying from a previously built image, rather than the so-called "build context". In theory, any number of builds can take place in one Dockefile. While only the last one will actually result in a permanent image, the others are all available as targets for --from copying. In practice, two stages are usually enough.

ENTRYPOINT ["/root/twist.pex", "web", "--wsgi", "msbdemo.wsgi.app", \
            "--port", "tcp:80"]

Finally, we use Twisted as our WSGI container. Since we bound the Pex file to the -m twisted package execution, all we need to is run the web plugin, ask it to run a wsgi container, and give it the logical (module) path to our WSGI app.

Using Docker multi-stage builds has allowed us to create a Docker container for production with:

A smaller footprint (using the "slim" image as base)
Few layers (only adding two layers to the base slim image)

The biggest benefit is that it let us do so with one Dockerfile, with no extra machinery.

Categories
misc