README.rst.j2

This is a DRAFT (possibly including untested, yet-to-be released) edits to the blog post published here: https://www.caktusgroup.com/blog/2017/03/14/production-ready-dockerfile-your-python-django-app/

**Have a suggested tweak or fix?** Feel free to submit a PR.


A Production-ready Dockerfile for Your Python/Django App
========================================================

**Update (October 29, 2019):** I updated this post with more recent Django and Postgres versions, to use Python and pip directly in the container (instead of in a separate virtual environment, which was unnecessary), and switched to a non-root user via Docker instead of uWSGI.

Docker has matured a lot since it was released. We've been watching it closely at Caktus, and have been thrilled by the adoption -- both by the community and by service providers. As a team of Python and Django developers, we're always searching for best of breed deployment tools. Docker is a clear fit for packaging the underlying code for many projects, including the Python and Django apps we build at Caktus.

This post also includes an `accompanying GitHub repo <https://github.com/caktus/dockerfile_post/>`_.

Technical Overview
------------------

There are many ways to containerize a Python/Django app, no one of which could be considered "the best." That being said, I think the following approach provides a good balance of simplicity, configurability, and container size. The specific tools I use are: `Docker <https://www.docker.com/>`_ (of course), the `python:3.7-slim <https://hub.docker.com/_/python/>`_ Docker image (based on Debian Stretch), and `uWSGI <https://uwsgi-docs.readthedocs.io/>`_.

In a previous version of this post, I used `Alpine Linux <https://alpinelinux.org/>`_ as the base image for this
Dockerfile. But now, I'm switching to a Debian- and glibc-based image, because I found `an inconvenient workaround <https://github.com/iron-io/dockers/issues/42#issuecomment-290763088>`_ was required for `musl libc <https://www.musl-libc.org/>`_'s ``strftime()`` implementation. The Debian-based "slim" images are still relatively small; the bare-minimum image described below increases from about 170 MB to 227 MB (~33%) when switching from ``python:3.7-alpine`` to ``python:3.7-slim`` (and updating all the corresponding system packages).

There are many WSGI servers available for Python, and we use both Gunicorn and uWSGI at Caktus. A couple of the benefits of uWSGI are that (1) it's almost entirely configurable through environment variables (which fits well with containers), and (2) it includes `native HTTP support <http://uwsgi-docs.readthedocs.io/en/latest/HTTP.html#can-i-use-uwsgi-s-http-capabilities-in-production>`_, which can circumvent the need for a separate HTTP server like Apache or Nginx.


The Dockerfile
--------------

Without further ado, here's a production-ready ``Dockerfile`` you can use as a starting point for your project (it should be added in your top level project directory, next to the ``manage.py`` script provided by your Django project):

.. code-block:: docker
{# re-comment out the lines that we've uncommented in the sample files, for testing purposes -#}
{% filter replace("ENTRYPOINT", "# ENTRYPOINT") -%}
{% filter replace("ENV UWSGI_ROUTE_HOST", "# ENV UWSGI_ROUTE_HOST") -%}
{% filter indent(width=4) %}
{% include "Dockerfile" %}
{%- endfilter %}
{%- endfilter %}
{%- endfilter %}
We extend from the "slim" flavor of the official Docker image for Python 3.7, install a few dependencies for running our application (i.e., that we want to keep in the final version of the image), copy the folder containing our requirements files to the container, and then, in a single line, (a) install the build dependencies needed, (b) ``pip install`` the requirements themselves (edit this line to match the location of your requirements file, if needed), (c) remove the C compiler and any other OS packages no longer needed, and (d) remove the package lists since they're no longer needed. It's important to keep this all on one line so that Docker will cache the entire operation as a single layer.

Next, we copy our application code to the image, set some default environment variables, and run ``collectstatic``. Be sure to change the values for ``DJANGO_SETTINGS_MODULE`` and ``UWSGI_WSGI_FILE`` to the correct paths for your application (note that the former requires a Python package path, while the latter requires a file system path).

A few notes about other aspects of this Dockerfile:

* I only included a minimal set of OS dependencies here. If this is an established production app, you'll most likely need to visit https://packages.debian.org, search for the Debian package names of the OS dependencies you need, including the ``-dev`` supplemental packages as needed, and add them either to ``RUN_DEPS`` or ``BUILD_DEPS`` in your Dockerfile.
* Adding ``--no-cache-dir`` to the ``pip install`` command saves a additional disk space, as this prevents ``pip`` from `caching downloads <https://pip.pypa.io/en/stable/reference/pip_install/#caching>`_ and `caching wheels <https://pip.pypa.io/en/stable/reference/pip_install/#wheel-cache>`_ locally. Since you won't need to install requirements again after the Docker image has been created, this can be added to the ``pip install`` command. Thanks Hemanth Kumar for this tip!
* uWSGI contains a lot of optimizations for running many apps from the same uWSGI process. These optimizations aren't really needed when running a single app in a Docker container, and can `cause issues <https://discuss.newrelic.com/t/newrelic-agent-produces-system-error/43446/2>`_ when used with certain 3rd-party packages. I've added ``UWSGI_LAZY_APPS=1`` and ``UWSGI_WSGI_ENV_BEHAVIOR=holy`` to the uWSGI configuration to provide a more stable uWSGI experience (the latter will be the default in the next uWSGI release).
* The ``UWSGI_HTTP_AUTO_CHUNKED`` and ``UWSGI_HTTP_KEEPALIVE`` options to uWSGI are needed in the event the container will be hosted behind an Amazon Elastic Load Balancer (ELB), because Django doesn't set a valid ``Content-Length`` header by default, unless the ``ConditionalGetMiddleware`` is enabled. See `the note <http://uwsgi-docs.readthedocs.io/en/latest/HTTP.html#can-i-use-uwsgi-s-http-capabilities-in-production>`_ at the end of the uWSGI documentation on HTTP support for further detail.


Requirements and Settings Files
-------------------------------

Production-ready requirements and settings files are outside the scope of this post, but you'll need to include a few things in your requirements file(s), if they're not there already::
{% filter indent(width=4) %}
{% include "requirements.txt" %}
{%- endfilter %}
I didn't pin these to specific versions here to help future-proof this post somewhat, but you'll likely want to pin these (and other) requirements to specific versions so things don't suddenly start breaking in production. Of course, you don't have to use any of these packages, but you'll need to adjust the corresponding code elsewhere in this post if you don't.

My ``deploy.py`` settings file looks like this:

.. code-block:: python
{% filter indent(width=4) %}
{% include "my_project/settings/deploy.py" %}
{%- endfilter %}
This bears repeating: This is **not** a production-ready settings file, and you should review `the checklist <https://docs.djangoproject.com/en/dev/howto/deployment/checklist/>`_ in the Django docs (and run ``python manage.py check --deploy --settings=my_project.settings.deploy``) to ensure you've properly secured your production settings file.


Building and Testing the Container
----------------------------------

Now that you have the essentials in place, you can build your Docker image locally as follows:

.. code-block:: bash

    docker build -t my-app .

This will go through all the commands in your Dockerfile, and if successful, store an image with your local Docker server that you could then run:

.. code-block:: bash

    docker run -e DATABASE_URL='' -t my-app

This command is merely a smoke test to make sure uWSGI runs, and won't connect to a database or any other external services.


Running Commands During Container Start-Up
------------------------------------------

As a final step, I recommend creating an ``ENTRYPOINT`` script to run commands as needed during container start-up. This will let us accomplish any number of things, such as making sure Postgres is available or running ``migrate`` during container start-up. Save the following to a file named ``docker-entrypoint.sh`` in the same directory as your ``Dockerfile``:

.. code-block:: bash
{% filter indent(width=4) %}
{% include "docker-entrypoint.sh" %}
{%- endfilter %}
Make sure this file is executable, i.e.:

.. code-block:: bash

    chmod a+x docker-entrypoint.sh

Next, uncomment the following line to your ``Dockerfile``, just above the ``CMD`` statement:

.. code-block:: docker

    ENTRYPOINT ["/code/docker-entrypoint.sh"]

This will (a) make sure a database is available (usually only needed when used with Docker Compose) and (b) run outstanding migrations, if any, if the ``DJANGO_MANAGEPY_MIGRATE`` is set to ``on`` in your environment. Even if you add this entrypoint script as-is, you could still choose to run ``migrate`` or ``collectstatic`` in separate steps in your deployment before releasing the new container. The only reason you might not want to do this is if your application is highly sensitive to container start-up time, or if you want to avoid any database calls as the container starts up (e.g., for local testing). If you do rely on these commands being run during container start-up, be sure to set the relevant variables in your container's environment.


Creating a Production-Like Environment Locally with Docker Compose
------------------------------------------------------------------

To run a complete copy of production services locally, you can use `Docker Compose <https://docs.docker.com/compose/>`_. The following ``docker-compose.yml`` will create a barebones, ephemeral, AWS-like container environment with Postgres for testing your production environment locally.

*This is intended for local testing of your production environment only, and will not save data from stateful services like Postgres upon container shutdown.*

.. code-block:: yaml
{% filter indent(width=4) %}
{% include "docker-compose.yml" %}
{%- endfilter %}
Copy this into a file named ``docker-compose.yml`` in the same directory as your ``Dockerfile``, and then run:

.. code-block:: bash

    docker-compose up --build -d

This downloads (or builds) and starts the two containers listed above. You can view output from the containers by running:

.. code-block:: bash

    docker-compose logs

If all services launched successfully, you should now be able to access your application at http://localhost:8000/ in a web browser.

If you need to debug your application container, a handy way to launch an instance it and poke around is:

.. code-block:: bash

    docker-compose run app /bin/bash


Static Files
------------

You may have noticed that we set up static file serving in uWSGI via the ``UWSGI_STATIC_MAP`` and ``UWSGI_STATIC_EXPIRES_URI`` environment variables. If preferred, you can turn this off and use `Django Whitenoise <http://whitenoise.evans.io/en/stable/>`_ or `copy your static files straight to S3 <https://www.caktusgroup.com/blog/2014/11/10/Using-Amazon-S3-to-store-your-Django-sites-static-and-media-files/>`_.


Blocking ``Invalid HTTP_HOST header`` Errors with uWSGI
-------------------------------------------------------

To avoid Django's ``Invalid HTTP_HOST header`` errors (and prevent any such spurious requests from taking up any more CPU cycles than absolutely necessary), you can also configure uWSGI to return an ``HTTP 400`` response immediately without ever invoking your application code. This can be accomplished by uncommenting and customizing the ``UWSGI_ROUTE_HOST`` line in the Dockerfile above.


Summary
-------

That concludes this high-level introduction to containerizing your Python/Django app for hosting on AWS Elastic Beanstalk (EB), Elastic Container Service (ECS), or elsewhere. Each application and Dockerfile will be slightly different, but I hope this provides a good starting point for your containers. Shameless plug: if you're looking for a simple (and at least temporarily free) way to test your Docker containers on AWS using an Elastic Beanstalk Multicontainer Docker environment or the Elastic Container Service, check out Caktus' very own `AWS Web Stacks <https://github.com/caktus/aws-web-stacks>`_. Good luck!