
The road to Docker, Django and Amazon ECS, part 5

For part 1

For part 2

For part 3

For part 4

Making the container able to serve its own static media files

Since each container will be behind a CDN or other caching proxy, we want each container to be able to serve its own static media files. We are using Whitenoise, a Python WSGI middleware designed specifically for serving static media.

Installing Whitenoise

The installation of Whitenoise was pretty easy:

Update requirements. Add whitenoise==3.2.2 and brotlipy==0.6.0 to requirements.txt.

brotlipy enables Brotli, the newest compression format.

Add it to MIDDLEWARE_CLASSES. Add 'whitenoise.middleware.WhiteNoiseMiddleware' near the top of the list (right under the Django security middleware).

This middleware intercepts any request for a static file it knows about and handles it itself.

Change STATICFILES_STORAGE. Change it to 'whitenoise.storage.CompressedManifestStaticFilesStorage'.

This does everything that staticfiles' ManifestStaticFilesStorage class does, but also compresses each file using Gzip and Brotli.
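Taken together, the middleware and storage changes look something like this in settings (a sketch; the rest of our middleware list is elided):

MIDDLEWARE_CLASSES = [
    'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',  # right under the security middleware
    # ... the rest of our middleware ...
]

STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'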

Environmentalize the STATIC_HOST setting. Since this setting can change from project to project and environment to environment, we need to make it easy to change:

STATIC_HOST = env('DJANGO_STATIC_HOST', default='//media.nationalgeographic.org/')
STATIC_URL = STATIC_HOST + 'static/'  # os.path.join() would discard STATIC_HOST here, since '/static/' is an absolute path

We also modified the docker-run.sh script we are using to test it locally, adding -e DJANGO_STATIC_HOST=http://localhost/ so it will look to the container for static media.

The three problems you meet on the way

Halt!

Right after getting Whitenoise installed, we built and ran the container, expecting glorious results. Instead, we got:

gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>

WTF!? This was just working! Even after backing out all the changes made to install Whitenoise, we got the error. To make it even worse, we couldn't replicate it outside of the container.

Lock and --preload. We tried using gunicorn's --preload option to force it to load the entire application into memory before forking workers. Reports say that this can sometimes give you an idea of what is going on. It did.
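Roughly, that meant adding one flag to the gunicorn command from our docker-entrypoint.sh script (a sketch):

gunicorn --preload \
    --log-config $HOMEDIR/conf/gunicorn_logging.conf \
    --config $HOMEDIR/conf/docker_gunicorn_conf.py \
    conf.wsgi:application

This time we got: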

Error creating new content types. Please make sure contenttypes 
is migrated before trying to migrate apps individually.

But no stack trace or any indication where to look further.

Waitress, check please! After quite a bit of frustration with this, I came upon this article that says we shouldn't even be using gunicorn in Docker (well, Heroku). We should, instead, use waitress.

So with waitress installed, we ran:

waitress-serve --port 8000 --expose-tracebacks conf.docker_wsgi:application

Notice the --expose-tracebacks parameter? Yea! Now we got a proper traceback. The result was really anticlimactic. We had just merged the master branch into this one to keep it up to date, and a recent code change caused this strange error. It took 20 minutes to fix, but hours and hours to find.

As a result, we are going to keep going with waitress.

Rethinking the build process

Our initial thought on the production build process was that our CI/CD server would do it. We wanted the process to be as isolated as possible. After implementing Whitenoise, we changed our minds.

Whitenoise compresses the files each time you collect static media, so the container takes significantly longer to start up. Also, a Docker image doesn't save any changes made while the container is running, so all the compression work done on the static media files is lost between runs.

We will have to have a test database available for use by our build server and do a local deploy so we can collect all the static media before we build the image. Then we will copy the compiled static media into the image. This makes the Docker image bigger, but faster to start each time.

We made a build script to make it easier locally:

#!/bin/bash
echo "Removing old images"
docker rmi `docker images | grep "^<none>" | awk '{print $3}'`

echo "Collecting static files"
./manage.py collectstatic --noinput --verbosity 1

echo "Concatenating CSS and JS"
./manage.py compress --force --verbosity 1

docker build -t ngs:latest .

And we removed staticmedia from the .dockerignore file so the staticmedia directory will get copied to the image.

Why won't you take cache?

With the first two problems solved, we were back in business. The web site came up, static media and all. When we looked at the headers for the static files, we saw some HTTP headers we didn't want, and didn't expect:

Cache-Control:no-cache, no-store, must-revalidate, max-age=0

Whitenoise sets the caching headers correctly; something else was altering them. While we sent that bug off for someone else to track down, we simply used Whitenoise as WSGI middleware instead of Django middleware:

import os
import sys

from whitenoise import WhiteNoise
from django.core.wsgi import get_wsgi_application

sys.stdout = sys.stderr
application = get_wsgi_application()
PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))

# Wrap the Django WSGI app so WhiteNoise itself serves anything under
# static/ straight out of the collected staticmedia directory
application = WhiteNoise(application, root=os.path.join(PROJECT_ROOT, 'staticmedia'), prefix='static/')

That made everything work as expected.

Next Time

Next time we'll see what it takes to get this Docker image into ECS and connected with other services.

The road to Docker, Django and Amazon ECS, part 4

For part 1

For part 2

For part 3

Putting together a Dockerfile

I couldn't wait any longer, so I wanted to see it running in Docker!

Choosing a Linux distribution

We want the absolute smallest container we can get to run our project. The container is going to run Linux. We currently have Ubuntu on our servers, but default Ubuntu includes lots of stuff we don't need.

We chose Alpine Linux because it was small and had a large selection of installable packages.

Setting up the Dockerfile

We based our Dockerfile on João Ferreira Loff's Alpine Linux Python 2.7 slim image.

FROM alpine:3.5

# Install needed packages. Notes:
#   * dumb-init: a proper init system for containers, to reap zombie children
#   * musl: standard C library
#   * linux-headers: commonly needed, and an unusual package name from Alpine.
#   * build-base: used so we include the basic development packages (gcc)
#   * bash: so we can access /bin/bash
#   * git: to ease up clones of repos
#   * ca-certificates: for SSL verification during Pip and easy_install
#   * python2: the binaries themselves
#   * python2-dev, python-dev: needed to compile C extensions (gevent, for example)
#   * py-setuptools: required only in Python 2; installs easy_install so we can install Pip.
#   * postgresql-client: for accessing a PostgreSQL server
#   * postgresql-dev: for building psycopg2
#   * py-lxml: instead of using pip to install lxml, this is faster. Must make sure requirements.txt has correct version
#   * libffi-dev: for compiling Python cffi extension
#   * tiff-dev: For Pillow: TIFF support
#   * jpeg-dev: For Pillow: JPEG support
#   * openjpeg-dev: For Pillow: JPEG 2000 support
#   * libpng-dev: For Pillow: PNG support
#   * zlib-dev: For Pillow: zlib (PNG compression) support
#   * freetype-dev: For Pillow: TrueType support
#   * lcms2-dev: For Pillow: Little CMS 2 support
#   * libwebp-dev: For Pillow: WebP support
#   * gdal: For some Geo capabilities
#   * geos: For some Geo capabilities
ENV PACKAGES="\
  dumb-init \
  musl \
  linux-headers \
  build-base \
  bash \
  git \
  ca-certificates \
  python2 \
  python2-dev \
  py-setuptools \
  python-dev \
  postgresql-client \
  postgresql-dev \
  py-lxml \
  libffi-dev \
  tiff-dev \
  jpeg-dev \
  openjpeg-dev \
  libpng-dev \
  zlib-dev \
  freetype-dev \
  lcms2-dev \
  libwebp-dev \
  gdal \
  geos \
"

RUN echo \
  # replacing default repositories with edge ones
  && echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" > /etc/apk/repositories \
  && echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories \
  && echo "http://dl-cdn.alpinelinux.org/alpine/edge/main" >> /etc/apk/repositories \

  # Add the packages, with a CDN-breakage fallback if needed
  && apk add --no-cache $PACKAGES || \
    (sed -i -e 's/dl-cdn/dl-4/g' /etc/apk/repositories && apk add --no-cache $PACKAGES) \

  # make some useful symlinks that are expected to exist
  && if [[ ! -e /usr/bin/python ]];        then ln -sf /usr/bin/python2.7 /usr/bin/python; fi \
  && if [[ ! -e /usr/bin/python-config ]]; then ln -sf /usr/bin/python2.7-config /usr/bin/python-config; fi \
  && if [[ ! -e /usr/bin/easy_install ]];  then ln -sf /usr/bin/easy_install-2.7 /usr/bin/easy_install; fi \

  # Install and upgrade Pip
  && easy_install pip \
  && pip install --upgrade pip \
  && if [[ ! -e /usr/bin/pip ]]; then ln -sf /usr/bin/pip2.7 /usr/bin/pip; fi \
  && echo

# HOMEDIR gets its own ENV statement, since variables defined in a single
# ENV instruction can't reference each other
ENV HOMEDIR=/code

# Chaining the rest of the ENVs creates only one more layer, instead of one per ENV statement
ENV LANG=en_US.UTF-8 \
    LC_ALL=en_US.UTF-8 \
    PYTHONUNBUFFERED=1 \
    NEW_RELIC_CONFIG_FILE=$HOMEDIR/newrelic.ini \
    GUNICORNCONF=$HOMEDIR/conf/docker_gunicorn_conf.py \
    GUNICORN_WORKERS=2 \
    GUNICORN_BACKLOG=4096 \
    GUNICORN_BIND=0.0.0.0:8000 \
    GUNICORN_ENABLE_STDIO_INHERITANCE=True \
    DJANGO_SETTINGS_MODULE=settings

WORKDIR $HOMEDIR

# Copying this file over so we can install requirements.txt in one cache-able layer
COPY requirements.txt $HOMEDIR/
RUN pip install --upgrade pip \
  && pip install -r $HOMEDIR/requirements.txt

# Copy the code
COPY . $HOMEDIR

EXPOSE 8000
CMD ["sh", "-c", "$HOMEDIR/docker-entrypoint.sh"]

The first change that we made was to use Alpine Linux version 3.5, which had just been released.

Next we listed all the OS-level packages we'll need in the PACKAGES environment variable.

The next RUN statement sets the package repositories to the edge version, installs the packages in PACKAGES, creates a few convenience symlinks, and installs pip for our Python installs.

We set up all the environment variables next.

After setting the working directory, we copy our requirements.txt file into the container and install all our requirements. We do this step separately so it creates a cached layer that won't change unless the requirements.txt file changes. This saves tons of time if you keep building and re-building the image.

We copy all our code over to the container, tell the container to expose port 8000 and specify the command to run (unless we specify a different command at runtime).

You'll notice that the command looks strange. In the exec form of CMD, Docker runs the command directly, without a shell, so the environment variable HOMEDIR never gets substituted. We have to wrap our command $HOMEDIR/docker-entrypoint.sh in sh -c so a shell expands it at container start.
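In other words, the difference is (a sketch):

# Exec form: Docker runs this directly, with no shell, so it would look for
# a file literally named "$HOMEDIR/docker-entrypoint.sh" and fail:
# CMD ["$HOMEDIR/docker-entrypoint.sh"]

# Shell form via sh -c: the shell expands $HOMEDIR when the container starts:
CMD ["sh", "-c", "$HOMEDIR/docker-entrypoint.sh"]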

But there's something missing

You'll notice that in this version there aren't any environment variables for the database, cache, or any of the other settings we set up earlier. We'll get them in there eventually, but for right now, we want to see if we can build and run this container and have it connect to our local database and cache.

If you build it, it can run

Building the docker image is really easy:

docker build -t ngs:latest .

This tags the built image as ngs:latest, which isn't what we are going to do in production, but it helps when testing everything.
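In production we would tag the image with a version and push it to a registry instead. With Amazon's EC2 Container Registry, that would look something like this (the account ID, region and version here are hypothetical):

docker tag ngs:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/ngs:1.2.3
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/ngs:1.2.3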

The build output looks something like this:

$ docker build -t ngs:latest .
Sending build context to Docker daemon 76.43 MB
Step 1 : FROM alpine:3.5
 ---> 88e169ea8f46
Step 2 : ENV PACKAGES "  dumb-init   musl   linux-headers   build-base   bash   git   ca-certificates   python2   python2-dev   py-setuptools   build-base   linux-headers   python-dev   postgresql-client   postgresql-dev   py-lxml   libffi-dev   tiff-dev   jpeg-dev   openjpeg-dev   libpng-dev   zlib-dev   freetype-dev   lcms2-dev   libwebp-dev   gdal   geos "
 ---> Using cache
 ---> 184f9b7e79f9
Step 3 : RUN echo   && echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" > /etc/apk/repositories   && echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories   && echo "http://dl-cdn.alpinelinux.org/alpine/edge/main" >> /etc/apk/repositories   && apk add --no-cache $PACKAGES ||     (sed -i -e 's/dl-cdn/dl-4/g' /etc/apk/repositories && apk add --no-cache $PACKAGES)   && if [[ ! -e /usr/bin/python ]];        then ln -sf /usr/bin/python2.7 /usr/bin/python; fi   && if [[ ! -e /usr/bin/python-config ]]; then ln -sf /usr/bin/python2.7-config /usr/bin/python-config; fi   && if [[ ! -e /usr/bin/easy_install ]];  then ln -sf /usr/bin/easy_install-2.7 /usr/bin/easy_install; fi   && easy_install pip   && pip install --upgrade pip   && if [[ ! -e /usr/bin/pip ]]; then ln -sf /usr/bin/pip2.7 /usr/bin/pip; fi   && echo
 ---> Using cache
 ---> 514dcc2f010d
Step 4 : ENV HOMEDIR /code LANG en_US.UTF-8 LC_ALL en_US.UTF-8 PYTHONUNBUFFERED 1 NEW_RELIC_CONFIG_FILE $HOMEDIR/newrelic.ini GUNICORNCONF $HOMEDIR/conf/docker_gunicorn_conf.py GUNICORN_WORKERS 2 GUNICORN_BACKLOG 4096 GUNICORN_BIND 0.0.0.0:8000 GUNICORN_ENABLE_STDIO_INHERITANCE True DJANGO_SETTINGS_MODULE settings
 ---> Running in 2d58f77c0a8e
 ---> 1342bb501c0f
Removing intermediate container 2d58f77c0a8e
Step 5 : WORKDIR $HOMEDIR
 ---> Running in a20a2fa64d2e
 ---> df977d30491c
Removing intermediate container a20a2fa64d2e
Step 6 : COPY requirements.txt $HOMEDIR/
 ---> e6ae37797b36
Removing intermediate container 820e3406fb5c
Step 7 : RUN pip install --upgrade pip   && pip install -r $HOMEDIR/requirements.txt
 ---> Running in 4c65be60af03
Requirement already up-to-date: pip in /usr/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg
Collecting beautifulsoup4==4.5.1 (from -r /code/requirements.txt (line 2))
  Downloading beautifulsoup4-4.5.1-py2-none-any.whl (83kB)
Collecting cmsplugin-forms-builder==1.1.1 (from -r /code/requirements.txt (line 3))
...
Installing collected packages: beautifulsoup4, Django, ...
  Running setup.py install for future: started
    Running setup.py install for future: finished with status 'done'
  Installing from a newer Wheel-Version (1.1)
  Running setup.py install for unidecode: started
    Running setup.py install for unidecode: finished with status 'done'
Successfully installed Django-1.8.15 Fabric-1.10.2 ...
 ---> 165f7ae9507e
Removing intermediate container 4c65be60af03
Step 8 : COPY . $HOMEDIR
 ---> 1058d14b462f
Removing intermediate container 55f77f2e60d6
Step 9 : EXPOSE 8000
 ---> Running in 38e8c650a529
 ---> 7c53dcf41f2a
Removing intermediate container 38e8c650a529
Step 10 : CMD sh -c $HOMEDIR/docker-entrypoint.sh
 ---> Running in 1b8781bf6458
 ---> a255a40e30b8
Removing intermediate container 1b8781bf6458
Successfully built a255a40e30b8

I've truncated most of the output from installing the Python dependencies. If I run it again, steps 6 and 7 use the existing cache:

Step 6 : COPY requirements.txt $HOMEDIR/
 ---> Using cache
 ---> e6ae37797b36
Step 7 : RUN pip install --upgrade pip   && pip install -r $HOMEDIR/requirements.txt
 ---> Using cache
 ---> 165f7ae9507e

If I make changes to any other part of our project, steps 1-7 use the cache, and it only has to copy over the new code.

How big is it?

So how big is the container? Running docker images gives us:

REPOSITORY             TAG                 IMAGE ID            CREATED             SIZE
ngs                    latest              a255a40e30b8        11 minutes ago      590.1 MB

So 590.1 MB. What makes up that space? We can take a look at the layers created by our Dockerfile. Running docker history ngs:latest returns:

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
a255a40e30b8        7 minutes ago       /bin/sh -c #(nop)  CMD ["sh" "-c" "$HOMEDIR/d   0 B
7c53dcf41f2a        7 minutes ago       /bin/sh -c #(nop)  EXPOSE 8000/tcp              0 B
1058d14b462f        7 minutes ago       /bin/sh -c #(nop) COPY dir:0da094a2328f4e5bfb   73.69 MB
165f7ae9507e        7 minutes ago       /bin/sh -c pip install --upgrade pip   && pip   227.1 MB
e6ae37797b36        11 minutes ago      /bin/sh -c #(nop) COPY file:25e352c295f212113   3.147 kB
df977d30491c        11 minutes ago      /bin/sh -c #(nop)  WORKDIR /code                0 B
1342bb501c0f        11 minutes ago      /bin/sh -c #(nop)  ENV HOMEDIR=/code LANG=en_   0 B
514dcc2f010d        3 days ago          /bin/sh -c echo   && echo "http://dl-cdn.alpi   285.3 MB
184f9b7e79f9        3 days ago          /bin/sh -c #(nop)  ENV PACKAGES=  dumb-init     0 B
88e169ea8f46        6 days ago          /bin/sh -c #(nop) ADD file:92ab746eb22dd3ed2b   3.984 MB

At the bottom layer is the Alpine Linux 3.5 distro, which is only 3.984 MB. Our OS-level packages take up 285.3 MB. Our Python dependencies take up 227.1 MB. Our code is 73.69 MB.

Make it run! Make it run!

We want this container to connect to resources running on our local computer.

Make PostgreSQL and Redis listen more

My default installations of Redis and PostgreSQL only listen for connections on the loopback address. I modified them to listen on every interface.
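For reference, here is roughly what changed (file locations vary by platform, and the pg_hba.conf line is just an example for Docker's default bridge subnet; your addresses may differ):

# postgresql.conf -- listen on all interfaces instead of just localhost
listen_addresses = '*'

# pg_hba.conf -- allow password auth from the Docker bridge subnet
host    all    all    172.17.0.0/16    md5

# redis.conf -- bind to all interfaces instead of 127.0.0.1
bind 0.0.0.0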

Now my container will be able to connect to them.

Give the container the address

The container has no idea where it is running. Typically all the connections are made when Docker sets up the containers (and that is what we want, eventually). We need to tell the container where it is running.

We are going to do this with a temporary script called docker-run.sh:

#!/bin/bash
export DOCKERHOST=$(ifconfig | grep -E "([0-9]{1,3}\.){3}[0-9]{1,3}" | grep -v 127.0.0.1 | awk '{ print $2 }' | cut -f2 -d: | head -n1)
docker rm ngs-container
docker run -ti \
    -p 8000:8000 \
    --add-host dockerhost:$DOCKERHOST \
    --name ngs-container \
    -e DATABASE_URL=postgresql://coordt:password@dockerhost:5432/education \
    -e CACHE_URL=rediscache://dockerhost:6379/0?CLIENT_CLASS=site_ext.cacheclient.GracefulClient \
    ngs:latest

The first line sets the DOCKERHOST environment variable to the local computer's current IP address.

The second line removes any existing container named ngs-container. Note: Docker doesn't clean up after itself very well. This is well known, and there are several different solutions, I'm sure. After doing some Docker building and running, you end up with lots of unused images and containers. This script sidesteps some of that by naming the container ngs-container each time, so the old one can be removed before each run.

The last command tells Docker to run the ngs:latest image with a pseudo-TTY and interactivity (-ti), map container port 8000 to local port 8000 (-p 8000:8000), add dockerhost to the container's /etc/hosts file with the local computer's current IP address (--add-host dockerhost:$DOCKERHOST), name the container ngs-container (--name ngs-container), and set the DATABASE_URL and CACHE_URL environment variables (-e).

Now, make docker-run.sh executable with a chmod a+x, and you can run it.

$ ./docker-run.sh
Copying '/code/static/concepts/jquery-textext.js'
Copying '/code/static/autocomplete_light/addanother.js'
...
Post-processed 'js/tiny_mce/plugins/inlinepopups/skins/clearlooks2/img/alert.gif' as 'js/tiny_mce/plugins/inlinepopups/skins/clearlooks2/img/alert.568d4cf84413.gif'
Post-processed 'js/tiny_mce/plugins/inlinepopups/skins/clearlooks2/img/corners.gif' as 'js/tiny_mce/plugins/inlinepopups/skins/clearlooks2/img/corners.55298b5baaec.gif'
...
4256 static files copied to '/code/staticmedia', 4256 post-processed.
Operations to perform:
  Synchronize unmigrated apps: redirects, ...
  Apply all migrations: teachingatlas, ...
Synchronizing apps without migrations:
  Creating tables...
    Running deferred SQL...
  Installing custom SQL...
Running migrations:
  No migrations to apply.

If you remember from a previous post, the docker-entrypoint.sh runs two commands before it starts gunicorn.

The first is collecting (and post-processing) the static media. I've truncated the output for copying and the post-processing of said static media, but you can see that it ran.

The next is a database migration. I've truncated the output somewhat, but you can see that there was nothing to migrate.

Now when I try http://localhost:8000, I get a web page! Success!

Next time

In the next installment I'll get the container serving its own static files.

The road to Docker, Django and Amazon ECS, part 3

For part 1

For part 2

Gunicorn in Docker

With Docker, our Django project is the only thing running. This means that we don't need gunicorn running in daemon mode and logging to files. We need gunicorn running in the foreground and logging to the console.

Setting up the gunicorn configuration

I found a great article with instructions on using gunicorn in Docker.

In summary, our Docker-specific gunicorn configuration is:

import os

for k, v in os.environ.items():
    if k.startswith("GUNICORN_"):
        key = k.split('_', 1)[1].lower()
        locals()[key] = v

This allows us to specify the gunicorn configuration using environment variables.
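For example, with GUNICORN_WORKERS=2 and GUNICORN_BIND=0.0.0.0:8000 set in the environment, the loop above is equivalent to writing this in the configuration file:

workers = '2'
bind = '0.0.0.0:8000'

gunicorn accepts the string values and coerces them where needed.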

Setting up the gunicorn logging

The same article includes instructions about setting up logging. We added json-logging-py==0.2 to requirements.txt so the required library is included. Chances are, we will be changing all our logging to use this.

We added a conf/gunicorn_logging.conf file to the repo:

[loggers]
keys=root, gunicorn.error

[handlers]
keys=console

[formatters]
keys=json

[logger_root]
level=INFO
handlers=console

[logger_gunicorn.error]
level=ERROR
handlers=console
propagate=0
qualname=gunicorn.error

[handler_console]
class=StreamHandler
formatter=json
args=(sys.stdout, )

[formatter_json]
class=jsonlogging.JSONFormatter

Environment variables

We know that we need to set at least these variables:

  • GUNICORNCONF
  • GUNICORN_WORKERS
  • GUNICORN_BACKLOG
  • GUNICORN_BIND
  • GUNICORN_ENABLE_STDIO_INHERITANCE

We will set their values in the Dockerfile.
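As a preview, in the Dockerfile they will end up looking roughly like this (the values here are illustrative):

ENV GUNICORNCONF=/code/conf/docker_gunicorn_conf.py \
    GUNICORN_WORKERS=2 \
    GUNICORN_BACKLOG=4096 \
    GUNICORN_BIND=0.0.0.0:8000 \
    GUNICORN_ENABLE_STDIO_INHERITANCE=True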

Running gunicorn

Spoiler alert! We will need a script that Docker will run as soon as it starts up the container. We want it to start gunicorn (and do a few other things), so we will create a new script: docker-entrypoint.sh

#!/bin/bash
# Collect static media and apply any pending migrations before starting the app server
python $HOMEDIR/manage.py collectstatic --noinput
python $HOMEDIR/manage.py migrate --noinput

# The New Relic wrapper is disabled for now
#exec newrelic-admin run-program \
gunicorn \
    --log-config $HOMEDIR/conf/gunicorn_logging.conf \
    --config $HOMEDIR/conf/docker_gunicorn_conf.py \
    conf.wsgi:application

We chmod a+x docker-entrypoint.sh to make it executable.

In case you haven't guessed, $HOMEDIR is set in the Dockerfile.

Why are you running the collectstatic command here? Good question! You should be proud of yourself for the amount of awareness you maintain.

Initially the collectstatic command was in the Dockerfile, so it could be a part of the Docker image. However, whether it is by design, because of some of our code, or a third-party app, the command loads all the settings and apps and attempts to connect to the database when it runs. We can't have that.

At least it doesn't take long to run, and doesn't do anything if it has already run.

I'll explain more next time when I cover the Dockerfile creation.

Why are you running the migrate command here? Mostly because most of the example configurations for Docker and Django do this, and I can't think of a better place to do it. Like the collectstatic command, it really only needs to be run once for the image, no matter how many machines use it. And also like the collectstatic command, we don't (necessarily) have a connection to the database at image build time.

Also, it is a bit safer. In the worst case, a migration hasn't been applied, and the container will take a little longer to run the first time. In the best case, it doesn't need to do anything and starts up right away.

Next time

We'll cover creating the initial Dockerfile and testing it out to see if what we have done works.

The road to Docker, Django and Amazon ECS, part 2

For part 1

Refactoring settings

Environment variable-based settings are a big thing in the 12-factor app. I have found this incredibly confusing because you have to store the configurations somewhere else besides the ephemeral environment. Those environment variables get populated from somewhere, and I haven’t seen a good example of how to manage these across the development lifecycle.

So I’m making it up and documenting it here.

Also, another goal is to deploy these changes before we fully convert to Docker. Some changes are necessary for now, but will no longer be necessary in the future.

If you don’t like it, why do it?

It’s not that I don’t like environment variable-based settings. It is that they never really made sense to me in the development lifecycle we use. However, the introduction of Docker changes things a bit.

In our existing setup, there isn't a clear-cut place to set all the environment variables and make them easy to manage. With Docker, it is clear that they should go in the Dockerfile.

So now we have three ways to set up the configuration of an environment:

  1. Environment variables for basic settings
  2. A specific settings file for complex settings
  3. A .env file that we could programmatically add in a deployment script

This gives us lots of flexibility to configure local development, user testing, automated testing (CI/CD), and production environments.

The tool: django-environ

We added Django environ to our dependency list. It is pretty straightforward to use.

Here’s what we did:

Add .env to .gitignore

Django environ allows you to store environmental variables in a .env file. We don’t want this stored in our repo, so we need to tell git to ignore it.

Add to requirements

We updated our requirements.txt file with django-environ==0.4.1.

Modify our settings structure

Like many Django projects, our settings is a directory:

settings/
 ├ __init__.py
 ├ base.py
 ├ local_settings.template.py
 ├ production.py
 ├ test.template.py
 └ vagrant.py

A couple of notes:

__init__.py This attempts to import local_settings.py if it exists; otherwise it imports base.py. (See the sketch just after these notes.)

base.py All of our settings are in here.

local_settings.template.py A new developer will copy this file to local_settings.py for local development. It allows for the manipulation of the core settings in ways that are advantageous for development. local_settings.py is ignored by git.

We don’t plan on stopping using a local settings file, although we may use it less.

production.py This file contains production-specific settings.

We won’t need this file any more.

test.template.py This template file is used to create test.py when we generate a test instance.

We may not need this any more. We will know once we decide whether we are changing how our test server is set up.

vagrant.py This is for developers using vagrant for development. Not all the developers use vagrant, and we don’t force them to.

This will probably still be used.
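For reference, the __init__.py behavior described above boils down to a guarded import, something like this sketch (the real file may differ in details):

# settings/__init__.py
try:
    from .local_settings import *  # noqa
except ImportError:
    from .base import *  # noqa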

Modify settings

At the top of settings/base.py we have:

import os
import environ

PROJECT_ROOT = environ.Path(__file__) - 2  # two folders back (/project/settings/base.py - 2 = /project)
env = environ.Env()

if os.path.exists(PROJECT_ROOT('.env')):
    env.read_env(PROJECT_ROOT('.env'))

This snippet sets up the PROJECT_ROOT variable used to modify paths throughout the file, and then checks for a .env file and reads it.

We do this check because if you call read_env() and the file doesn't exist, it emits a UserWarning. We don't currently plan on using a .env file in production, so this warning would appear every time gunicorn starts or restarts. This is just noise, so we do a manual check for the .env file first.
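For local development, a .env file might look like this (the values are illustrative):

DEBUG=True
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/education
CACHE_URL=rediscache://127.0.0.1:6379/0?CLIENT_CLASS=site_ext.cacheclient.GracefulClient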

Here are the settings we changed initially:

DEBUG = env.bool('DEBUG', default=False)

DATABASES = {
    'default': env.db_url(default="postgresql://postgres:postgres@localhost:5432/education")
}

CACHES = {
    'default': env.cache_url(default='dummycache://')
}

# This may change when we decide on our media handling
SFTP_STORAGE_HOST = env('SFTP_STORAGE_HOST', default='prod-cache-01')
SFTP_STORAGE_PARAMS = env.dict('SFTP_STORAGE_PARAMS', default={'username': 'natgeo'})

# Google Analytics information
GAQ_ACCOUNT = env('GAQ_ACCOUNT', default='UA-xxxxxxx-x')
GAQ_DOMAIN = env('GAQ_DOMAIN', default=SITE_URL)

# Elasticsearch information
ES_HOST = [env('ES_HOST', default='localhost')]
ES_PORT = env('ES_PORT', default='9200')
ES_INDEX = env('ES_INDEX', default='ngs')

But what about the SECRET_KEY? In Django, the SECRET_KEY is an important security value. So why isn't it included in the environment variables? Because everyone needs it and it never changes. Any environment without that key will not be able to log users in (when using a recent backup of the production database).

So if there are better ways to more securely handle the use and distribution of the SECRET_KEY, I am all ears. Or mostly ears and a lot of forehead.

Environment variables and types

Here are a couple of things we had to figure out, or that at least weren't very clear in the documentation:

Want some extra options in your cache settings?

After the URL, include the extra OPTIONS as query parameters:

CACHE_URL=rediscache://127.0.0.1:6379/0?CLIENT_CLASS=site_ext.cacheclient.GracefulClient

Now in your settings, env.cache() returns:

{'BACKEND': 'django_redis.cache.RedisCache',
 'LOCATION': 'redis://127.0.0.1:6379/0',
 'OPTIONS': {'CLIENT_CLASS': 'site_ext.cacheclient.GracefulClient'}}

Want to pass in a dict?

The keys and values are comma-separated key=value strings, like key1=value1,key2=value2. Then assign that string to the environment variable like so:

DICT_VAR=key1=value1,key2=value2
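Reading it back with env.dict() then gives you a real dict:

>>> env.dict('DICT_VAR')
{'key1': 'value1', 'key2': 'value2'}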

The road to Docker, Django and Amazon ECS, part 1

Introduction

After we decided to move the web infrastructure of National Geographic Society (the non-profit side) to Amazon Web Services, we decided it would be a good time to also wrap all our pieces in containers.

However, being new to the whole container ecosystem, I couldn’t find any good writings about doing containers for production; just lots of tutorials, intros and bits-and-pieces of advice. So I am going to document my journey for the next person.

Why switch to containers?

We had two basic goals:

  • easily and reliably automate the creation of testing and production instances from scratch.
  • easily and quickly create new projects and services.


To accomplish these we further decided:

  • break up our infrastructure into functions that can be independently managed, scaled and shared across projects.
  • each service should have its configuration declared. All dependencies, from OS-level libraries to external services, are set in configuration.

Containers seemed the best fit for the requirements to meet our goals.

The current situation

Our current application stack is:

  • Django 1.8 and 1.9 using Python 2.7 in virtual environments
  • gunicorn acting as the HTTP server
  • Nginx serves static media and acts as a reverse proxy for gunicorn
  • PostgreSQL 9.5 for the database
  • Elasticsearch for a search index
  • Redis acting as a low-level cache and asynchronous-queue store
  • Varnish in front of everything for caching

Deployment involves issuing one command. That command uses Fabric to tag a git commit for deployment, and then runs several commands on each production server to check out the correct code, migrate the database, update any static files and tell gunicorn to reload itself.

For testing, we have the ability to deploy any git branch onto our testing server, where it runs nearly independently from the other test instances (the test instances typically share the same database, but don't have to). This allows people to go to <ticketname>.test.nationalgeographic.org to test their changes.

The route we are taking

We can’t just containerize our code as-is. We need to make a few changes first.

Switch to environment variable configuration. It takes several different configurations to run a project. Development, testing and production settings are currently separate files. We tell Django the settings file to use at the command line. We want to switch to using environment variables for greater flexibility.

Change how gunicorn runs. Django isn’t the only thing that needs different configuration, gunicorn does as well. It needs to run in the foreground, and change its logging parameters.

Have the container serve its own static media. In the past, doing this would have seemed monkey-balls crazy, due to performance. However, with a CDN in front of these, performance is not really an issue. It also reduces complexity, as a simple service does not require something else just to handle the static media.

A method to run custom scripts and management commands without a shell. While it doesn’t happen often, sometimes a developer will SSH into a production server and run a management command or a custom script. Since we don’t plan to have SSH access into the containers, we need a way to do these rare tasks.

Change how we handle uploaded media. Our media handling is somewhat complicated. Uploaded media is asynchronously moved to a media server. This media server handles the delivery of the media, and also has a dynamic thumbnail-generating script. We will have to make some decisions on where to store the uploaded files and how to handle the dynamic thumbnailing.

Until next time

I’ll cover these changes and others I discover in the next parts.