The road to Docker, Django and Amazon ECS, part 5

For part 1

For part 2

For part 3

For part 4

Making the container able to serve its own static media files

Since each container will be behind a CDN or other caching proxy, we want each container able to serve its own static media files. We are using Whitenoise, a Python WSGI middleware designed specifically for serving static media.

Installing Whitenoise

The installation of Whitenoise was pretty easy:

Update requirements. Add whitenoise==3.2.2 and brotlipy==0.6.0 to requirements.txt.

brotlipy adds support for Brotli, a newer compression format than Gzip.

Add it to MIDDLEWARE_CLASSES. Add 'whitenoise.middleware.WhiteNoiseMiddleware' nearly to the top of the list (right under the Django security middleware).

This middleware intercepts each request and, if the request is for a static file it knows about, serves the file itself.
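As a rough sketch of the ordering described above, the relevant part of a settings file might look like this. Only the position of the Whitenoise entry comes from this post; the other entries are placeholders assumed from a default Django project.

```python
# Hypothetical excerpt from a Django settings module; only the ordering matters here.
MIDDLEWARE_CLASSES = [
    'django.middleware.security.SecurityMiddleware',
    # Whitenoise sits directly below the security middleware so it can answer
    # static file requests before the rest of the stack runs.
    'whitenoise.middleware.WhiteNoiseMiddleware',
    # Placeholder entries; a real project will have its own list below this point.
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
]
```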

Change STATICFILES_STORAGE. Change it to 'whitenoise.storage.CompressedManifestStaticFilesStorage'.

This does everything that staticfiles's ManifestStaticFilesStorage class does, but also compresses each file using Gzip and Brotli.

Environmentalize the STATIC_HOST setting. Since this setting can change from project to project and environment to environment, we need to make it easy to override:

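STATIC_HOST = env('DJANGO_STATIC_HOST', default='//media.nationalgeographic.org/')
# Concatenate instead of using os.path.join, which would drop STATIC_HOST because '/static/' is absolute.
STATIC_URL = STATIC_HOST.rstrip('/') + '/static/'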
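One thing to watch when building STATIC_URL: Python's path-joining functions treat a component that starts with a slash as absolute and throw away everything before it. The sketch below shows that behaviour next to plain string concatenation. It uses posixpath, which is what os.path resolves to inside a Linux container, and build_static_url is a hypothetical helper for illustration, not part of Django or Whitenoise.

```python
import posixpath


def build_static_url(static_host, path='/static/'):
    """Join a host prefix and a URL path with exactly one slash between them."""
    return static_host.rstrip('/') + '/' + path.lstrip('/')


# Path joining discards the host because '/static/' is an absolute component.
joined = posixpath.join('//media.nationalgeographic.org/', '/static/')

# String concatenation keeps the host in both environments.
production_url = build_static_url('//media.nationalgeographic.org/')
local_url = build_static_url('http://localhost/')
```

With the path-joining version, the host set through DJANGO_STATIC_HOST never reaches the generated URLs, so every environment would quietly fall back to a relative /static/ prefix.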

We also modified the docker-run.sh script we are using to test it locally, adding -e DJANGO_STATIC_HOST=http://localhost/ so it will look to the container for static media.

The three problems you meet on the way

Halt!

Right after getting Whitenoise installed we built and ran the container, expecting glorious results. We got gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>

WTF!? This was just working! Even after backing out all the changes made to install Whitenoise, we got the error. To make it even worse, we couldn't replicate it outside of the container.

Lock and --preload. We tried gunicorn's --preload option, which loads the application code in the master process before the workers are forked. Reports say that this can sometimes give you an idea of what is going on. It did. We then got:

Error creating new content types. Please make sure contenttypes 
is migrated before trying to migrate apps individually.

But no stack trace or any indication where to look further.

Waitress, check please! After quite a bit of frustration with this, I came upon an article that says we shouldn't even be using gunicorn in Docker (well, Heroku). We should, instead, use waitress.

So with waitress installed, we ran:

waitress-serve --port 8000 --expose-tracebacks conf.docker_wsgi:application

Notice the --expose-tracebacks parameter? Yea! Now we got a proper traceback. The result was really anticlimactic. We had just merged the master branch into this one to keep it up to date, and a recent code change had caused this strange error. It took 20 minutes to fix, but hours and hours to find.

As a result, we are going to keep going with waitress.
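To make the idea behind exposing tracebacks concrete, here is a minimal sketch of a WSGI wrapper that turns an unhandled exception into a plain-text traceback response. This is not how waitress implements its option; expose_tracebacks, broken_app and call are hypothetical names used only for this illustration, and something like this belongs in debugging sessions, not production.

```python
import sys
import traceback


def expose_tracebacks(app):
    """Wrap a WSGI app so unhandled exceptions come back as plain-text tracebacks."""
    def wrapper(environ, start_response):
        try:
            # Materialise the body so errors raised while iterating are caught too.
            return list(app(environ, start_response))
        except Exception:
            body = traceback.format_exc().encode('utf-8')
            headers = [
                ('Content-Type', 'text/plain; charset=utf-8'),
                ('Content-Length', str(len(body))),
            ]
            # exc_info lets the server replace headers the app may already have set.
            start_response('500 Internal Server Error', headers, sys.exc_info())
            return [body]
    return wrapper


def broken_app(environ, start_response):
    """Hypothetical app that fails the way a bad merge might."""
    raise RuntimeError('simulated failure from a recent merge')


def call(app):
    """Invoke a WSGI app once and return (status, body) for inspection."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status

    chunks = app({'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}, start_response)
    return captured['status'], b''.join(chunks).decode('utf-8')


status, body = call(expose_tracebacks(broken_app))
```

The point is the same one we ran into: a server that swallows the exception leaves you with "Worker failed to boot", while one that surfaces it points straight at the offending line.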

Rethinking the build process

Our initial thought on the production build process was that our CI/CD server would do it. We wanted the process to be as isolated as possible. After implementing Whitenoise, we changed our minds.

Whitenoise compresses the files each time you collect static media, so collecting at container start-up makes the container take significantly longer to come up. Also, a Docker image doesn't save any changes made to a container while it's running, so the compressed copies of the static media files aren't kept between runs.
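To give a feel for why this step is slow, here is a rough sketch of the kind of work a compressing storage backend does at collection time: walk the static directory and write a compressed copy beside each file. It is not Whitenoise's code. It assumes Python 3, covers only Gzip (Brotli needs a third-party package), and precompress and its extension list are hypothetical.

```python
import gzip
import os
import tempfile


def precompress(root, extensions=('.css', '.js', '.html', '.svg')):
    """Write a .gz copy beside each matching file under root; return the paths written."""
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            source = os.path.join(dirpath, name)
            with open(source, 'rb') as handle:
                data = handle.read()
            compressed = gzip.compress(data)
            # Keep the compressed copy only when it actually saves space.
            if len(compressed) < len(data):
                target = source + '.gz'
                with open(target, 'wb') as handle:
                    handle.write(compressed)
                written.append(target)
    return written


# Tiny demonstration in a throwaway directory.
original = b'body { margin: 0; }\n' * 200
with tempfile.TemporaryDirectory() as demo_root:
    css_path = os.path.join(demo_root, 'site.css')
    with open(css_path, 'wb') as handle:
        handle.write(original)
    created_names = [os.path.basename(path) for path in precompress(demo_root)]
    with gzip.open(css_path + '.gz', 'rb') as handle:
        round_trip = handle.read()
```

Every file is read, compressed and written again, which is cheap for one stylesheet and noticeably less so for a full site's worth of static media on every container start.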

We will have to have a test database available for use by our build server and do a local deploy so we can collect all the static media before we build the image. Then we will copy the compiled static media to the image. This makes the Docker image bigger, but faster to start each time.

We made a build script to make it easier locally:

#!/bin/bash
echo "Removing old images"
docker rmi `docker images | grep "^<none>" | awk '{print $3}'`

echo "Collecting static files"
./manage.py collectstatic --noinput --verbosity 1

echo "Concatenating CSS and JS"
./manage.py compress --force --verbosity 1

docker build -t ngs:latest .

And we removed staticmedia from the .dockerignore file so the staticmedia directory will get copied to the image.

Why won't you take cache?

With the first two problems solved, we were back in business. The web site came up, static media and all. When we looked at the headers for the static files, we saw some HTTP headers we didn't want, and didn't expect:

Cache-Control:no-cache, no-store, must-revalidate, max-age=0

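Whitenoise sets its own caching headers, so something else had to be altering them afterwards. While we sent that bug off to have someone else find it, we simply used Whitenoise as WSGI middleware, wrapping the application in the WSGI file instead of going through Django's middleware stack: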

import os
import sys

from whitenoise import WhiteNoise
from django.core.wsgi import get_wsgi_application

# Send anything printed to stdout to stderr instead.
sys.stdout = sys.stderr
application = get_wsgi_application()
PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
# Wrap the Django app so Whitenoise answers /static/ requests itself from the staticmedia directory.
application = WhiteNoise(application, root=os.path.join(PROJECT_ROOT, 'staticmedia'), prefix='static/')

That made everything work as expected.
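For checking headers like the one above without squinting at them, a small parser helps. This is a simplified sketch, not a full implementation of the HTTP caching rules; parse_cache_control and is_cacheable are hypothetical helpers, and the "wanted" value is just an example of a long-lived header, not necessarily what Whitenoise sends.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of directive -> argument (or None)."""
    directives = {}
    for part in value.split(','):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition('=')
        directives[name.strip().lower()] = argument.strip().strip('"') or None
    return directives


def is_cacheable(value):
    """Rough check: True when a shared cache such as a CDN could keep the response."""
    directives = parse_cache_control(value)
    # Any of these directives stops a shared cache from storing or reusing the response freely.
    if any(name in directives for name in ('no-store', 'no-cache', 'private')):
        return False
    max_age = directives.get('max-age')
    return max_age is not None and max_age.isdigit() and int(max_age) > 0


unwanted = 'no-cache, no-store, must-revalidate, max-age=0'
wanted = 'public, max-age=315360000'  # hypothetical far-future value

unwanted_ok = is_cacheable(unwanted)
wanted_ok = is_cacheable(wanted)
```

Feeding it the header we first saw gives False, which is exactly why a CDN in front of the container would have kept coming back to us for every static file.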

Next Time

Next time we'll see what it takes to get this Docker image into ECS and connected with other services.