The road to Docker, Django and Amazon ECS, part 1

Introduction

When we decided to move the web infrastructure of the National Geographic Society (the non-profit side) to Amazon Web Services, it seemed like a good time to also wrap all our pieces in containers.

However, being new to the whole container ecosystem, I couldn’t find any good write-ups about running containers in production; just lots of tutorials, intros and bits and pieces of advice. So I am going to document my journey for the next person.

Why switch to containers?

We had two basic goals:

  • easily and reliably automate the creation of testing and production instances from scratch.
  • easily and quickly create new projects and services.


To accomplish these, we further decided to:

  • break up our infrastructure into functions that can be independently managed, scaled and shared across projects.
  • have each service declare its configuration, so that all dependencies, from OS-level libraries to external services, are set in that configuration.

Containers seemed the best fit for these requirements.

The current situation

Our current application stack is:

  • Django 1.8 and 1.9 using Python 2.7 in virtual environments
  • gunicorn acting as the HTTP server
  • nginx serving static media and acting as a reverse proxy for gunicorn
  • PostgreSQL 9.5 for the database
  • Elasticsearch for a search index
  • Redis acting as a low-level cache and asynchronous-queue store
  • Varnish in front of everything for caching

Deployment involves issuing one command. That command uses Fabric to tag a git commit for deployment, and then runs several commands on each production server to check out the correct code, migrate the database, update any static files and tell gunicorn to reload itself.

For testing, we have the ability to deploy any git branch onto our testing server, where it runs nearly independently from the other test instances (the test instances typically share the same database, but don’t have to). This allows people to go to <ticketname>.test.nationalgeographic.org to test their changes.

The route we are taking

We can’t just containerize our code as-is. We need to make a few changes first.

Switch to environment variable configuration. It takes several different configurations to run a project. Development, testing and production settings are currently separate files. We tell Django the settings file to use at the command line. We want to switch to using environment variables for greater flexibility.
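To make the idea concrete, here is a minimal sketch of environment-driven settings. It is only an illustration, not our final settings file: the variable names (DJANGO_DEBUG, DJANGO_SECRET_KEY, DJANGO_ALLOWED_HOSTS, DB_NAME and friends), the defaults, and the helper functions are all assumptions made up for this example.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Treat "1", "true", "yes" and "on" (any case) as True."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def env_list(name, default=(), environ=os.environ):
    """Split a comma-separated variable into a list, dropping empty items."""
    raw = environ.get(name)
    if raw is None:
        return list(default)
    return [item.strip() for item in raw.split(",") if item.strip()]


def build_settings(environ=os.environ):
    """Collect the environment-dependent Django settings in one dict.

    Every variable name and default here is hypothetical.
    """
    return {
        "DEBUG": env_bool("DJANGO_DEBUG", False, environ),
        "SECRET_KEY": environ.get("DJANGO_SECRET_KEY", "insecure-dev-key"),
        "ALLOWED_HOSTS": env_list("DJANGO_ALLOWED_HOSTS", ["localhost"], environ),
        "DATABASES": {
            "default": {
                # This engine path is valid for both Django 1.8 and 1.9.
                "ENGINE": "django.db.backends.postgresql_psycopg2",
                "NAME": environ.get("DB_NAME", "app"),
                "USER": environ.get("DB_USER", "app"),
                "PASSWORD": environ.get("DB_PASSWORD", ""),
                "HOST": environ.get("DB_HOST", "localhost"),
                "PORT": environ.get("DB_PORT", "5432"),
            }
        },
    }


# In a real settings.py, the values would become module-level names:
# globals().update(build_settings())
```

With something like this, a single settings file can serve development, testing and production; only the environment handed to the container changes.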

Change how gunicorn runs. Django isn’t the only thing that needs different configuration; gunicorn does as well. It needs to run in the foreground, and its logging parameters need to change.
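Since a gunicorn configuration file is just Python, the foreground and logging changes could be sketched roughly as below. The setting names (bind, workers, daemon, accesslog, errorlog, loglevel) are gunicorn's own; the file name, the environment variable names (PORT, GUNICORN_WORKERS, GUNICORN_LOG_LEVEL) and the default values are assumptions for illustration only.

```python
# gunicorn.conf.py (hypothetical file name and values)
import os

# Listen on all interfaces inside the container; the port comes from the
# environment. PORT is an assumed variable name.
bind = "0.0.0.0:" + os.environ.get("PORT", "8000")

# Number of worker processes. GUNICORN_WORKERS is an assumed variable name.
workers = int(os.environ.get("GUNICORN_WORKERS", "2"))

# Stay in the foreground so the container runtime supervises the process.
daemon = False

# "-" sends the access log to stdout and the error log to stderr, where the
# container's log driver can pick them up.
accesslog = "-"
errorlog = "-"
loglevel = os.environ.get("GUNICORN_LOG_LEVEL", "info")
```

It would be loaded with something like `gunicorn -c gunicorn.conf.py myproject.wsgi`, where `myproject` stands in for the real project module.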

Have the container serve its own static media. In the past, this would have seemed monkey-balls crazy because of performance. However, with a CDN in front of the static media, performance is not really an issue. It also reduces complexity, as a simple service does not require something else just to handle the static media.

A method to run custom scripts and management commands without a shell. While it doesn’t happen often, sometimes a developer will SSH into a production server and run a management command or a custom script. Since we don’t plan to have SSH access into the containers, we need a way to do these rare tasks.

Change how we handle uploaded media. Our media handling is somewhat complicated. Uploaded media is asynchronously moved to a media server. This media server handles the delivery of the media, and also has a dynamic thumbnail-generating script. We will have to make some decisions on where to store the uploaded files and how to handle the dynamic thumbnailing.

Until next time

I’ll cover these changes, and any others I discover along the way, in the next parts.