Docker and Configuration Management

At Jama we have migrated lots of functionality from Chef to Docker. Maybe that sounds a little odd? At first glance Chef and Docker don’t overlap much in functionality. But as we develop our system with Docker, we constantly find ourselves consulting our Chef code, and as we implement new work in Docker, we end up with functionality that allows us to deprecate our Chef-based systems. What we’re looking at is two very different approaches to solving the same problems. We still have a lot of Chef kicking around, so the rough edges of both Docker and Chef are constantly on my mind. These tools provide interesting points of comparison to each other.

Idempotency vs. Immutable servers

The configuration management tool domain-specific languages (DSLs) themselves are the most notable difference between Puppet and Chef. At times Puppet can feel painfully inflexible, while Chef makes sure to give you more than enough rope to hang yourself. Both try to push you toward good practices. Both encourage you to produce a catalog of idempotently applied declarations. At first glance, Docker seems to encourage you to throw out the best practices of configuration management tool languages; Dockerfiles look suspiciously like shell scripts. Then you realize that every line of a Dockerfile creates a file-system snapshot, and it’s clear that Docker is trying to solve some of the same configuration management challenges in a really different way. I should point out that nothing stops me from running Chef from within the Dockerfile, but in practice I don’t know of anyone doing that.
Docker encourages you to wrap up most of your system as a reproducible and immutable image while configuration management tools encourage you to express your desired system state as a set of declarations. Both of these approaches help you avoid creating unnecessary complexity and both have weak points. With Docker I have concerns about how I will handle orchestration tasks in an emergency scenario. With configuration management tools I often find that what looked like declarative, idempotent code doesn’t actually handle edge cases like first boot or the existence or non-existence of a directory.
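The directory edge case mentioned above can be made concrete with a small sketch. It is written in Python rather than in Chef's or Puppet's DSL, and the function name ensure_directory is hypothetical. The point is only that an idempotent step has to behave the same on first boot, when nothing exists yet, and on every later run.

```python
import os
import tempfile


def ensure_directory(path: str, mode: int = 0o755) -> bool:
    """Make sure `path` exists as a directory.

    Returns True if the directory had to be created and False if it was
    already there, so running it repeatedly is safe.
    """
    if os.path.isdir(path):
        return False  # already in the desired state, nothing to do
    if os.path.exists(path):
        # A file sitting where the directory should be is an edge case
        # that a naive "mkdir" step would not report clearly.
        raise FileExistsError(f"{path} exists but is not a directory")
    os.makedirs(path, mode=mode, exist_ok=True)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "app", "conf")
        print(ensure_directory(target))  # True: first boot, directory created
        print(ensure_directory(target))  # False: later runs change nothing
```

A plain shell `mkdir` in a Dockerfile skips these checks because the build always starts from a known snapshot, which is part of the trade described here.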
It feels like a step backwards to be doing configuration management using shell scripts and that’s what Docker has me doing. Things that I had solved using Chef’s declarative language start to turn into a procedural list of steps to create a desired configuration. On the other hand, it often feels like configuration management tools are taking on the impossible task of reimplementing and unifying the configuration languages of every piece of server software out there. At least when we fall back to shell scripts we naturally get back to handling configuration natively.
In the end, Docker and the various configuration management tools all give us a way to define our systems with code, code that gets checked into revision control.

The Image is the Cache

Docker effectively caches assets for us. As our Chef code gets more complicated we rely on more external artifacts. Although those artifacts are typically served by highly reliable services, as the number of them grows the odds that any one of them breaks on a given Chef run start to be significant. We could eliminate that class of failure by hosting all of our artifacts locally, a strategy which requires complex artifact maintenance. Alternatively, we could use a caching proxy to access these artifacts. That reduces the risk that an artifact download fails but still would require us to maintain yet another highly available service. By using Docker, we are effectively caching the results of those downloads. We also move the impact of any failed downloads from deploy time to build time.
Failed downloads aren’t the only problem that we’d rather encounter at build time than at run time, but tools like Chef don’t have a separate build time to take advantage of.
Running configuration management tools at deploy time is not only more fragile but also more time consuming. CM-based deployments that I’ve worked on often take more than 10 minutes. AWS will usually get our AMIs (host system images) into a running state in less than two minutes. We pay for that time and reliability advantage at deploy time with time spent and risk of fragility during the build.
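To see how quickly the odds add up, here is a rough back-of-the-envelope sketch. It assumes every download succeeds independently with the same probability, which is a simplification, and the 99.9% figure and artifact counts are made-up illustrative numbers, not measurements from our systems.

```python
def run_failure_probability(per_artifact_success: float, artifact_count: int) -> float:
    """Probability that at least one of `artifact_count` downloads fails.

    Assumes downloads fail independently and all share the same success
    probability; real services fail in more correlated ways.
    """
    if not 0.0 <= per_artifact_success <= 1.0:
        raise ValueError("per_artifact_success must be between 0 and 1")
    if artifact_count < 0:
        raise ValueError("artifact_count must not be negative")
    # P(all succeed) = p ** n, so P(at least one fails) = 1 - p ** n
    return 1.0 - per_artifact_success ** artifact_count


if __name__ == "__main__":
    for count in (5, 50, 200):
        risk = run_failure_probability(0.999, count)
        print(f"{count:>3} artifacts at 99.9% each: {risk:.1%} chance a run breaks")
```

Under these assumptions, 50 artifacts already put the chance of a broken run at roughly one in twenty, and that risk is paid on every deploy unless the downloads are baked into an image once at build time.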

Orchestration

Immutability itself has its disadvantages. Sometimes you can make a really small change on a system to solve a really big problem. The OpenSSL library on my Ubuntu system takes about 500K of disk space and installs in seconds with apt. To deploy the same with Docker requires a costly build step. And when our industry has wanted to deploy OpenSSL in the last couple of years, we’ve wanted to deploy it quickly.
Once, I worked in a place where Puppet controlled entries for the database servers through the /etc/hosts file on every host that would need to connect to them. One day we had a need to urgently update those host entries. Running Puppet across our infrastructure could take a few minutes, but Puppet was the system that had the information necessary to build the /etc/hosts file. On this occasion, the Puppet server became overloaded by the level of parallelism that we were asking of it and stopped serving requests halfway through the update. Rather than spending time recovering the Puppet server, we calculated new host entries with a shell script and pushed out the updates with parallel-scp. We would have liked a tool that could operate only on the /etc/hosts files but with all the information that the Puppet server had. In Docker small changes are even more expensive, as I described earlier in relation to OpenSSL.
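A narrowly scoped tool of the kind we wished for might look something like the sketch below. This is not the shell script we actually ran; it is a Python illustration, and the marker comments, host names, and addresses are all hypothetical. It rewrites only a marked block of a hosts file and leaves everything else alone, so applying it twice gives the same result.

```python
from typing import Dict, List

# Hypothetical markers delimiting the part of the file this tool owns.
BEGIN = "# BEGIN managed database hosts"
END = "# END managed database hosts"


def render_block(entries: Dict[str, str]) -> List[str]:
    """Render name -> address pairs as hosts-file lines inside the markers."""
    body = [f"{address}\t{name}" for name, address in sorted(entries.items())]
    return [BEGIN, *body, END]


def update_hosts(current: str, entries: Dict[str, str]) -> str:
    """Return hosts-file content with the managed block replaced."""
    kept: List[str] = []
    inside = False
    for line in current.splitlines():
        if line == BEGIN:
            inside = True       # start skipping the old managed block
        elif line == END:
            inside = False      # stop skipping after the old block ends
        elif not inside:
            kept.append(line)   # preserve every unmanaged line as-is
    return "\n".join(kept + render_block(entries)) + "\n"


if __name__ == "__main__":
    original = "127.0.0.1\tlocalhost\n"
    updated = update_hosts(original, {"db1.example.internal": "10.0.0.5"})
    print(updated, end="")
```

The resulting file would still need to be pushed to each host, for instance with parallel-scp as we did, and the hard part in practice is getting the authoritative entries out of the system that owns them.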

Docker and Configuration Data

Chef’s data bags leave us with key management pain. Docker’s golden path is very convenient when you’re working with Free Software, but when you need to access restricted artifacts (for example, when downloading an artifact referenced in your Dockerfile requires a password) it is very hard to keep that secret out of the Docker metadata for your image.
Docker encourages us to push as much of our configuration out of the image as possible. The natural place to move this configuration is environment variables, and Docker provides explicit support for this approach. But something has to set all those environment variables, and Docker remains impartial about what. In our Chef-managed environments, the same data comes from a global data store managed by the Chef server. Other configuration management tools provide similar features.
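On the application side, reading configuration from the environment can be as simple as the following sketch. The variable names (APP_DATABASE_URL and friends) and the Settings fields are hypothetical and chosen only for illustration; they are not the names our services use.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str
    worker_count: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast if one is missing."""
    required = ("APP_DATABASE_URL",)
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing required environment variables: {missing}")
    return Settings(
        database_url=env["APP_DATABASE_URL"],
        # Optional values fall back to defaults baked into the image.
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
        worker_count=int(env.get("APP_WORKER_COUNT", "2")),
    )


if __name__ == "__main__":
    example_env = {"APP_DATABASE_URL": "postgres://db.example.internal/app"}
    print(load_settings(example_env))
```

The values themselves would be supplied when the container starts, for example with the -e option of docker run. Which system decides those values is exactly the question Docker leaves open.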
Docker doesn’t help us to manage such a global data store so we’ve been driven to start migrating our global configuration data into Consul, our discovery service of choice. Perhaps that’s a good thing; CM tools are a little awkward at the task while discovery tools can be very nimble.
Chef’s data bags at least address the issue of handling secret distribution, though awkwardly. Docker leaves us with the need to implement our own methods of safely distributing secrets needed at runtime.

Other Docker Advantages

Because we use Docker, we have a way to understand what produced our images. We can create a modified version of an image quickly. Once built, Docker images can be pushed out with less processing than configuration management code. So, despite their size, they don’t take longer to deploy.
Docker pushes us in the direction of immutable servers, servers that you never fiddle with by hand but only replace wholesale. Immutable servers set us up for really elegant zero-downtime deployments using the blue-green model. Blue-green deployments are the Holy Grail of DevOps, but there are many technical barriers to getting in a position to be able to use them. Without the constraints that Docker encourages, it would be a lot harder to lead an organization in the direction of blue-green deployments.
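For readers who have not met the blue-green model, here is a toy in-memory sketch of the idea. It assumes two interchangeable environments and a caller-supplied health check; the class and method names are hypothetical, and a real deployment would switch a load balancer or DNS record rather than a Python attribute.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Tracks which of two environments receives live traffic."""
    versions: Dict[str, str] = field(
        default_factory=lambda: {"blue": "v1", "green": ""}
    )
    live: str = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str, healthy: Callable[[str], bool]) -> bool:
        """Install `version` on the idle side and switch only if it is healthy."""
        target = self.idle
        self.versions[target] = version  # replace wholesale, never patch in place
        if not healthy(version):
            return False                 # live side untouched, no downtime
        self.live = target               # the switch is a single pointer flip
        return True


if __name__ == "__main__":
    router = BlueGreenRouter()
    router.deploy("v2", healthy=lambda v: True)
    print(router.live, router.versions[router.live])  # green v2
    router.deploy("v3", healthy=lambda v: False)
    print(router.live, router.versions[router.live])  # green v2
```

The previous version stays in place on the other side until the next deploy, which is what makes rollback cheap, and immutable images make the "replace wholesale" step straightforward.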

Docker in Production

Docker’s sweet spot is the development experience. People often express excitement when working with Docker for the first time; it is relatively easy to go from just starting out to having something that works. Things are a little less rosy for Docker in production. We’ve ended up building quite a bit of tooling around Docker to bridge the gap for our purposes.
We have not yet investigated Kubernetes or Docker Swarm to determine how we could use them with our system. One barrier to adopting these tools is the need to allow on-prem customers to deploy our containers in a simple way. That’s not a problem that everyone has to deal with, but it affects how we can deploy Docker.
We have built a tool to share logic between the Docker builds of our different services and to track dependencies and releases. It also helps work around the challenge of dealing with secrets during the build process. Another tool, which predates our use of Docker, handles deploy-time configuration. It looks up context-specific information in our AWS account and makes sure that the right values eventually get into place on the host system where a given Docker container will run. We also have tooling that collects boot-time information, from discovery services for example. Finally, our tooling handles clustering in AWS independent of any Docker-specific software.

End of Rumination …

Docker is a viable alternative to incumbent configuration management tools with its own advantages and disadvantages. Docker, like configuration management tools more generally, encourages certain good practices for integration and deployment of complex computing systems. The specific good practices that Docker encourages are different and perhaps a bit more radical than those encouraged by the more established configuration management tools.