The trend is certainly that services are deployed using Docker. We currently have Dockerized applications like our binary repositories, our build server, our Docker registry, ActiveMQ, Hazelcast, integration services, logging services, monitoring, OAuth, search, did I forget any? And there’s certainly more to come. In my ideal world we would be able to pay for infrastructure that we can just throw our Docker images at, and that would then take care of running containers. I have good hopes that we’ll enter such a world in the future. Amazon made a promise in this area with Amazon EC2 Container Service, which we used for running Docker containers for about a year. However, our DevOps experts have ruled that it’s more trouble than it’s worth, so instead we’re taking much of the management of our containers into our own hands. This document gives an overview of how services are then created and instantiated in AWS.
The following diagram shows an overview of our approach, using one of our own services as an example. You will recognize the following steps, from left to right:
- The service is developed, often in Java, and using our regular build pipeline (Git, Maven, TeamCity), a JAR or TAR/GZ of the service is delivered into our binary repository.
- The JAR or TAR/GZ of the service is built into a Docker image using our build_docker tool, based on the definitions in the “docker” Git repository. For each Docker image that we support, this repository contains a template Dockerfile and a build manifest that is used by our build_docker tool. The resulting Docker image is delivered into our Docker registry.
- A machine image for AWS (AMI) is built using Packer, based on the definitions in the “packer” Git repository. (See below.)
- An instance of the machine image is created in Amazon Elastic Compute Cloud (Amazon EC2), using our Provisimus tool. (See below.)
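The four steps above can be traced as a shell sequence. A hedged sketch: `build_docker` and `provisimus` are the in-house tools named in this document, but every flag and argument here is invented for illustration, and the stub functions below only echo what the real tools would do, so the sequence can be walked through dry.

```shell
# Stubs standing in for the real tools; each just reports its step.
mvn()          { echo "step 1: JAR delivered to binary repository"; }
build_docker() { echo "step 2: Docker image $1 pushed to registry"; }
packer()       { echo "step 3: AMI baked from $2"; }
provisimus()   { echo "step 4: EC2 stack created for $2"; }

mvn deploy                                    # regular build pipeline (TeamCity)
build_docker my-service                       # Dockerize, deliver to registry
packer build packer/my-service/template.json  # bake the machine image (AMI)
provisimus create-stack my-service            # instantiate via CloudFormation
```

In reality each step runs in a different place (TeamCity agents, the Docker build host, the Packer build, an operator's shell), but the hand-offs between them are exactly these four artifacts: JAR, image, AMI, stack.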
Now, the above is just an example (although a typical one). Not all Docker images that we deploy are homegrown: the Docker image for our binary repository, Artifactory, for example, is one the Packer build obtains directly from its vendor. Nor are all services necessarily developed in Java: the major services that are developed in house certainly all are, but there are also externally obtained services that are shipped as described above, such as Hazelcast and ActiveMQ. We’re not a polyglot development organization, but we still end up with a mix of services, and unifying their deployment by following the steps above is something we deem highly beneficial.
Packer is an awesome tool from HashiCorp (hard to go wrong with those folks). For each Packer build that we support, the “packer” Git repository contains files that are needed to set up the machine, files that will be available in the resulting machine, and a Packer “template” that stitches it all together. The resulting machine image is stored in AWS as an AMI, which is Amazon’s format to define virtual servers in the cloud.
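To make the shape of such a Packer template concrete, here is a heavily trimmed sketch of the JSON a build directory might hold. The builder fields, AMI id, and provisioner steps are invented for illustration (a real template also needs region/credential configuration and a resolvable source AMI); only the structure — builders plus provisioners — is the point.

```shell
# Write an illustrative Packer template; all values are placeholders.
cat > template.json <<'EOF'
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "eu-west-1",
    "source_ami": "ami-xxxxxxxx",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "my-service-{{timestamp}}"
  }],
  "provisioners": [
    { "type": "file",  "source": "bootstrap/", "destination": "/bootstrap" },
    { "type": "shell", "inline": [
        "sudo apt-get install -y docker-ce docker-compose",
        "cd /bootstrap && sudo docker-compose pull"
    ]}
  ]
}
EOF
# Sanity-check that the template is well-formed JSON:
python3 -m json.tool template.json > /dev/null && echo "template is valid JSON"
```

With a real template, `packer build template.json` launches a temporary EC2 instance, runs the provisioners on it, and registers the result as an AMI.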
An AMI may include multiple Docker images that it has to run, but as a rule of thumb there is always one “main” Docker image (typically the one from which the AMI name is derived); the others are sidecars like logging and monitoring.
The machine image is based on a common distribution (typically Ubuntu 16.04 LTS). The Packer build adds whatever other software is needed, such as Docker and Docker Compose. Docker Compose is used to orchestrate the Docker containers that are to run on the instance of the machine image. The containers are specified in /bootstrap/docker-compose.yml; with docker-compose up -d they can be created and started on an instance of the machine image. During the Packer build, the Docker images are already pulled into the AMI using docker-compose pull. This makes the machine image more complete and consistent, lets instances start up faster, and means we need credentials for our Docker registry only at build time.
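A sketch of what such a /bootstrap/docker-compose.yml might look like, with one “main” service plus logging and monitoring sidecars. The image names and registry host are invented; the compose commands in the trailing comments are the ones described above.

```shell
# Write an illustrative compose file; image names/registry are placeholders.
mkdir -p bootstrap
cat > bootstrap/docker-compose.yml <<'EOF'
version: '2'
services:
  my-service:                       # the "main" image the AMI is named after
    image: registry.example.com/my-service:1.0.0
    ports:
      - "8080:8080"
    restart: always
  logging:                          # sidecar: ship logs off the instance
    image: registry.example.com/log-forwarder:1.0.0
    restart: always
  monitoring:                       # sidecar: metrics collection
    image: registry.example.com/monitoring-agent:1.0.0
    restart: always
EOF
# During the Packer build (registry credentials available only then):
#   docker-compose -f bootstrap/docker-compose.yml pull
# On an instance of the machine image:
#   docker-compose -f bootstrap/docker-compose.yml up -d
```

The `restart: always` policy means the Docker daemon brings the containers back after crashes and reboots, without any extra init-script plumbing.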
The Packer build may add other software or scripts, such as to do health checks, make regular snapshots of storage volumes, add users, etc.
Provisimus is an in-house command-line tool, built in Python. Provisimus has a number of management tasks, but for this discussion we’re mostly interested in its ability to create stacks in AWS. These stacks include the instance of the machine image, as well as other resources that are needed for the type of stack in question, such as databases, security, storage volumes, network devices and routes, etc. Provisimus generates an AWS CloudFormation template and submits it, so that the stack in question is created as outlined in that template. Creating an instance using Provisimus is typically a manual task for the moment, although Provisimus knows how to create auto-scaling groups (for those stack types where that makes sense).
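For a feel of what Provisimus generates and submits, here is a heavily trimmed sketch of such a CloudFormation template. The resource name, AMI id, and instance type are invented; a real template also carries the security groups, storage volumes, network routes, and so on mentioned above.

```shell
# Write an illustrative CloudFormation template; all values are placeholders.
cat > stack.json <<'EOF'
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "my-service stack (illustrative)",
  "Resources": {
    "MyServiceInstance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": "ami-xxxxxxxx",
        "InstanceType": "t2.medium"
      }
    }
  }
}
EOF
# Submitting it would look roughly like this (not run here):
#   aws cloudformation create-stack --stack-name my-service \
#     --template-body file://stack.json
python3 -m json.tool stack.json > /dev/null && echo "stack template is valid JSON"
```

CloudFormation then owns the lifecycle of everything in the template, so tearing a stack down removes all of its resources together.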
When I google “codifying”, I end up in law and legal land. But for us it means making sure that steps are captured in programs and configuration files that are stored in a version control system, and that can be executed repeatedly in a reliable manner. When creating an instance, its environment is defined in Provisimus, the instance itself is codified using Packer, and the application is wrapped in a Docker image, all of which can be found in their respective Git repositories. At the level of instances, little should happen manually anymore, beyond trying something out, codifying it, and pushing your Git commit before you head home.
Using Docker, Packer and our in-house CloudFormation tool, we created a reliable, codified pipeline for building and deploying service instances in AWS consistently.