How To Manage Releases in the Age of Continuous Deployment


As the Release Manager at Jama, I feel compelled to write about this topic because although much of the release process at Jama currently requires a lot of human involvement and manual action, one of Jama’s Engineering goals is to move closer to Continuous Delivery of our software. I’ve been forced to ask myself whether my job, and the role of Release Management, will become less relevant as we move in this direction. Indeed, in companies that are actually practicing Continuous Deployment, what role, if any, does Release Management play in their process?

First, some clarification of terms. Many people use the terms “Continuous Delivery” and “Continuous Deployment” interchangeably, but there is a crucial difference between them. For a concise explanation of this topic, see Martin Fowler’s second keynote video on this page: Software Development in the 21st century | ThoughtWorks.

In a nutshell:

  • Continuous Delivery is having all of the infrastructure and verification in place that is needed in order to push updates to a production environment at any time, i.e. every merge to mainline is a “release candidate”.
  • Continuous Deployment is actually doing it, i.e. pushing updates into production as soon as they become available.

Both Continuous Delivery and Continuous Deployment rely on a third concept:

  • Continuous Integration is a software development practice in which all developers commit to the mainline code branch every day, every commit is built automatically by an integration server and subjected to an automated battery of tests, and broken builds are fixed immediately. (There are a few additional tenets of CI, described here.)

Another clarification: Continuous Deployment may not actually mean that every merge to mainline is deployed to production. There are successive quality “gates” or thresholds which must be met in order for a build to be considered ready for production deployment. Imagine it this way: There are 10 merges to mainline within the past 24 hours. Each of these is compiled and subjected to unit and integration tests. Let’s say that 8 of these builds pass the first quality gate. Each of the passing builds then undergoes another quality gate, say static code analysis and automated regression tests, and only 6 of the builds pass. This subset of builds is then subjected to a range of performance tests, and only 4 of the builds exhibit the performance characteristics desired. Eventually, the Engineering organization has confidence that since the build has passed a certain number of these quality gates, it is ready to be deployed to production as a canary release. Of these canary releases, it is possible that only a few will be deemed stable and performant enough to finally roll out to the rest of the users via blue-green deployment. Continuous Deployment does not promise that every merge to mainline goes to production; but it does promise that the process of testing and winnowing down the builds through successive quality gates is automated, and that all aspects of the process are traceable, reproducible and measurable.
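The winnowing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Build` fields, the gate names and the 250 ms latency threshold are hypothetical and do not describe any real pipeline. The point is that each gate is an automated check, and that the number of builds entering and passing every gate is recorded so the process stays traceable and measurable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Build:
    """One mainline build plus the results collected for it (hypothetical fields)."""
    commit: str
    tests_passed: bool        # unit and integration tests
    regression_passed: bool   # static analysis and automated regression tests
    p95_latency_ms: float     # result of a performance test run


# A gate is a name plus a pass/fail check. Order matters: cheap checks run first.
Gate = Tuple[str, Callable[[Build], bool]]

GATES: List[Gate] = [
    ("unit+integration", lambda b: b.tests_passed),
    ("static-analysis+regression", lambda b: b.regression_passed),
    ("performance", lambda b: b.p95_latency_ms <= 250.0),  # illustrative threshold
]


def winnow(builds: List[Build], gates: List[Gate] = GATES):
    """Run builds through the gates in order.

    Returns the builds that survived every gate and a per-gate report of
    how many builds entered and how many passed.
    """
    survivors = list(builds)
    report: List[Dict[str, object]] = []
    for name, check in gates:
        entered = len(survivors)
        survivors = [b for b in survivors if check(b)]
        report.append({"gate": name, "entered": entered, "passed": len(survivors)})
    return survivors, report


if __name__ == "__main__":
    # Ten merges in 24 hours, mirroring the example in the text.
    builds = [
        Build(f"c{i}", i >= 2, i >= 4, 200.0 if i >= 6 else 400.0)
        for i in range(10)
    ]
    candidates, report = winnow(builds)
    for row in report:
        print(row)
    print("canary candidates:", [b.commit for b in candidates])
```

With the sample data, ten builds enter the first gate and eight, six and four builds survive the three gates respectively. The four survivors are the candidates for a canary release.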

The benefits of Continuous Delivery and Continuous Deployment are multifold, but the biggest benefit is the ability to deliver value to users quickly and reliably. This is attractive to business stakeholders, who want to satisfy customers’ needs, as well as to engineers, who receive prompt feedback about the behavior of their code in production. Additionally, in order to move at this pace, all the steps in the build-test-deploy pipeline must be automated, and all inputs to the pipeline (e.g. configuration, code, deployment scripts) must be in version control. Creating and delivering a “release” of software becomes trivial, since each build of mainline is a potential release.

To what extent does Jama do Continuous Delivery? The first step towards Continuous Delivery is to practice Continuous Integration, and Jama has been practicing CI for a couple of years already. Our Developers merge directly to our mainline code branch multiple times a day. A job scheduler (in our case TeamCity) monitors our Git repos, and any time a new commit appears it automatically kicks off a new build, which includes unit and integration testing. The results of the build are fed to a very visible “radiator board” or “build board” that is green as long as the builds pass and immediately turns red when a build fails. When a build fails, all developers are sent an e-mail and no further commits are allowed to mainline until the build is fixed. Fixing a broken build is the highest priority for Developers.
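The broken-build policy above can be modelled as a small state machine. The sketch below is a hypothetical model for illustration only. It does not use any TeamCity API, and the class name, the `fixes_build` flag and the message format are all assumptions. In particular, the text does not say how the fixing commit is let through a closed mainline, so the flag is one possible answer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MainlineGate:
    """Tracks whether mainline is open for merges (hypothetical model)."""
    developers: List[str]
    broken_by: str = ""                              # commit that broke the build; "" when green
    outbox: List[str] = field(default_factory=list)  # stands in for sent e-mails

    @property
    def board_color(self) -> str:
        """Color shown on the build board."""
        return "red" if self.broken_by else "green"

    def record_build(self, commit: str, passed: bool) -> None:
        """Update state after the build server finishes building a commit."""
        if passed:
            self.broken_by = ""                      # a green build reopens mainline
        elif not self.broken_by:
            self.broken_by = commit
            for dev in self.developers:              # every developer is notified
                self.outbox.append(f"to {dev}: build broken by {commit}")

    def may_merge(self, fixes_build: bool = False) -> bool:
        """While the build is red, only a commit that fixes it may be merged."""
        return not self.broken_by or fixes_build
```

For example, after `gate.record_build("abc123", passed=False)` the board is red, `gate.may_merge()` is false for ordinary commits, and mainline reopens once a later build passes.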

However, Jama is not yet able to practice Continuous Deployment, from a technical standpoint. Firstly, we can’t deploy multiple builds to production each day because, given Jama’s current architecture, most upgrades of Jama require us to take down a large part of the system for maintenance while we replace a .war file and restart Tomcat. Continuous Deployment at Jama will not be technically possible until the core Jama application supports load-balancing, and updates to Jama and all its services can be delivered live via blue-green deployments.
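For readers unfamiliar with the term, the idea behind a blue-green deployment is that two identical environments exist side by side: the new build is installed on the idle one while users stay on the live one, and traffic is switched only after the new build checks out. The following is a minimal sketch of that idea, not Jama's implementation. The `BlueGreenRouter` class, the environment names and the `healthy` callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Two environments; exactly one receives user traffic at any time."""
    versions: Dict[str, str]   # environment name -> deployed version
    live: str = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def release(self, version: str, healthy: Callable[[str], bool]) -> bool:
        """Deploy to the idle environment and switch traffic if it is healthy."""
        target = self.idle
        self.versions[target] = version   # the live environment is untouched
        if not healthy(target):
            return False                  # users never saw the bad build
        self.live = target                # cut over: no maintenance window needed
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment."""
        self.live = self.idle


router = BlueGreenRouter({"blue": "1.0", "green": "1.0"})
router.release("1.1", healthy=lambda env: True)   # green now serves 1.1
```

Because the previous version keeps running on the other environment, rolling back is just another traffic switch, which is why this pattern depends on the load-balancing support mentioned above.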

Moreover, we have to consider the impact of Continuous Deployment on our internal validation process, as well as the impact on our highly regulated customers. With our recent successful certification by TÜV SÜD for ISO 26262 fit for purpose, Jama is the first vendor that is both SaaS and Agile to receive this certification. Each monthly Jama release undergoes a rigorous validation process and is re-certified by TÜV SÜD. This validation and certification process has not yet been adapted to Continuous Deployment.

We are still a little ways off from even achieving true Continuous Delivery across the core Jama codebase. Our unit-, integration-, and performance-test results are not collected and reported on a per-merge-request basis. Instead, the results are scattered between different tools, so identifying the “health” or “stability” of a given build requires manual effort. Performance testing is not automatically run against each merge; it currently must be planned and executed against specific builds. Our regression test library is steadily getting automated, but some of the functional testing is still manual and must be run at the end of a development cycle during a hardening phase.

Although we don’t yet practice Continuous Delivery across all of our components, we have started to implement patterns of best practice in some services. Our Search Service and our Elasticsearch clusters are already containerized and treated as microservices, and each new change to these services is tested and released. Although we don’t actually deploy every release of these services into production, we have the option and the tooling needed to do so. The release cadence for the Jama core application is based on a predictable monthly release schedule, rather than a content-driven schedule. We release monthly to our Hosted and Express customers, and from the time a merge is made to mainline, it is never longer than 6 weeks before that change makes it to production. Our goal, though, is to reduce our time to market significantly over the next year. We are making excellent progress on automating our regression master library and making it a part of our build pipeline. In addition, we are close to having a fully automated performance testing suite that will run against the nightly build, against both MySQL and SQL Server. Our scalable platform team is working hard to update Jama’s architecture to make canary analysis and blue-green deployments on our core application possible, giving us the opportunity to push more frequently without disruption to our end users.

All this talk of Continuous Delivery is well and good for SaaS products, but Jama is also available as an on-premises installation. Is it possible to do Continuous Delivery for On-premises releases? In fact, it is no different from doing Continuous Delivery for Hosted. We use a technology called Replicated, which is built on Docker containers, to deliver our on-premises releases to customers. Continuous Delivery for On-premises would just mean that we always have the Docker containers built and ready to be promoted to our customers, and Engineering hands over the decision of when to promote a build to Product Management.

However, attempting true Continuous Deployment for On-premises releases is unrealistic. Experience shows that customers in highly regulated industries are unable to consume Jama updates more frequently than once a month at most, and many are unable to consume updates more frequently than once a year. This means that Jama will continue to have regular, packaged and fully contained monthly on-premises releases for the foreseeable future, with all the corresponding ceremony that entails.

So is Release Management relevant at a company that is practicing Continuous Delivery or Continuous Deployment? I would argue that for any company that is moving in this direction, a Release Manager should be included in the process of growing and learning how to practice Continuous Delivery in a responsible way. The Release Manager’s job may transition to involve fewer traditional Project Management activities, and more advocacy, technical contribution and supervision of the release-deployment pipeline. The Release Manager should help keep traceability, reproducibility and measurability of all releases consistent at any stage of the Continuous Delivery spectrum, and establish the patterns necessary for sustained use (that is, “automate all the things”). Various technical challenges will arise, including but not limited to:

  • determining a sound versioning scheme for rapid releases;
  • helping to design and populate a Configuration Management system which contains exact information about the versions and specifications of production hardware, software and services;
  • guaranteeing that the versions of all the services deployed together in production have been verified to work together;
  • understanding and broadly communicating the criteria for a “successful” release and deployment, and monitoring production systems to ensure these criteria are met;
  • helping to design and implement automatic rollback strategies for unsuccessful canary releases;
  • working across departments to align around an Incident Management Process, and representing the release- and deployment-related aspects of addressing internal and external escalations.

The Release Manager may also work with Engineering and other internal customers to determine the pain points in the existing Continuous Delivery tooling, and help to inform requirements and acceptance criteria for changes and additions to the tooling and automation.

Even if it’s not a single person performing the role of Release Management, the function of Release Management is absolutely essential when doing Continuous Delivery and Continuous Deployment, to ensure continued end-user delight.
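To make one of the challenges listed above concrete, here is a rough sketch of what an automatic rollback rule for a canary release might look like. It is an assumption-laden illustration, not a recommended policy: the `Metrics` type, the function name and all three thresholds (`min_requests`, `max_ratio`, `abs_floor`) are hypothetical, and a real rule would likely consider latency and other signals as well as error rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Request and error counts observed over the same time window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(
    baseline: Metrics,
    canary: Metrics,
    min_requests: int = 500,    # do not judge the canary on too little traffic
    max_ratio: float = 1.5,     # canary may be at most 1.5x worse than baseline
    abs_floor: float = 0.001,   # tolerated error rate even if baseline is perfect
) -> str:
    """Return "wait", "promote" or "rollback" for a canary release."""
    if canary.requests < min_requests:
        return "wait"
    allowed = max(baseline.error_rate * max_ratio, abs_floor)
    return "promote" if canary.error_rate <= allowed else "rollback"


baseline = Metrics(requests=10_000, errors=50)
print(canary_decision(baseline, Metrics(requests=1_000, errors=20)))
```

Writing the criteria down as code is one way to make the definition of a “successful” deployment explicit, reviewable and versioned alongside the rest of the pipeline.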

Release Management is constantly evolving as internal engineering infrastructure and deployment best practices themselves change. Any given solution today will inevitably need to be revisited and updated as a company scales or the technology changes. Although it would theoretically be possible to “automate yourself out of a job” in the short term, the state of the art is changing so rapidly that by the time one part of a system is automated and performant enough for most users, it’s time to revisit and re-optimize another system or component. A Release Manager’s job is never done; a former colleague of mine once jokingly referred to it as a Sisyphean effort (referring to the Greek king who was punished by the gods, doomed to roll a large boulder up a hill only to watch it roll back down, and to repeat this for eternity). But as the release process at Jama improves and matures, the organization as a whole is able to climb its mountains faster, more easily and with more confidence.