Jama Debating Scalability
Like many maturing companies, Jama found itself in a situation where its monolithic software architecture prohibited scaling. Scalability here is a catch-all for many quality attributes, such as maintainability across a growing team, performance, and true horizontal scalability. The solution was simple, at least on paper: our software had to be split up. We are talking late 2013, microservices are taking off, and a team starts carving out functions of the monolith into services that can be deployed separately in our emerging SaaS environment. We are a SaaS company, after all. Or we are a SaaS company first. Or, well, we are a SaaS company that deeply cares about those on-premises customers that don’t move to the cloud… yet… for a variety of reasons, whether we like it or not.
Planning our Strategy
Will we keep on delivering the full monolith to on-premises customers, including those parts we deploy separately in SaaS? That would be a pretty crappy economic proposition for us, as we’d essentially be building and then testing everything twice. On-premises customers would not get any of the scaling benefits of the services architecture, nor could the engineering team really depart from the monolithic approach that was slowing them down. (On a side note: as a transitional solution we used this approach for a little while, and be assured that there’s little to love there.)
Then, will we deliver a monolith to on-premises customers that lacks a growing number of features, offering those as a value-add in SaaS, perhaps? That works… up to a point. We currently have services like SAML, OAuth, and centralized monitoring in our SaaS environment that aren’t available to our on-premises customers. They let us get away with that. But there are only so many services you can carve out before hitting something that’s mission-critical to on-premises customers.
The only solution that makes sense: bring the services to the on-premises customers. (For completeness’ sake: there was this one time someone proposed not supporting on-premises installations anymore. They were voted off the island.)
So, services are coming to an on-premises near you.
Implications of Services
Huge. The implications are huge, in areas such as the following:
- Strategy. Since 2010 we have been trying to focus on our SaaS model and, in turn, to drive our customers to our hosted environment. The reality is that our customers are slow to adopt, which requires us to refocus on the on-premises deployment. That is okay, and there’s no reason we can’t do both, but it’s sobering to pull yourself back after so much focus went into “being more SaaS” (which came with the good hopes of a gradual transition of (all) customers to the cloud).
- Architecture. Our SaaS environment has a lot of bells and whistles that make no sense for on-premises customers, and it relies on a plethora of other SaaS providers to do its work. This needs to be scaled down, and scaled down in a way that keeps the components usable both for on-premises customers and in the SaaS environment.
- Usability. We come from WAR deployments, where a single WAR archive is distributed and loaded into a standardized application server (specifically Apache Tomcat), which is all relatively easy. We are now moving to a model with multiple distribution artifacts, which also need to be orchestrated to run together as one Jama application.
- Culture. There was a lot of established thinking that had to be overcome, in fairly equal parts by ourselves and by our customers. I mean, change: there are plenty of books on change, and on how it’s typically resisted.
Within Engineering (which is what I’ll continue to focus on), I’ve been involved in ongoing discussions about a deployment model for services going back to 2014. One of the early ideas was to just bake a scaled-down copy of our SaaS environment into a single virtual machine (with some flavors spanning multiple virtual machines to support scalability). Too many customers just outright reject the notion of loading a virtual machine that is not (fully) under their control into their environment. Such a virtual machine would be unlikely to follow all of our customers’ IT requirements, and it would lead to a lot of anxiety around security and the ability to administer this alien. So, customers end up running the services on their own machines.
That quickly leads to another constraint. Administrators at our customers traditionally needed one skill: being able to manage Apache Tomcat running Jama’s web archive (WAR) file. While we have an awesome team of broadly skilled, DevOps-minded engineers working on our SaaS environment, we can’t expect such ultra-versatility from every lone Jama administrator in the world. We needed a unified way to deploy our different services. This is an interesting discussion to have at a time when your Engineering team still mostly consists of Java developers and DevOps is still an emerging capability (compared to the mindset of marrying development and operations that is now more and more being adopted by Jama Engineering). We had invested in a “services framework”, which was entirely in Java, using the (may I say: amazing) Spring Boot, and “service discovery” was dealt with using configuration files inside the Java artifacts (“how does service A know how and where to call service B?”). It was a culture shift to collectively embrace the notion that a service is not a template for a Java project, but a common language for tying pieces of running code together.
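To make that file-based approach concrete, the idea was roughly that each service ships with a configuration file that hard-codes where its peers live. A minimal sketch, with made-up property keys, service names, and ports rather than our actual configuration:

```yaml
# application.yml baked into service A's artifact (illustrative only).
# Service B's location is fixed at build/packaging time rather than
# discovered at runtime.
service-b:
  base-url: http://localhost:8081/api
  connect-timeout-ms: 5000
```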
Docker and Replicated
In terms of deploying services, we discussed contracts for how to start/stop a service (“maybe every service needs a folder with predefined start/stop scripts”). We discussed standardized folder structures for log files and configuration. Were we slowly designing ourselves into Debian packages (dpkg, apt) or RPM packages (yum), the default distribution mechanisms for the respective Linux distributions? What could Maven do for us here? (Not a whole lot, as it turns out.) And how about this new thing…
This new thing… Docker. It was very new (remember, this was 2014; Docker’s initial release was in 2013, and the company had changed its name to Docker, Inc. as recently as October 2014). We dismissed it, and kept talking in circles until the subject went away for a while.
Early in 2015, coincidentally roughly around the time we created the position of DevOps Manager, we got a bunch of smart people in a room to casually talk about perhaps using Docker for this. There was nothing casual about the meeting, and it turned out that we weren’t prepared to answer the questions people would have. We were mostly talking from the perspective of the Java developer, with their Java build, trying to produce Docker images at the tail end of that build, ready for deployment. We totally overlooked the configuration management involved outside of our world of Java, and the tremendous amount of work there that we weren’t seeing. And in retrospect, we must have sounded like the developer stereotype of wanting to play with the cool, new technology. We were quickly cornered by what I will now lovingly refer to as an angry mob: “there is not a single problem [in our SaaS environment] that Docker solves for us”. I’m way cool about it now, but that turned out to be my worst week at Jama, by a distance. Things got better. We were able to create some excitement by using Docker to improve the way we were doing continuous automated system testing. We needed some help from the skeptics, which gave them a chance to start adjusting their views. We recruited more DevOps folks, with Docker in mind while hiring. And we did successful deployments with Docker for some of our services. We were adopting this new technology. But more importantly, we were slowly buying into the different paradigm that Docker offers, compared to our traditional deployment tools (WAR files, of course, and we used a lot of Chef).
We were also telling our Product Management organization about what we were learning. How Docker was going to turn deployments into liquid gold. How containers are different from virtual machines (they are). They started testing these ideas with customers. And toward the second half of 2015 the lights turned green. Or… well… some yellowish, greenish kind of color. We were scared of the big unknowns: will we be able to harden it for security, is it secure, will customers believe it is secure? But also: will it perform as well as we expect? How hard will it be to install?
One of the prominent questions was still around the constraint I mentioned earlier: how much complexity are we willing to impose on our customers? Even today, Docker is fairly new, and while there is a growing body of testimony around production deployments, not all of our customers are necessarily at that forefront. First of all, Docker means Linux, whereas we had traditionally also supported Windows-based deployments. (I believe we even supported OS X Server at some point.)
Secondly, the fear was that customers would end up managing a complex constellation of Docker containers. We had been using Docker Compose a bit for development purposes by then, and that at least let us define the configuration of Docker containers (which I like to refer to as orchestration), but we’d have to write some scripts (a lot?) to do the rest. Around that time we were introduced to Replicated, with which we did some experiments and a cost-benefit analysis. It let us orchestrate our Docker containers and manage the configuration of the deployment, all through a web-based user interface installed on-premises. Not only would it offer a much more user-friendly solution, it would also take care of a lot of the orchestration pain, and we decided to go for it.
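To give a sense of what that Compose-defined configuration looks like, here is a minimal sketch of the kind of thing we would otherwise be asking administrators to manage by hand; the service names, image names, and paths are made up for illustration, not our actual artifacts:

```yaml
# docker-compose.yml (illustrative sketch, not Jama's real topology)
core:
  image: example/jama-core:8.0
  ports:
    - "8080:8080"        # web application exposed on the host
  links:
    - search             # lets the core container reach the search container
  volumes:
    - /opt/jama/logs:/home/jama/logs   # keep logs on the host
search:
  image: example/jama-search:1.0
```

With Replicated, much of this orchestration and deployment configuration is handled for us through its web-based console instead of files like the one above.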
Past the Prototype
The experiments were over, and I formally rolled onto the actual project on November 11th, 2015. We were full steam ahead with Docker and Replicated. Part of the work was to turn our proof of concept into mature production code. This turned out not to be such a big deal: we know how to write code, and Docker is just really straightforward. The other part of the work was dealing with state. Docker containers are typically stateless, which means that any kind of persisted state has to go outside of the container. Databases, data files, even log files need to be stored outside of the container. For example, you can mount a folder location of the host system into a Docker container, so that the container can read and write that folder location.
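As a rough illustration of that pattern, a bind mount keeps the data on the host while the container itself stays disposable. The host paths and image name below are hypothetical, not our actual layout:

```sh
# Run a hypothetical Jama container with its persisted state bind-mounted
# from the host, so data and logs survive container upgrades and restarts.
# Host paths (/opt/jama/...) and the image name are illustrative only.
docker run -d \
  -v /opt/jama/data:/home/jama/data \
  -v /opt/jama/logs:/home/jama/logs \
  example/jama-core:8.0
```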
Then the realization snuck up on us that customers had been making a lot of customizations to Jama. We had anticipated a few, but it turned out that customers had hacked our application in all sorts of ways: sometimes as instructed by us, sometimes entirely on their own. It was easy enough to look inside the (exploded) WAR file and make a few changes. They had changed configuration files and JavaScript code, and even added completely new Java files. With Docker that would not be possible anymore, so we dealt with many such customizations, coming up with alternative solutions for all the ones we knew of. Some configuration files can once again be changed, by storing them outside of the container; some options have been lifted into the user interface that a root user in Jama has for configuring the system, storing them in the database; and sometimes we decided that a known customization was undesired, and we chose not to support it. By doing all that, we are resetting and redefining our notion of what is “supported”, and hopefully we’ll have a better grasp, going forward, on the customizations that we support. And with it, we ended up building a lot, a lot of the configuration management that we had initially underappreciated.
Ready for the Next Chapter
Meanwhile, we are now past an Alpha program and a Beta program, and while I’m writing this we are code complete and in excited anticipation of the General Availability release of Jama 8.0. We have made great strides in Docker-based configuration management and learned a lot, which is now making its way back into our SaaS environment, while the SaaS environment has seen a lot of work on horizontal scalability that will be rolled into our on-premises offering in subsequent releases (the pendulum constantly swinging). While I’m probably more of a back-end developer, and while “installers” probably aren’t the sexiest thing to be working on, it was great to work on this project: we are incorporating an amazing technology (Docker), and I’m sure that our solution will be turning some heads!