Continuous Delivery - From Theory to Practice
Posted By: Yaron Parasol on March 15, 2015
View the full article with its original diagrams on DZone in the DevOps Zone or download the DZone 2015 Guide to Continuous Delivery.
Any discussion of continuous delivery (CD) should start with the motivation behind this IT evolution. The demand for continuous delivery was brought on by businesses’ need for faster time to market, with agility as the means of achieving it. Manual delivery used to involve running a compiler by hand, creating a binary, copying it manually to a server, and then restarting that server. Time to market in such scenarios was long and the process was complex, but more importantly, it was error prone because of all the manual involvement.
Two other important driving factors are the need for tighter, faster feedback loops and the reduction of work in progress (or work in process), known as WIP. WIP is a term from the inventory world that refers to a company's partially finished goods waiting for completion and eventual sale. In the interim, goods that cannot reach the market just eat up company resources such as storage and tied-up capital.
This is why agile methodology, with sprints and scrum, got started in the first place. It in turn led to the development of continuous integration (CI), which automates build processes to speed up delivery and minimizes human involvement to prevent errors. However, even with CI, most deployments remained manual. At best, parts of the deployment process were automated, but only in really small, closed environments that were basically non-configurable.
However, if the ultimate goal is to get a new feature that a developer writes into production in a matter of hours rather than months, there was still a gap to fill between agile development and CI on one side, and actually pushing frequent, small updates to production on the other.
Getting the code to production is where continuous delivery comes in, completing the cycle from development to market. To do that, you want to be able to push a minimal scope of new code in a way that is easily testable and has limited exposure, monitor it closely, and roll it back if it’s not good.
Continuous Integration in the Real World
While the motivation and methodology are generally clear, the actual execution is another story entirely. With CI/CD we’re talking about a streamlined process that is quite complex to build. We want to eliminate human intervention, or at the very least minimize it significantly, and we want to extend the automation all the way to production, all while combining it with testing and intelligent monitoring.
CI was essentially the starting point, so let’s take a look at CI first, then at where we need to get to, and at the ideal way to bridge that gap.
Diagram 1 - Typical CI Process
[It’s very important to note that there's a lot of unit and integration testing involved between steps #3 & #4, before anything gets published or pushed to production, to ensure nothing breaks and everything is functioning as it should.]
IaaS and Containers in the CI Process
After you’ve built a working automated build process, there is another important challenge to overcome before being able to push to production: controlled environments. This is basically the primary reason IaaS and containers have become very popular in the world of CI. The three leading challenges with controlling environments have been working in clean environments, ensuring proper utilization of resources, and parallelizing processes, for the following reasons:
“Dirty” machines (those with packages already installed on them, with unclean configuration and files spread around) are bound to cause inconsistencies with running tests and building binaries.
Environments that are set up manually tend to stay occupied unnecessarily for extended periods of time, so others cannot use them. This leads to buying additional hardware and software to make up for the apparent lack of resources.
Parallelizing the process means spinning up a set of VMs/containers and running a different test suite or build on each.
For these reasons, it has become popular to use IaaS for typical integration and testing. With IaaS you can provision the on-demand resources you need (compute, storage, and network) within just a few minutes, and if something goes wrong you can easily wipe the entire environment and start all over again.
Of course, the ideal way to do this is to automate these processes. You can choose the do-it-yourself method of provisioning with scripts and then installing the application stack using configuration management tools, or even more scripts. This means starting the application components, cloning a git repo, and running the build process, which yields a binary or package with the updated code. The binary is really only half the process, though: you then need to install it on the environment you just created and run the integration tests.
The benefit of this is a more robust process that is less error prone, because human involvement is minimized.
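The scripted flow described above (clone, build, install, run the integration tests) can be sketched as a small pipeline runner. This is only a minimal illustration: the step names and the `PIPELINE` list are hypothetical, and each command just calls the Python interpreter so the sketch runs anywhere. In a real setup the commands would be your git, build, install, and test commands.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

# Hypothetical pipeline: each step is a name plus the command to run.
# These placeholder commands only print a message; real ones would
# clone the repo, build the package, install it, and run the tests.
PIPELINE: List[Tuple[str, List[str]]] = [
    ("clone", [sys.executable, "-c", "print('cloning repository')"]),
    ("build", [sys.executable, "-c", "print('building package')"]),
    ("install", [sys.executable, "-c", "print('installing package')"]),
    ("integration-tests", [sys.executable, "-c", "print('running tests')"]),
]


def run_pipeline(steps: Sequence[Tuple[str, List[str]]]) -> List[str]:
    """Run the steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # A failed step stops the pipeline so nothing broken moves on.
            print(f"step '{name}' failed: {result.stderr.strip()}")
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    print(run_pipeline(PIPELINE))
```

Stopping at the first failing step is the important design choice here: a package that did not build should never reach the install or test stage.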
Automating on top of IaaS solves the issue of clean environments and enables parallelization, since resources are on demand. The disadvantage of using IaaS is, again, the under-utilization of resources. That’s where containers come into the mix.
Containers basically provide the same benefits as VMs, but are more lightweight: if a VM takes minutes to load, a container can be spawned pretty much instantaneously. A container is a fresh environment, utilizes minimal resources, and enables parallelization. What’s more, since it’s so lightweight, binaries are also released much more quickly.
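As a rough sketch of what this looks like in practice, the snippet below builds one `docker run --rm` command per test suite and runs the suites in parallel. The image name `example/ci-runner:latest` and the in-container `run-tests` command are assumptions made for illustration. The `fake_runner` function stands in for a real process launcher so the example runs without Docker installed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

IMAGE = "example/ci-runner:latest"  # hypothetical image name


def container_command(suite: str, image: str = IMAGE) -> List[str]:
    """Build a `docker run` command that executes one test suite.

    `--rm` removes the container when it exits, so every run starts
    from a clean environment. `run-tests` is a hypothetical entry
    point assumed to exist inside the image.
    """
    return ["docker", "run", "--rm", image, "run-tests", suite]


def run_suites(
    suites: List[str],
    runner: Callable[[List[str]], int],
    max_parallel: int = 4,
) -> Dict[str, bool]:
    """Run every suite in its own container, several at a time.

    `runner` receives the command and returns its exit code.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map keeps the results in the same order as `suites`.
        exit_codes = list(pool.map(lambda s: runner(container_command(s)), suites))
    return {suite: code == 0 for suite, code in zip(suites, exit_codes)}


def fake_runner(command: List[str]) -> int:
    """Stand-in runner: pretends the suite named 'flaky' fails."""
    return 1 if command[-1] == "flaky" else 0


if __name__ == "__main__":
    print(run_suites(["unit", "api", "flaky"], fake_runner))
```

To run against real containers, `subprocess.call` could be passed as the `runner`, since it takes a command list and returns the exit code.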
This is where CI processes become interesting: a new form of CI that combines application CI with automation CI.
Diagram 2 - Combined CI Process
Now let’s take a look at CI in yet another flavor, the containerization flavor.
Diagram 3 - Containerized CI Process
There is also the need for container orchestration, to time the creation of the containers and tie the different application components together. However, that is a whole post unto itself, so I will just suggest further reading on container / Docker orchestration.
Continuous Delivery...From Theory to Practice
After we’ve taken a look at typical CI, and then focused on CI in an IaaS environment or with a containerization flavor, we can segue into CD.
So you would assume that after you have the application packaged and tested, it should be pretty easy to deploy to production, right? Well, the answer is actually no. Unfortunately, this has proven quite difficult in real production environments.
There are several obstacles when it comes to zero time deployments.
After you’ve automated your build process, it’s not just a matter of automatically deploying your code to production. For starters, the build process does not run in production, so it ultimately carries no business risk; deployment does. Therefore, the process behind a deployment matters more than the actual pressing of the button, whether you press it once at the end of the day or a thousand times a day. Before you can achieve this level of continuous delivery, you need to make sure that you aren’t jeopardizing your system, your users, and ultimately your business in the process.
This means that you need to know what makes the entire system work and where the potential problems are, and make sure that your deployment stack and new package take all this into consideration in advance. To do so, you need to properly test your code before deploying to production, and then, after your code is deployed, make sure you are actually monitoring the right things. Monitoring the “right things” means being aware of the changes being made and how they can affect the system as a whole. For example, if the deployment contains a change to your database schema, you need a process in place that ensures the schema stays intact when you deploy (and that’s just one small example).
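One possible form of such a schema safeguard is a check, run right after deployment, that the database still has every column the new release relies on. The sketch below uses SQLite from the Python standard library purely for illustration. The `users` table and the `EXPECTED_COLUMNS` mapping are hypothetical, and a real system would run the equivalent query against its own database engine.

```python
import sqlite3
from typing import Dict, Set

# Hypothetical expectation: the columns the new release relies on.
EXPECTED_COLUMNS: Dict[str, Set[str]] = {
    "users": {"id", "email", "created_at"},
}


def missing_columns(
    conn: sqlite3.Connection, expected: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Return, per table, the expected columns that the database lacks."""
    problems = {}
    for table, columns in expected.items():
        # Table names come from our own trusted mapping, not user input.
        # PRAGMA table_info yields one row per column; index 1 is the name.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        actual = {row[1] for row in rows}
        missing = columns - actual
        if missing:
            problems[table] = missing
    return problems


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    print(missing_columns(conn, EXPECTED_COLUMNS))  # created_at is reported
    conn.execute("ALTER TABLE users ADD COLUMN created_at TEXT")
    print(missing_columns(conn, EXPECTED_COLUMNS))  # nothing is reported
```

A non-empty result would be a signal to stop the rollout or roll back before users hit the mismatch.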
Monitoring deployments is a whole bible unto itself, so it would be too lengthy to discuss in detail here. However, it is probably the most important factor, since the ability to deploy 1000 times a day is worthless if you are unable to monitor how these changes affect your system. Let’s be honest: you’re not looking to deploy new features a thousand times a day. Rather, the ability to deploy a thousand times a day lets you fix things really fast and make small changes quickly.
That’s why you need the entire process set up in a way that lets you quickly understand exactly what’s going on in your system. Only then will you have the ability to deploy as many times as you want. This means monitoring the right places and the right KPIs and metrics (CPU, load, memory) to see whether there is any performance degradation. Once you reach the level of numerous deployments a day, it also means watching for gradual performance degradation, which is often overlooked. This is just the tip of the iceberg.
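To illustrate why gradual degradation is easy to miss, here is a minimal sketch that compares a latency figure recorded after each deployment both against the previous deployment and against the first one. The function name, the sample numbers, and the 10% and 20% thresholds are all assumptions chosen for the example, not recommended values.

```python
from typing import Dict, List, Union


def regression_report(
    latencies_ms: List[float],
    step_limit: float = 0.10,
    drift_limit: float = 0.20,
) -> Dict[str, Union[List[int], float, bool]]:
    """Summarize latency changes across a series of deployments.

    `sudden` lists the deployments that were more than `step_limit`
    slower than the one before. `drift` is the total relative change
    since the first deployment. `gradual_only` is True when the total
    drift is too high even though no single step raised an alarm.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two measurements")
    baseline = latencies_ms[0]
    sudden = [
        i
        for i in range(1, len(latencies_ms))
        if latencies_ms[i] > latencies_ms[i - 1] * (1 + step_limit)
    ]
    drift = (latencies_ms[-1] - baseline) / baseline
    return {
        "sudden": sudden,
        "drift": drift,
        "gradual_only": drift > drift_limit and not sudden,
    }


if __name__ == "__main__":
    # Each deployment is only about 4% slower than the previous one.
    print(regression_report([100, 104, 108, 113, 118, 123]))
    # A single deployment causes a large jump.
    print(regression_report([100, 130, 131]))
```

In the first series no individual deployment crosses the per-step threshold, yet the system ends up 23% slower than where it started, which is exactly the case a step-by-step comparison overlooks.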
This has become far more difficult these days with autoscaling, frequent changes to servers, and changing server locations. Where once upon a time you had two servers and everything was simple, these days you may have thousands of servers distributed around the world, and multiple processes to deploy all the time.
Reaching the Continuous Delivery Promised Land
When you’re ready to deploy your code to production, you need to define a process that takes all aspects of the deployment into account. This would typically include:
Choosing the right tool for the job. (As an aside, generally speaking, the tool isn’t really the problem with deployment; the process just needs to be ironclad. That said, some tools, like common CM tools, are less deployment oriented, and tools that are more deployment oriented, e.g. Fabric or Ansible, may do a better job.)
Automating the process of pulling your server list. (Taking Amazon as an example, you can tag the different server types and then deploy to the right servers based on their tags.)
Choosing the type of deployment process. (There are a few common controlled deployment methods, such as canary and blue-green (sometimes called A/B), and there is much to be said about these. The most important aspect of each lies in the next bullet.)
Monitoring the right things, so you know when and what to roll back.
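The steps above can be sketched together: select the target servers by tag, deploy to a small canary group first, and roll back if the canaries look unhealthy. Everything named here is hypothetical. The `SERVERS` inventory is hard-coded for the example (with a cloud provider it would come from its API), and `deploy`, `healthy`, and `rollback` are callbacks you would implement with your own tooling.

```python
from typing import Callable, Dict, List

# Hypothetical inventory; in practice this would be pulled automatically.
SERVERS: List[Dict] = [
    {"name": "web-1", "tags": {"role": "web", "env": "prod"}},
    {"name": "web-2", "tags": {"role": "web", "env": "prod"}},
    {"name": "web-3", "tags": {"role": "web", "env": "prod"}},
    {"name": "db-1", "tags": {"role": "db", "env": "prod"}},
    {"name": "web-stg", "tags": {"role": "web", "env": "staging"}},
]


def select_servers(servers: List[Dict], **wanted_tags: str) -> List[str]:
    """Return the names of servers whose tags match every wanted tag."""
    return [
        s["name"]
        for s in servers
        if all(s["tags"].get(key) == value for key, value in wanted_tags.items())
    ]


def canary_deploy(
    targets: List[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    canary_count: int = 1,
) -> bool:
    """Deploy to a few canaries first; roll them back if any is unhealthy.

    Returns True when the release reached every target.
    """
    canaries, rest = targets[:canary_count], targets[canary_count:]
    for server in canaries:
        deploy(server)
    if not all(healthy(server) for server in canaries):
        for server in canaries:
            rollback(server)
        return False
    for server in rest:
        deploy(server)
    return True


if __name__ == "__main__":
    web = select_servers(SERVERS, role="web", env="prod")
    deployed: List[str] = []
    ok = canary_deploy(web, deployed.append, lambda s: True, lambda s: None)
    print(ok, deployed)
```

The `healthy` callback is where the monitoring from the last bullet plugs in, which is why it decides whether the rollout continues.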
And while we can use these methods to deploy code, if we want to continuously deploy applications as a whole, including the infrastructure and not just the code (AKA infrastructure as code), that’s where orchestration comes in. We would still need to perform the same initial steps:
Understanding the process and how it affects our system, and
Creating the binaries / logical containers in a clean environment.
However, on top of all of this, we could add code that creates infrastructure, including:
Loading the resources
Keeping the infrastructure in source control, and
Taking the binaries created and deploying them with the infrastructure when the deployment is run.
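A minimal way to picture infrastructure as code is a desired-state description that lives in source control, plus a function that works out which actions bring the running environment in line with it. The resource names, fields, and package file names below are made up for the example, and the `plan` function only computes actions; actually creating or deleting resources is left to whatever tooling you use.

```python
from typing import Dict, List, Tuple

# Desired infrastructure, kept as data in source control.
# All names and values here are hypothetical.
DESIRED: Dict[str, Dict[str, str]] = {
    "web-vm": {"type": "compute", "size": "small", "package": "app-1.4.2.tar.gz"},
    "app-net": {"type": "network", "cidr": "10.0.0.0/24"},
}


def plan(
    desired: Dict[str, Dict[str, str]], current: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str]]:
    """Work out which actions turn the current state into the desired one."""
    actions = []
    for name in sorted(desired):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != desired[name]:
            # Covers both infrastructure changes and a new package version.
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    running = {
        "web-vm": {"type": "compute", "size": "small", "package": "app-1.4.1.tar.gz"},
        "old-vm": {"type": "compute", "size": "small", "package": "app-1.0.0.tar.gz"},
    }
    print(plan(DESIRED, running))
```

Because the package version is part of the description, deploying a new binary and changing the infrastructure go through the same reviewable change.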
So whether you choose a canary or a blue-green deployment, an orchestrator will come in very handy in either scenario to manage business continuity and data integrity. Given the complexity involved in pushing code to live systems in production, the orchestrator should be built in a manner that lets you address the entire application lifecycle. A good example of this is TOSCA (Topology and Orchestration Specification for Cloud Applications), an open cloud standard language from the OASIS organization (the standards body behind specifications such as SAML and OpenDocument), whose Simple Profile is written in YAML.
TOSCA combines a declarative description of the application topology and all its components (the load balancer, network, compute resources, software, and everything else) with an imperative set of workflows that describe the logic of any process we need to automate. From a continuous delivery perspective, this means that each application component in a TOSCA topology has lifecycle hooks, and more hooks can be added to cope with new processes, such as invoking A/B testing of deployments, testing, and monitoring.
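The idea of a declarative topology combined with lifecycle hooks can be illustrated in a few lines of Python. This is not TOSCA syntax and not any orchestrator's API; it is a toy model with hypothetical names. The `create`, `configure`, and `start` operations mirror part of TOSCA's standard lifecycle, and `smoke_test` is an extra hook added to show how the lifecycle can be extended.

```python
from typing import Callable, Dict, List, Sequence

# A subset of the standard lifecycle operations, run in this order.
LIFECYCLE: List[str] = ["create", "configure", "start"]


class Component:
    """One node in the topology, with hooks attached to operations."""

    def __init__(self, name: str, depends_on: Sequence[str] = ()):
        self.name = name
        self.depends_on = list(depends_on)
        self.hooks: Dict[str, List[Callable[[str], None]]] = {}

    def on(self, operation: str, hook: Callable[[str], None]) -> None:
        """Attach a hook to an operation; any operation name is allowed."""
        self.hooks.setdefault(operation, []).append(hook)


def install_order(components: Dict[str, Component]) -> List[str]:
    """Order components so that dependencies come first (depth-first)."""
    ordered: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for dependency in components[name].depends_on:
            visit(dependency)
        ordered.append(name)

    for name in components:
        visit(name)
    return ordered


def run_install(components: Dict[str, Component], operations: Sequence[str]) -> None:
    """Run each operation's hooks on every component, dependencies first."""
    for name in install_order(components):
        for operation in operations:
            for hook in components[name].hooks.get(operation, []):
                hook(name)


if __name__ == "__main__":
    db, web = Component("db"), Component("web", depends_on=["db"])
    for component in (db, web):
        for op in LIFECYCLE:
            component.on(op, lambda name, op=op: print(f"{op}: {name}"))
    web.on("smoke_test", lambda name: print(f"smoke_test: {name}"))
    run_install({"web": web, "db": db}, LIFECYCLE + ["smoke_test"])
```

The topology (which components exist and what depends on what) stays declarative, while the workflow in `run_install` is the imperative part that walks it.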
Once upon a time, a line of business had to request a feature from engineering based on business-level requirements, and then spend another year waiting for it to actually reach the market. With these new CI/CD capabilities, organizations can now expect a new feature to be shipped to production within just a few weeks. Needless to say, the business impact of such processes is driving an unprecedented evolution in IT that will only gain momentum in the near future. However, as with all new technology, continuous delivery needs to be implemented with safety measures that take all of the complexities into account, as the negative business impact of CD gone wrong can far outweigh the positive aspects.