Continuous Integration with Docker Containers

It is paradoxical, yet true, to say, that the more we know, the more ignorant we become in the absolute sense, for it is only through enlightenment that we become conscious of our limitations. Precisely one of the most gratifying results of intellectual evolution is the continuous opening up of new and greater prospects.
                                                                                                                       —Nikola Tesla

To fully understand the challenges and benefits that Docker Swarm brings, we need to start from the beginning. We need to go back to a code repository and decide how are we going to build, test, run, update, and monitor the services we're developing. Even though the objective is to implement continuous deployment to a Swarm cluster, we need to step back and explore Continuous Integration (CI) first. The steps we'll define for the CI process will dictate how we proceed towards Continuous Delivery (CD), from there towards Continuous Deployment (CDP), and, finally, how we ensure that our services are monitored and able to self-heal. This chapter explores Continuous Integration as a prerequisite for the more advanced processes.

A note to The DevOps 2.0 Toolkit readers
The text that follows is identical to the one published in The DevOps 2.0 Toolkit. If it is still fresh in your mind, feel free to jump to the sub-section Defining a fully Dockerized manual Continuous Integration flow . Since I wrote the 2.0, I discovered a few better ways to implement CI processes. I hope you'll benefit from this chapter even if you consider yourself a veteran CI practitioner.

To understand Continuous Deployment we should first define its predecessors, Continuous Integration and Continuous Delivery.
Integration phase of a project development tended to be one of the most painful stages of Software Development Life Cycle (SDLC). We would spend weeks, months or even years working in separate teams dedicated to separate applications and services. Each of those teams would have their set of requirements and tried their best to meet them. While it wasn't hard to periodically verify each of those applications and services in isolation, we all dreaded the moment when team leads would decide that the time has come to integrate them into a unique delivery. Armed with the experience from previous projects, we knew that integration would be problematic. We knew that we would discover problems, unmet dependencies, interfaces that do not communicate with each other correctly and that managers will get disappointed, frustrated, and nervous. It was not uncommon to spend weeks or even months in this phase. The worse part of all that was that a bug found during the integration phase could mean going back and redoing days or weeks worth of work. If someone asked me how I felt about integration back then, I'd say that it was closest I could get to becoming permanently depressed. Those were different times. We thought that was the right way to develop applications.

A lot changed since then. Extreme Programming (XP) and other agile methodologies became familiar, automated testing became frequent, and Continuous Integration started to take ground. Today we know that the way we developed software back then was wrong. The industry moved a long way since those days.

Continuous Integration usually refers to integrating, building, and testing code within the development environment. It requires developers to integrate code into a shared repository often. How often is often can be interpreted in many ways and it depends on the size of the team, the size of the project and the number of hours we dedicate to coding. In most cases, it means that coders either push directly to the shared repository or merge their code with it. No matter whether we're pushing or merging, those actions should, in most cases, be done at least a couple of times a day. Getting code to the shared repository is not enough, and we need to have a pipeline that, as a minimum, checks out the code and runs all the tests related, directly or indirectly, to the code corresponding to the repository. The result of the execution of the pipeline can be either red or green. Something failed, or everything was run without any problems. In the former case, minimum action would be to notify the person who committed the code.

The Continuous Integration pipeline should run on every commit or push. Unlike Continuous Delivery, Continuous Integration does not have a clearly defined goal of that pipeline. Saying that one application integrates with others does not tell us a lot about its production readiness. We do not know how much more work is required to get to the stage when the code can be delivered to production. All we are striving for is the knowledge that a commit did not break any of the existing tests. Nevertheless, CI is a vast improvement when done right. In many cases, it is a very hard practice to implement, but once everyone is comfortable with it, the results are often very impressive.

Integration tests need to be committed together with the implementation code, if not before. To gain maximum benefits, we should write tests in Test-Driven Development (TDD) fashion. That way, not only that tests are ready for commit together with implementation, but we know that they are not faulty and would not pass no matter what we do. There are many other benefits TDD brings to the table and, if you haven't already, I strongly recommend to adopt it. You might want to consult the Test-Driven Development (http://technologyconversations.com/category/test-driven-development/) section of the Technology Conversations (http://technologyconversations.com/) blog.

Tests are not the only CI prerequisite. One of the most important rules is that when the pipeline fails, fixing the problem has higher priority than any other task. If this action is postponed, next executions of the pipeline will fail as well. People will start ignoring the failure notifications and, slowly, CI process will begin losing its purpose. The sooner we fix the problem discovered during the execution of the CI pipeline, the better we are. If corrective action is taken immediately, knowledge about the potential cause of the problem is still fresh (after all, it's been only a few minutes between the commit and the failure notification) and fixing it should be trivial.