"CI/CD" is short for Continuous Integration and Continuous Deployment (or sometimes "Delivery"). What this means in practice is automating build, test and deployment of software. As a technique this is pervasive, and is widely accepted as standard practice. This is a practical guide on how to do CI/CD well.
Let's start with some definitions.
Automated Build & Test (aka "CI" — Continuous Integration)
- AIM: Find defects before others are impacted.
- HOW: Automatically build and test code at appropriate points to check that it works as expected when integrated with the rest of the codebase and other recent changes.
- INPUTS: Source code and dependencies (e.g. libraries, base images).
Automated Deployment (aka "CD" — Continuous Deployment / Delivery)
- AIM: Deploy the system to environments so it can be used for testing or for production workloads.
- HOW: Scripted deployment of build artefacts (e.g. software or infrastructure configuration) to environments, either triggered automatically or manually.
- INPUTS: Build artefacts from Automated Build & Test stage.
- OUTPUTS: Working system on test or production environment.
Good use of source control is an important foundation for good CI/CD.
Source code should be stored in a source control (or "version control") system. The de facto standard is Git, and many Git-based systems exist. Some of the most popular are GitHub, GitLab and Bitbucket. All of these add functionality beyond what basic Git provides. The most important of these additional features is pull requests (known in some systems as merge requests), which are a key enabler for CI/CD.
It's essential to define a clear branching strategy and stick to it. A good default is "trunk-based development" with "short-lived feature branches".
Trunk-based development means that integration is done on a single main branch. Each small feature (e.g. Jira ticket) is implemented on its own feature branch off the main branch. Each feature branch should only exist for a short period, typically a maximum of 1 or 2 days, before it is merged via a pull request (or merge request). Ideally each ticket will be similarly small, but where this is not possible then it is best to build up the functionality for the ticket in several short-lived branches rather than in one long-lived branch. This regular merging to the main branch makes it easier to deal with potential clashes between code changes for individual features.
The main branch should be configured to forbid direct pushes and to require pull requests with at least one approval before merging is allowed. The main branch should also be configured to require a successful CI build before merging.
The most basic task handled by the CI stage is building the code into a deployable artefact such as a set of JARs or a Docker image. This stage should only "pass" if the code compiles (if relevant) and passes any linting or other static code analysis checks. Artefacts are captured for use in later deployment stages. The exact mechanics of capturing artefacts depend on the CI/CD system being used. Most systems provide some mechanism for this, but in some cases you may need to implement your own, for example by pushing to an S3 bucket.
Each build of the software should be labelled with a unique identifier, such as a semantic version or a simple incrementing number. It's easiest to keep track if these identifiers are clearly ordered. When the system is deployed and running, it should advertise the build version - this could be in the footer of a web UI, or on a /version endpoint for an API. This makes it easy to tell which version of the code is deployed where. There should be an easy way to tell which revision of the code went into each build, for example by tagging the source code repository with the version identifier. This makes it easy to identify the code changes between any two versions of the software. Some build systems include a mechanism to generate unique incrementing build IDs, but you may need to implement your own. One approach to take is based on examining existing Git tags to calculate the next version. The benefit of versioning each build separately from the revision of the code is that sometimes two builds of the same code can give different outputs, depending on how dependencies are versioned and how the build process is configured.
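As an illustration of calculating the next version from existing Git tags, here is a minimal sketch. It assumes tags follow a `vMAJOR.MINOR.PATCH` convention, which the text does not prescribe, and that the list of tags has already been obtained (for example from the output of `git tag --list`). The `next_version` function and its `bump` parameter are hypothetical names chosen for this example.

```python
import re

# Assumed tag convention: v<major>.<minor>.<patch>, e.g. "v1.4.2".
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def next_version(existing_tags, bump="patch"):
    """Return the next version tag, given the tags already in the repository."""
    versions = []
    for tag in existing_tags:
        match = TAG_PATTERN.match(tag)
        if match:
            # Compare versions numerically, so that v1.10.0 sorts after v1.2.3.
            versions.append(tuple(int(part) for part in match.groups()))

    # With no matching tags, start counting from v0.0.0.
    major, minor, patch = max(versions, default=(0, 0, 0))

    if bump == "major":
        return f"v{major + 1}.0.0"
    if bump == "minor":
        return f"v{major}.{minor + 1}.0"
    if bump == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {bump!r}")
```

Tags that do not match the convention are ignored, so unrelated tags in the repository do not affect the calculation. After a successful build, the CI job would tag the built revision with the returned identifier, which keeps the mapping between builds and code revisions visible in the repository.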
As well as building the code, the other main task handled by the CI stage is the regular running of automated tests. Broadly speaking, automated tests are either at the "code-level" or "system-level".
Code-level tests typically use a unit test framework, and operate by directly calling individual fragments of the application code. This means that the tests must usually be written in the same language as the application code. Of these, unit tests are the most fundamental and verify that individual components work as expected when isolated, using mocks or other "fake" objects if necessary. Some kinds of integration tests also operate at the code level, and focus on the integration of the code with external dependencies such as a database, message queue or HTTP API. Such tests require these dependencies (or stub implementations of them) to be running before the tests can be run.
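To make the idea of isolating a component with a "fake" object concrete, here is a small sketch using Python's built-in `unittest` framework. The `PriceService` class, the `FakeRateClient` stand-in and the exchange rate used are all invented for this example. In a real codebase the fake would replace a client for an external HTTP API or similar dependency.

```python
import unittest


class PriceService:
    """Component under test: converts GBP amounts using an injected rate client."""

    def __init__(self, rate_client):
        self._rate_client = rate_client

    def price_in(self, amount_gbp, currency):
        rate = self._rate_client.get_rate("GBP", currency)
        return round(amount_gbp * rate, 2)


class FakeRateClient:
    """Stand-in for the real rate client, returning fixed rates with no network calls."""

    def __init__(self, rates):
        self._rates = rates

    def get_rate(self, source, target):
        return self._rates[(source, target)]


class PriceServiceTest(unittest.TestCase):
    def test_converts_using_the_supplied_rate(self):
        service = PriceService(FakeRateClient({("GBP", "USD"): 1.25}))
        self.assertEqual(service.price_in(10.0, "USD"), 12.5)

    def test_unknown_currency_raises(self):
        service = PriceService(FakeRateClient({}))
        with self.assertRaises(KeyError):
            service.price_in(10.0, "EUR")
```

Because the dependency is passed in rather than created inside `PriceService`, the test needs nothing running outside the test process. A CI job could run such tests with `python -m unittest`.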
System-level tests operate on the running application rather than on individual parts of its code. They include functional tests (using Selenium or similar tools) and non-functional tests such as automated penetration or load tests. The common feature is that these tests interact with the system through its external integration points (such as an API or web UI), rather than by calling code. In some cases a unit test framework may still be used, but since the tests do not call the application code directly they can be written in any language.
Environments and data
Any tests which involve running the system (possibly along with other systems it interacts with) have to consider whether to create an ephemeral environment or use a long-lived one. Temporary environments can either be created within the CI system, using something like Docker Compose, or elsewhere, in a temporary cloud-hosted environment. Any systems that have "state", such as a database, need to be brought into a known state before the tests are run, in order to ensure consistent results. If using a long-lived environment, stateful components also need to be brought into a known state, usually accomplished by having a pre-test stage which injects specific test data. Ephemeral environments give the best guarantees that they are "clean", with no unexpected data or configuration deviations which could invalidate the test results. However, long-lived environments can also have their advantages. They avoid the potentially time-consuming environment creation for each test run, are often easier or more practical to create, and can more closely mimic the behaviour which can be expected when deploying to production.
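The pre-test stage that brings a stateful component into a known state might look something like the sketch below. It uses an SQLite database purely so that the example is self-contained. The `customers` table, the seed rows and the `reset_to_known_state` function are assumptions made for illustration, and a real system would apply the same idea to whatever database it uses.

```python
import sqlite3

# Hypothetical schema and seed data for the example.
SCHEMA = """
CREATE TABLE IF NOT EXISTS customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
)
"""
SEED_ROWS = [(1, "Alice"), (2, "Bob")]


def reset_to_known_state(conn):
    """Remove any leftover data and load the fixed seed data the tests expect."""
    # Using the connection as a context manager commits on success
    # and rolls back if any statement fails.
    with conn:
        conn.execute(SCHEMA)
        conn.execute("DELETE FROM customers")
        conn.executemany(
            "INSERT INTO customers (id, name) VALUES (?, ?)", SEED_ROWS
        )
```

Running the reset before every test run, rather than relying on the previous run to clean up after itself, means a failed or interrupted run cannot leave data behind that affects later results.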
The deployment stage involves deploying the built artefacts to an environment. Typically, systems have a single production environment which provides the live service, and one or more non-production environments for testing. A common setup is to treat these as a pipeline of environments (for example, development, staging, and production), with changes "promoted" through the environments. The exact set of environments and what they are used for varies, but the basic concept of a single pipeline with a one-way flow of changes is common. Changes are often automatically deployed to the first environment following a successful CI build, but require manual approval for deployments to subsequent environments. Not all CI/CD tools support manual approval, so this is one area to consider when choosing a tool.
Build once; deploy anywhere
A good CI/CD pipeline works by incrementally increasing the confidence in a given version of a piece of software as it passes through the pipeline of test environments. By the time the version reaches the point at which it is deployed to production, the pipeline should have provided an acceptable degree of certainty that the software is "good" and fit for release. For this to work effectively, tests from each stage must give useful information on the quality of the version being tested. This information must be applicable to subsequent stages of the pipeline as well, so the body of evidence can be accumulated rather than all tests needing to be repeated for each different environment.
For this to be true, environments must be acceptably consistent, and differences need to be understood and appropriate, as discussed above. However, it's also important that the behaviour of the software being tested is itself the same in each environment. The easiest and best way to achieve this is to build a single artefact which is then deployed to each environment unchanged, the modification of any environment-specific configuration notwithstanding. One alternative to this approach is to build the software specifically for each environment. We can mitigate the inherent risk in this approach by carefully "pinning" all our dependencies, allowing us to be confident that our produced artefacts are consistent. However, there is always the possibility with this approach that the version deployed to the production environment is subtly different to that which was tested.
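One common way to keep a single artefact unchanged across environments is for the software to read its environment-specific configuration at start-up, for instance from environment variables set by the deployment. The sketch below assumes that approach. The variable names `APP_ENV`, `DATABASE_URL` and `APP_DEBUG` and the `Settings` class are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    environment: str
    database_url: str
    debug: bool


def load_settings(environ=None):
    """Build settings from environment variables supplied at deploy time.

    The artefact itself contains no environment-specific values, so the
    same build can be deployed to every environment unchanged.
    """
    env = os.environ if environ is None else environ
    return Settings(
        # Required values: a missing variable raises KeyError, so a
        # misconfigured deployment fails at start-up rather than later.
        environment=env["APP_ENV"],
        database_url=env["DATABASE_URL"],
        # Optional value with a safe default.
        debug=env.get("APP_DEBUG", "false").lower() == "true",
    )
```

Accepting the mapping as a parameter is a small design choice that makes the function easy to exercise in code-level tests without changing the real process environment.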
Promote chosen version to environment
A typical workflow involves:
- Automated deployment to a dev environment whenever new code is pushed to the main branch.
- Manual promotion of a chosen "good" build from the dev environment to a test environment, when the team are ready to accept new changes.
- Manual promotion of a chosen build from
One important thing to note is that the manual approval step needs to allow a specific version of the code to be selected for approval. Not all CI/CD tools support this. AWS CodePipeline, for example, does not.
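The promotion model described above, where a specific chosen version moves one step along a one-way pipeline, can be sketched as a small data structure. This is an illustration rather than a feature of any particular CI/CD tool. The `DeploymentTracker` class and `PromotionError` exception are hypothetical, and the `prod` environment name is an assumption added to complete the pipeline.

```python
# Ordered pipeline of environments; changes only flow left to right.
PIPELINE = ["dev", "test", "prod"]


class PromotionError(Exception):
    """Raised when a promotion would break the one-way pipeline rules."""


class DeploymentTracker:
    def __init__(self):
        # Version currently deployed to each environment.
        self.deployed = {env: None for env in PIPELINE}
        # Every version that has ever been deployed to each environment.
        self.history = {env: set() for env in PIPELINE}

    def _record(self, env, version):
        self.deployed[env] = version
        self.history[env].add(version)

    def deploy_to_first(self, version):
        """Automatic deployment of a new build to the first environment."""
        self._record(PIPELINE[0], version)

    def promote(self, version, target):
        """Manually promote one specific version into the target environment."""
        index = PIPELINE.index(target)
        if index == 0:
            raise PromotionError("the first environment is deployed to automatically")
        previous = PIPELINE[index - 1]
        if version not in self.history[previous]:
            raise PromotionError(
                f"{version} has not been deployed to {previous}, "
                f"so it cannot be promoted to {target}"
            )
        self._record(target, version)
```

The key point is that `promote` takes an explicit version rather than always taking whatever is currently in the previous environment, so an older, known-good build can be promoted even after newer builds have been deployed behind it.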
Credentials & secrets
There are typically several places where "secrets" come into play in CI/CD systems. Self-hosted CI/CD systems can generally be more secure because they can benefit from indirectly-granted permissions via mechanisms such as IAM roles. However, this needs to be traded off against the convenience of cloud-hosted options. In some cases the security considerations will be the priority and it will be worth the operational overhead of managing a self-hosted CI/CD system; in other cases a cloud solution may be acceptably secure and may be a better choice.
There are several situations in which the need for secrets typically arises in CI/CD systems:
The CI/CD system needs to retrieve source code, and possibly other inputs such as dependency libraries or base images. All these sources may require credentials to allow access (for example, a GitHub OAuth token or SSH key, or Docker Hub credentials). For CI/CD systems sitting outside existing infrastructure, this requires credentials to be stored in the CI/CD system; most systems have built-in support for storing credentials like this in a relatively secure manner. If the CI/CD system sits inside the main infrastructure, it may be possible to grant permissions to retrieve inputs from sources which also sit within the infrastructure, without needing to explicitly store credentials in the CI/CD system itself. This could be accomplished through the use of IAM roles granted to the container or VM hosting a CI/CD system in AWS. This is the preferred approach where possible, as no stored credentials exist in the CI/CD system to be leaked.
Credentials are also usually needed for the CI/CD system to deploy changes to an environment. The same two options apply as for retrieving inputs. If the CI/CD system sits outside the main infrastructure ecosystem, credentials will need to be stored in the CI/CD system, carrying the risk that they could be leaked. If the CI/CD system is hosted within the same infrastructure as it is deploying to, it can usually be granted permissions indirectly, which is much more secure.
Running software sometimes needs secrets such as SSL/TLS certificates, and database or other credentials. It is sometimes tempting to have the CI/CD system pass these credentials to the deployed system, either from values stored internally or by retrieving them from an external secrets store of some sort. A better alternative, which reduces the chance of secret leakage, is to store the secrets in a dedicated service such as AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault. The software can then retrieve the secrets itself, as close to the point of usage as possible. The software must of course be able to retrieve these secrets, potentially requiring ambient permissions such as AWS IAM roles in order for it to authenticate with the secrets manager. The best solution is to avoid the need for secrets at all by relying entirely on permissions granted to the execution environment of the software, such as IAM roles. This is not always possible, but is the most secure option if it can be practically achieved.
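A rough sketch of the software retrieving secrets itself at the point of use is shown below. To stay self-contained it does not call any real secrets manager. Instead the `fetch` callable is a placeholder for whatever client call the chosen service provides, and the `SecretCache` class, the secret name used in the usage example and the five-minute default lifetime are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class SecretCache:
    """Fetch secrets on demand at runtime and hold them only for a short time.

    The CI/CD system never handles the secret values: the running software
    asks the secrets service directly when it needs them.
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch  # placeholder for a real secrets-service lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]
        # Missing or expired: go back to the secrets service.
        value = self._fetch(name)
        self._cache[name] = (value, now)
        return value
```

Application code would then call something like `secrets.get("db/password")` immediately before opening a database connection. Keeping the cached lifetime short is a design choice: it limits calls to the secrets service while still allowing rotated secrets to be picked up without redeploying.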
Post-deployment smoke tests
As well as testing prior to deployment, there is often value in a small set of automated post-deployment smoke tests to validate the system works as expected in the production environment. The purpose of such tests is to check environment-specific factors which could cause the system to work differently than in test environments — differing configuration or any known infrastructure differences, for example.
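A very small post-deployment smoke test might simply confirm that the expected build is the one actually running. The sketch below assumes the deployed system exposes a `/version` endpoint, as suggested earlier, and that it returns JSON of the form `{"version": "..."}`. That response format and the function names are assumptions, not something the text specifies.

```python
import json
import urllib.request


def version_from_body(body):
    """Extract the advertised version from a response body, or None if absent."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON (or not valid UTF-8): treat as no version advertised.
        return None
    if isinstance(payload, dict):
        return payload.get("version")
    return None


def deployed_version_matches(base_url, expected_version, timeout=5.0):
    """Return True if the system at base_url reports the expected build version."""
    try:
        with urllib.request.urlopen(f"{base_url}/version", timeout=timeout) as response:
            body = response.read()
    except OSError:
        # Covers connection failures, timeouts and HTTP error statuses.
        return False
    return version_from_body(body) == expected_version
```

A deployment job could call `deployed_version_matches` with the environment's base URL and the build identifier it has just deployed, and fail the pipeline if the result is `False`. Real smoke tests would usually go on to exercise one or two key user journeys as well.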
Configuration as code
In modern CI/CD systems, configuration of the specific actions needed to build, test and deploy the software is usually achieved through the use of configuration files, often in YAML format. Older CI/CD systems, however, were configured using web interfaces. It is strongly preferable to use systems where virtually all configuration is in code. In fact, it can be argued that this is an absolute requirement, because it so effectively allows for best practices of version control to be applied to the CI/CD system itself.
These configuration-as-code systems usually treat the configuration on each specific branch in isolation, providing an easy way to test changes to the CI/CD pipeline itself on a branch before applying them to the main branch and thus altering the behaviour of the main build and deployment pipeline. The use of code for this configuration also has all the usual benefits of control (requiring peer review) and auditability, and it typically makes it easier to ensure the deployment mechanism to each environment is consistent.
Known build environment
It's important that each build runs in a known and consistent execution environment.
Many modern CI/CD systems run their tasks (build, test, deploy) in containers. This provides strong guarantees that the execution environment is in a known state and is the same for each run. Some CI/CD systems allow a custom Docker image to be specified for each task, giving even more control over and visibility of the task execution environment. This often allows builds to be faster by baking more of the needed tooling into the custom image. Container-based builds also make it trivial to run the build locally in exactly the same way in which it's run in CI. These systems give the best experience and are strongly preferred.
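To illustrate running a build locally in the same container image that CI uses, here is a sketch of a helper that assembles a `docker run` command. The helper names, the `/workspace` mount point and the image and build command in the usage note are arbitrary choices for this example. It assumes Docker is installed and that the image contains the tooling the build needs.

```python
import subprocess
from pathlib import Path


def container_build_command(image, source_dir, build_command):
    """Build the argument list for running a build inside a container."""
    source = Path(source_dir).resolve()
    return [
        "docker", "run",
        "--rm",                               # remove the container afterwards
        "--volume", f"{source}:/workspace",   # mount the source code
        "--workdir", "/workspace",            # run the build from the source root
        image,
        *build_command,
    ]


def run_build(image, source_dir, build_command):
    """Run the build in a fresh container and return its exit code."""
    completed = subprocess.run(
        container_build_command(image, source_dir, build_command),
        check=False,
    )
    return completed.returncode
```

For example, `run_build("python:3.12-slim", ".", ["python", "-m", "unittest"])` would run a project's unit tests in a fresh container. Using the same image name locally and in the CI configuration is what gives confidence that both runs see the same environment.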
Some software-as-a-service (SaaS) systems support container-based execution environments restricted to a predefined set of images, while others provide VM-based environments and make their own guarantees that the environment will be consistent for each run. These SaaS solutions don't provide the confidence in the consistency and visibility of the environment that containers do.
Some self-hosted systems run tasks directly on (typically long-lived) worker VMs. These systems should be avoided as the worker VMs in these systems tend to accumulate uncontrolled changes over time, which are difficult to reproduce and test.
Tools vary in how clearly they present an overview of what version is deployed to each environment and the progress of each build. The best tools make it trivial to understand the current state of the CI/CD system and the state of all the environments to which it deploys.
There are many factors to consider when selecting and configuring a CI/CD system. For some of these there is a standard default approach or choice which works best for most projects. For others, the best choice depends on the details of the individual project, and involves some degree of trade-off.