Implementation and validation


This section is part of Build. It covers taking refined backlog items and delivering validated software into production, ready for operations.

This is a set of defaults for teams to use, but is not mandatory if teams have good reason to do something different (see What this is and is not).

Delivery process overview


  • Input: Backlog items ready for implementation.

  • Output: Software running in production.


Activities in this section are where the vast majority of the team's time is spent. Here we give recommended defaults for the day-to-day mechanics of delivering working software.


Team culture

The effectiveness of a team is influenced more by its culture than the rigid application of specific processes. A team is made up of individuals and it is the way these individuals feel, behave and interact that defines the culture. It is up to every member of the team to actively promote the culture they want. Sometimes this means making time to take care of each other, and it certainly involves making time to build and reinforce relationships between team members.

See also the behaviours associated with the core skills.

Take time as a team to agree on the culture you want and any ground rules, and document these as your Working Agreement.

Example Working Agreement

Team first

No individual is more important than the team. The output of the team as a whole matters more than that of any one person. Every team member must be ready to put aside their own work to help others if that is what will deliver the right outcome.

Whole team

All work is visible and no-one works to personal backlogs. This ensures everyone has the visibility they need and is focused on the highest priority work.

Everyone in the team is involved in decision making, including product design, technology selection, technical design and prioritisation. This means that decisions are made in the open and that everyone has visibility and a chance to have their say. Whatever role they are in, those who have an opinion need to make a convincing argument to others while being open to the possibility that they are wrong.

Excellence with pragmatism

We strive to ensure technical quality while being pragmatic and doing what is right for the customer and the specific challenge.

We change our minds when we are wrong and change what we are doing when it is not working.

When a team member is unhappy about something, it is their responsibility to raise this constructively and help to find a solution.


Workflow

The workflow determines which activities occur in which order for each backlog item. It is decided and evolved by the team, but a recommended default starting point is:


  • Next-up backlog
    • Ordered, refined backlog of items in the issue tracker waiting to be picked up for analysis and elaboration. See refinement.
  • Analysis and elaboration
  • Ready for implementation
    • Waiting area for items ready for implementation, so it is clear which items have completed analysis and elaboration.
    • This column is optional. In teams where analysis and elaboration is done immediately before implementation and by the same people who will implement it, this column is not needed. Whether this is achievable depends on people's availability (see Upfront vs just in time).
    • When this buffer runs low (say, fewer than 3 items, though the threshold will depend on the team), this should trigger more analysis and elaboration work to ensure team members do not run out of items that are ready to implement. It is also recommended to set a WIP limit.
  • Implementation and validation
    • Create feature branch
    • Write production code and associated unit and other code-level tests
    • Create other automated tests
    • Initial UAT by the product owner
    • Local exploratory testing on a feature branch
    • Code review
    • Merge branch on successful build and test
    • Automated deployment to the dev environment
  • UAT
    • Validation by the Product Owner or an SME on the dev environment. This may not be needed for every item if there is confidence in validation on the feature branch.
  • UAT done
    • Waiting area for items ready to deploy to the staging environment. Not present if UAT stage is not needed.
  • Final validation
    • Final validation on staging before deployment to production, e.g. final UAT or load testing. Keep this to a bare minimum; if there is confidence in earlier validation it may be nothing at all, except by exception.
  • Ready for live
    • Waiting area for items ready to deploy to the production environment. Not needed if deployment to production is fully automated from code being merged to the main branch.
  • Live
    • Built software deployed to production or equivalent (e.g. to mobile app store).
  • Done
    • Item closed in the issue tracker.

Handling blocked items

Items can temporarily become "blocked" if they are unable to proceed due to external dependencies. The delivery process should encourage a focus on promptly unblocking items.

When items are blocked they should be left in position and identified as blocked with a flag or other indicator. Some teams instead use a blocked column, but this does not work well: it loses the information about what stage an item is in and takes focus away from moving it forward. Blocked items should still count toward the WIP limit for the stage they are in.

Creating fast flow

A combination of process and technology techniques can help create fast flow, where items move rapidly, smoothly and reliably through the board.

  • Wait time
    • Wait columns can indicate waste, particularly if they represent hand-offs between people or teams. Excessive hand-offs slow delivery and often lead to degraded quality. Work to reduce hand-offs where possible.
  • Automation
    • The bulk of functional and non-functional testing should be automated.
    • Auto-formatting and linting makes code review quicker and easier.
    • Automated infrastructure and deployment makes the path to live slick.
    • See Benefits of automation.
  • Shift left and just-in-time
    • Most of the thinking about the fine-grained requirements for each backlog item, the product and technical design and how it will be validated should be done just in time before implementation. See Just enough, just in time.
    • Specifically, testing should be part of implementation rather than something that happens after implementation is done, and implementation should be approached with a "test-first" mindset, whether you use strict test-driven development or not. Tests should be written alongside the code they test and the code should be written to be testable.
  • Feature flags
    • Feature flags provide a way to dynamically change which features are available and how they behave without rebuilding the code.
    • By using feature flags to only enable partially-implemented features in test environments it can become safe to deploy continuously all the way to production, only enabling the feature in production once it is ready for use and has passed user acceptance testing.
    • This allows teams to decouple deployment of software from the release of new functionality (by enabling a feature flag dynamically). When used well, this technique can allow much more frequent and rapid deployment.
    • Choose how you implement feature flags carefully. You want to avoid littering the code with conditionals and want to allow flags to be switched easily but safely. Also, take care to avoid an ever-increasing number of flags as these make software harder to test and maintain.
    • See Coding with Feature Flags: How-to Guide and Best Practices for more information.
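As an illustration of keeping flag conditionals at a single seam rather than scattered through the code, here is a minimal sketch of a flag helper; the class, the flag name and the environment-variable convention are assumptions for illustration, not a prescribed implementation:

```python
import os


class FeatureFlags:
    """Minimal illustrative flag store. Real systems usually back this with
    a config service so flags can be changed without redeploying."""

    def __init__(self, overrides=None):
        self._overrides = overrides or {}

    def is_enabled(self, name: str) -> bool:
        # An explicit override wins; otherwise fall back to an environment
        # variable such as FEATURE_PRICE_SORT=1, defaulting to off.
        if name in self._overrides:
            return self._overrides[name]
        return os.environ.get(f"FEATURE_{name.upper()}", "0") == "1"


flags = FeatureFlags(overrides={"price_sort": True})


def search_products(query: str) -> dict:
    # The conditional lives at one seam; callers never check the flag.
    result = {"query": query, "sorted_by_price": False}
    if flags.is_enabled("price_sort"):
        result["sorted_by_price"] = True
    return result
```

Centralising the check like this makes it straightforward to find and remove a flag once the feature is fully released, which helps avoid the ever-growing flag count warned about above.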

Benefits of automation

Automation has many advantages beyond the obvious potential for time-saving. But it takes effort to implement automation, so it is good to be clear on the specific benefits which justify the effort.

  • Reliability and consistency: automated changes are highly repeatable, which makes them more reliable than performing the same steps manually. Also, automation involves representing the process in code of some sort, allowing the usual peer-review process to be applied, which improves reliability.
  • Resilience: representing the process as code captures the knowledge of how to perform it, so that knowledge cannot leak out of the team when members leave.
  • Auditability: it is easy to create a log of automated changes as they are performed. Also, by storing automation code in source control, changes to the process can be tracked over time which helps to diagnose any issues which occur.
  • Availability: automated processes do not rely on someone being available to trigger them, they can happen whenever they need to.
  • Speed: machines can typically perform actions quicker than humans, and they do not get distracted or bored, so automated processes are usually faster, though of course time is needed to automate them initially.

WIP limits

It is recommended to apply WIP limits to all states in the workflow.

WIP stands for Work in Progress, a concept that comes from Kanban. A WIP limit is an agreed cap on the number of backlog items that can be "in progress". It can be applied to an individual state (such as Ready for implementation), to a group of states together, or to the total across all in-progress states. When a limit is reached, no more items should be brought into that state, or at least careful consideration must be applied before doing so. This encourages a focus on finishing in-progress items before starting new ones.

Having a lot of work in progress causes waste:

  • Context switching due to individuals sharing their time between multiple items, as has been well-researched by the American Psychological Association in Multitasking: Switching costs. This kind of task switching significantly reduces productivity and effectiveness.
  • Partially-done work that requires rework or is discarded because it has become out of date or no longer relevant due to being in progress but not actively worked on.
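The pull decision a WIP limit implies can be sketched in a few lines; the state names and limits below are hypothetical, and each team should agree its own:

```python
# Hypothetical per-state WIP limits agreed by the team.
WIP_LIMITS = {"analysis": 2, "implementation": 4, "uat": 2}


def can_pull_into(state: str, board: dict) -> bool:
    """Return True if another item may be pulled into `state`.
    Blocked items stay in their column, so they still count here."""
    return len(board.get(state, [])) < WIP_LIMITS[state]


board = {"implementation": ["XYZ-451", "XYZ-452", "XYZ-453", "XYZ-454"]}
can_pull_into("implementation", board)  # limit of 4 already reached
can_pull_into("uat", board)            # empty column, pulling is fine
```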


Stand-ups

There should be a stand-up every day, with the whole team involved.

The recommended format is to "walk the board" from Live at the right to Next up at the left, discussing each item briefly:

  • What is needed to move this toward Done?
  • Is anything impeding progress?
  • Is any help needed?
  • Is there any way to break the item down?

Stand-up tips

  • Be on time
    • Stand-up should start on time, whoever is there.
    • Everyone should reliably be at the stand-up on time to avoid wasting the team's time or missing out on important information.
  • Take it in turns to lead
    • To a certain extent, stand-ups can self-organise. For example, it is common to say anyone can raise their hand to propose a discussion be continued after the stand-up. But it is still beneficial to have one person act as a facilitator to keep it on track.
    • Take turns doing this so that everyone builds engagement with the process.
  • Keep it brief
    • Discussion on each item should be brief and to the point. The aim is to focus on moving the item forward, not explaining what has been done on it.
    • Where a short discussion can resolve a blocker (such as clarifying a requirement or a solution decision) then it can be best to let that happen during the stand-up.
    • But where anything more in-depth is needed, the discussion should be postponed until after the stand-up in a smaller group.
    • Consider time-boxing the meeting.
  • Focus on time-in-progress
    • Stay aware of how long each item has been in progress and focus on items that have been in progress the longest.
    • Agree on a target maximum age as a team so you can all work toward the same goal and agree when items are staying in progress "too long". This is sometimes called the Service Level Expectation.
    • You don't usually need to track this precisely; a rough feel for how long an item has been in progress is usually enough.
  • Ask "What will the board look like tomorrow?"
    • This can help to focus attention on doing what is needed to keep items moving.
    • Don't ask how long it will take to finish an item as this leads to stand-up becoming just a status update or justification of why each item is taking the time it is.
  • Hold each other to the agreed WIP limits
    • To maintain focus on finishing items before starting new ones.
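As a sketch of the time-in-progress focus above, a team might flag items whose age exceeds the agreed Service Level Expectation; the 10-day threshold and item IDs here are hypothetical:

```python
from datetime import date

# Hypothetical Service Level Expectation; agree your own as a team.
SLE_DAYS = 10


def items_over_sle(started: dict, today: date) -> list:
    """Return items whose time-in-progress exceeds the SLE, oldest
    first, so stand-up attention goes to them."""
    ages = {item: (today - d).days for item, d in started.items()}
    over = [item for item, age in ages.items() if age > SLE_DAYS]
    return sorted(over, key=lambda item: ages[item], reverse=True)


started = {"XYZ-451": date(2024, 5, 1), "XYZ-457": date(2024, 5, 9)}
items_over_sle(started, date(2024, 5, 13))  # only XYZ-451 is over the SLE
```

As noted above, a rough feel is usually enough; a check like this only matters if the team wants the board tool to surface ageing items automatically.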

Watch out for

  • Make sure everyone is engaged and issues are raised and discussed (possibly after the stand-up), rather than each person just giving an update.
  • Ensure everyone feels safe to speak freely. Be wary of people from outside the team attending the meeting with the purpose of interrogating, as this will erode trust and lead to a reticence to raise issues openly and honestly.
  • Check that team members assign themselves to work when they are ready to start something new, rather than being assigned work. Be especially wary of work being assigned by the Product Owner or other stakeholders. The team should be self-organising, not micromanaged.


Branching

The recommended default is to use trunk-based development with short-lived feature branches and optional release branches, as described below.

Feature branches

  • The aim is to have relatively high confidence that the code on the main branch could be deployed to production. This does not mean that there should be no validation after the merge to main, but the bulk should be done before the merge.
  • Every backlog item should have its own feature branch, possibly more than one: it is OK to have a series of feature branches for a single backlog item. This keeps merges to the main branch small, and is fine so long as these can be made in a way that does not break the code on the main branch such that it could not be potentially deployed.
  • Feature branches should be short-lived, typically no more than 1-2 days.
  • The main branch should be configured to only accept changes via pull/merge requests, which should require code review approval and a "green build", with all automated tests and checks passing. Rules should apply to everyone, including administrators and team leads. New changes on branches should require a re-approval.
  • Avoid force pushes to feature branches: you do not know who may have pulled the branch and the risk of confusion is usually not worth it.
  • Break changes down into a series of small commits, typically several commits per hour.
  • Default to squash-merging using the built-in facility in your source control system. The details of what happened within a short-lived branch are not generally of interest, and worrying about this can lead people to batching work up and committing less frequently, which is not desirable. If you want to keep the separation between unrelated changes in the version history, then split the changes into two branches to be merged separately.
  • Branches should be named so they can be clearly tied back to the associated backlog item. It is good to also include a brief description of the change being made in that branch for easy reference. For example, a branch for backlog item XYZ-457 could be called xyz-457-add-price-sort
  • Commits should have a commit message that identifies the backlog item it relates to, e.g. feat(XYZ-457): add price sort mechanism in product search.
    • You should connect your issue tracker to your version control system (VCS) to associate commits with backlog items by parsing commit messages, as it provides traceability of requirements.
    • You should consider adopting conventional commit syntax, which allows you to:
      • Standardise commit message formatting and apply git commit hooks to enforce it.
      • Identify types of work from commit message prefixes such as refactor, fix, feat, and chore.
      • Automate the creation of release notes, categorised by bug fix, feature, etc., using a conventional changelog.
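A git commit-msg hook can enforce this shape before messages reach the shared history. The sketch below assumes a hypothetical set of allowed types and an XYZ-style ticket reference in the scope; adjust both to your own conventions:

```python
import re

# Conventional Commits shape: type(scope): description.
# The allowed types and the ticket-ID scope are team-specific assumptions.
COMMIT_RE = re.compile(r"^(feat|fix|refactor|chore|docs|test)\([A-Z]+-\d+\): .+")


def is_valid_commit_message(message: str) -> bool:
    """A commit-msg hook could call this and reject non-conforming
    messages, checking only the subject line."""
    return COMMIT_RE.match(message.splitlines()[0]) is not None


is_valid_commit_message("feat(XYZ-457): add price sort mechanism")  # True
is_valid_commit_message("fixed stuff")                              # False
```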

Release branches

If the team has built high confidence in the validation done on each feature branch before merge, it can be beneficial to deploy straight from the main branch, following a "fix-forward" approach if any issues occur (see CI/CD). This is an enabler for more frequent deployments and is the preferred option when possible.

When this is not possible, more involved validation of the specific version planned for deployment may be needed. In this situation, release branches allow work to continue to be merged to the main branch while the release is validated.

Release branches should be created from the main branch just in time and should be deleted soon after the release. If any problems are found with the release candidate the fix should be first applied to the main branch via a feature branch and pull/merge request, and then cherry-picked from the main branch to the release branch.

Trunk based development


Code and code-level tests

Essential reading: Secure Engineering.

Production code and the associated unit- and other code-level tests should be written together. In some places, strict Test-Driven Development (TDD) will be beneficial, but often it is more effective to let the production code lead the tests just a little to avoid churn. The important thing is to write production code with testing and testability in mind, and to ensure the tests are in place as part of producing the functional code.
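What a "test-first" mindset looks like in practice: the test and the production code are produced together, with the test capturing intended behaviour including edge cases. This sketch uses a hypothetical price-sort function:

```python
import unittest


def sort_by_price(products):
    """Hypothetical function under test: order products cheapest first."""
    return sorted(products, key=lambda p: p["price"])


class SortByPriceTest(unittest.TestCase):
    # Written alongside the production code, not after the fact.
    def test_orders_cheapest_first(self):
        products = [{"name": "b", "price": 3}, {"name": "a", "price": 1}]
        names = [p["name"] for p in sort_by_price(products)]
        self.assertEqual(names, ["a", "b"])

    def test_empty_input(self):
        self.assertEqual(sort_by_price([]), [])
```

Whether or not the test is literally written first, the pair should land in the same commit, keeping the main branch in a validated state.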

It is recommended to use an automated linting tool to ensure standard code style and to detect and avoid common errors. Where practical this should be applied both in real-time integrated with your code editor and as one of the automated checks before code can be merged to the main branch.
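To illustrate the kind of check a linter automates, here is a toy rule; in practice use an established formatter and linter for your language rather than hand-rolled checks like this:

```python
def lint(source: str, max_len: int = 100) -> list:
    """Toy linter: flag trailing whitespace and over-long lines."""
    problems = []
    for i, line in enumerate(source.splitlines(), start=1):
        if line != line.rstrip():
            problems.append(f"line {i}: trailing whitespace")
        if len(line) > max_len:
            problems.append(f"line {i}: exceeds {max_len} characters")
    return problems


lint("x = 1 \ny = 2")  # flags the trailing space on line 1
```

The value comes from running such checks both in the editor (immediate feedback) and as a merge gate (consistent enforcement), as described above.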


Pairing

Paired working, whether pair programming, dev-test pairing or any other flavour, can have many benefits and is recommended for at least some work. Benefits include:


  • Shares knowledge between team members.
  • Helps to standardise practice, leading to more consistent code.
  • Improves the quality of solution design and implementation.

Pairing is particularly beneficial in the early stages of forming a team and to introduce new members to the team. However, it is not necessary for all work, especially where patterns have been largely established. As a rough guide, on average an even split between paired and solo work is often appropriate. It can be appropriate to start work on an item solo and call in a pair, or the other way round. Be pragmatic and do what works.

Code review

All code should be reviewed by another member of the team who must approve the changes before they can be merged. A code review involves another member of the team looking through a proposed code change and providing constructive feedback.

Many teams consider it unnecessary to have a separate pre-merge review for code that has been written as a pair, because they see peer review as an innate by-product of pairing.

Robert Fink provides an excellent description of the motivation and practice of code reviews. Some key points from this and other sources (Google, SmartBear, Atlassian) are:

  • Egalitarian
    • With the right (basic) training, anyone in the team can review anyone else's code with no hierarchy.
    • Everyone's code must be reviewed, no matter how experienced they are.
  • Small
    • Code reviews should be relatively small as it is hard to review very large changes effectively.
    • This is one reason to break stories down as small as is practical and to implement each story incrementally, ensuring no single change is too large to be reviewed well.
  • Meets user needs
    • While effective testing is the best way to detect bugs or non-functional problems, code review plays an important role in spotting potential issues:
    • Does the code look like it will meet the acceptance criteria, or are there obvious errors or omissions?
    • Does it handle edge cases?
    • Are common issues guarded against relating to security (see Secure Engineering), performance, scalability or robustness?
  • Of high quality
    • Is the code clear and simple?
    • Is the code layout and structure consistent with the agreed style and other code?
    • Would it easily allow future modification to meet slightly different needs, e.g. ten times the required data size or throughput?

Non-code-level automated tests

A targeted set of non-code-level automated tests should complement the code-level tests to provide a good level of confidence in the system.

Code-level unit tests and integration tests provide confidence that the code within a single deployable component works as intended. Such tests are usually insulated from external dependencies such as the database and other deployed components. In some cases, such tests can interact with "real" external dependencies, but the use-case for this is limited, and these integrations between components are usually best tested in other ways.

These tests should:

  • Minimise duplication of things already covered in code-level tests.
  • Focus on integrations and interactions, rather than covering every variation.
    • Code-level tests are better for covering details of business logic, including the variation needed to test different cases and edge cases.
    • Consider separately testing the contracts between components and the functionality of each component. Contract testing can either be done with a tool like Pact or with standard unit testing libraries. When testing individual components they should be isolated from external dependencies unless you are testing specific integrations.

The guiding principle is to keep your tests lean and limited to what is needed to gain adequate confidence. There is an overhead to having unnecessary or low-value tests in terms of maintenance and occasional false positives, so only keep the tests you need.
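Contract testing with a standard testing library can be as simple as asserting the fields the consumer relies on; tools like Pact formalise the same idea. The field names below are hypothetical:

```python
# The consumer's expectation of the provider's response shape.
EXPECTED_CONTRACT = {"id": int, "name": str, "price": float}


def satisfies_contract(response: dict, contract: dict) -> bool:
    """True if the response carries every field the consumer relies on,
    with the expected type. Extra fields are allowed: providers may
    evolve freely as long as the contract holds."""
    return all(
        field in response and isinstance(response[field], expected_type)
        for field, expected_type in contract.items()
    )


provider_response = {"id": 457, "name": "widget", "price": 9.99, "tags": []}
satisfies_contract(provider_response, EXPECTED_CONTRACT)  # True
```

Run the same check against the provider's real responses in its own test suite and against the stub the consumer tests use, so the two cannot drift apart unnoticed.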

Non-functional tests

  • Security testing should largely be automated as part of CI/CD. See Secure Engineering.
  • Similarly, accessibility testing (where applicable) should be automated as part of CI/CD.
  • Load tests should be scripted and either automated or at least run regularly. See Load Testing.

Exploratory testing

Targeted exploratory testing is an important part of validating changes. Effort should be guided based on a deep knowledge of the various automated tests already in place. Some exploratory testing will be done by the engineer(s) implementing each change and more should be done by someone with a "test mindset", whether that is someone in a dedicated test role or another engineer fulfilling that role for this change. It is recommended to do the bulk of exploratory testing before code is merged by running the code locally or by deploying it to an ephemeral environment. The aims are to answer these questions:

  • Did we miss anything in the requirements for this item?
  • Do the changes make sense in context?
  • Did we miss anything in the automated tests?

Suites of repetitive manual "regression" tests are not necessary with proper automated testing and should be avoided.

User Acceptance Testing (UAT)

The recommended default is to perform user acceptance testing (UAT) for every feature before it is considered done. All products ultimately serve users, and some form of UAT is always relevant. If feature flags are used (see Creating fast flow), a feature may consist of multiple backlog items, and its UAT may be performed separately from the deployment of individual items.

User acceptance testing is usually performed by the Product Owner or delegated to a Subject Matter Expert (SME). The aim is not to find bugs, which should have been detected in automated and exploratory testing. Instead, the focus is on verifying that the product hangs together with the changes in context, including whether features are easy to understand and use.

Technical debt

It is important to keep on top of technical debt. If it is allowed to accumulate it can be very difficult to get it under control.

The best approach is a combination of improving things as you go and capturing improvement work in backlog items and prioritising them as part of your usual work. See Continuous Improvement.

Create tech tasks in the backlog to represent tech improvements that cannot be dealt with as part of delivering a feature. Work with the product owner to put these in the right order in the backlog, interleaved with user stories and other backlog items. This discussion will require you to explain and quantify the benefit of making each tech improvement. For example, "will reduce the chance of bugs in the checkout process" or "will improve delivery speed in the product search API".


CI/CD

Deployments should be fully automated using infrastructure as code, such as Terraform or CloudFormation, and triggered using a CI/CD system. Every deployable component should have its own CI/CD pipeline so that each can be individually deployed. CI/CD pipelines should be written as code, typically in YAML files.

The recommended steps are:

  1. Run unit and other automated tests.
    • Trigger on push to a feature branch when there is an open pull/merge request and on merge to the main branch.
  2. Build a potentially-deployable artefact and store it.
    • This single artefact should be used at every stage of the pipeline with no rebuilds to avoid the risk of testing subtly different artefacts at each stage.
  3. Automated deploy to dev environment on a successful build.
  4. (Optionally) run automated post-deployment smoke tests on dev.
  5. On manual trigger, deploy to staging.
  6. On manual trigger, deploy to production.
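The single-artefact rule in step 2 can be made concrete by recording the artefact's checksum at build time and verifying it before each deployment; the names here are illustrative:

```python
import hashlib


def checksum(artefact: bytes) -> str:
    """Fingerprint the built artefact so later stages can verify it."""
    return hashlib.sha256(artefact).hexdigest()


def promote(artefact: bytes, recorded_checksum: str, target_env: str) -> str:
    """Refuse to deploy anything other than the artefact that was built
    and tested: no rebuilds between stages."""
    if checksum(artefact) != recorded_checksum:
        raise ValueError(f"artefact differs from tested build; refusing {target_env}")
    return f"deployed to {target_env}"


built = b"app-v1.4.2"          # stands in for the real build output
record = checksum(built)       # stored alongside the artefact at build time
promote(built, record, "staging")  # same bytes, promotion allowed
```

Most CI/CD systems give you this for free if each stage downloads the stored artefact rather than rebuilding; the sketch just shows why that matters.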

CI/CD tips

  • Address build failures immediately
    • False or intermittent failures reduce confidence in the tests and can lead to genuine bugs being missed or to wasteful "safety net" processes on top.
  • Make failures clear and concise
    • Build failures should be easily available, clear and concise. Invest time in reducing unnecessary noise in the build output.
  • Favour fix-forward
    • Avoid roll-back scripts in favour of being able to quickly build and deploy a fix. Configure meaningful health checks and make use of the features of many serverless and container orchestration systems to validate a deployment as healthy before directing production traffic to it.
  • Run pipelines regularly
    • Even if you haven't made any changes! Augment builds triggered by code changes with regular scheduled builds, by default overnight. There may have been changes to shared code, infrastructure patches, dependencies, etc. It is vital to stay up to date and to discover any issues as soon as possible.