Refinement

Introduction

This section is part of build and consists of breaking down epics from the plan into backlog items that are ready for implementation and validation.

This is a set of defaults for teams to use. It is not mandatory: teams may do something different if they have good reason (see What this is and is not).

Delivery process overview

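Inputs

  • Epics from the epic roadmap.
  • Backlog items generated in previous refinement sessions or during implementation.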

Outputs

  • Backlog items ready for implementation.

Overview

Refinement is best done little and often, and the default is to have a session once or twice per week (see Up front vs just in time). The whole team should be involved in these sessions so that everyone has good visibility of what is coming up and has a chance to contribute. The goal is to confirm items are understood and ready for implementation. Work in smaller groups is usually needed before these sessions to add detail to items so that they are ready for review.

Refinement takes two inputs: epics from the epic roadmap, and backlog items generated in previous refinement sessions or during implementation. Refinement aims to take these inputs and generate a set of backlog items that are ready for implementation.

Iterative, not incremental

Focus on iterative delivery, rather than incremental, as described by Jeff Patton. He illustrated the difference using a painting analogy.

Incremental delivery means starting with a fully-formed idea of what the finished product will look like and delivering that bit by bit. This approach limits the amount of useful feedback which can be gained until later in delivery, and is not recommended.

Incremental

Iterative delivery, by comparison, means starting with a rough idea of what the finished product will look like and gradually refining that idea in response to what is learned during delivery. Early in delivery, you create a product that meets the basic goals, and it gets richer and more complete as more detail is added across the whole product, guided by what is learned during delivery. This approach maximises early learning and is recommended.

Iterative

Slicing epics

Epics from the epic roadmap must be sliced into individually demo-able backlog items which represent vertical slices of functionality. Start by capturing just the title for each item in a lightweight way, such as using a whiteboard, sticky notes or an online equivalent such as Miro. This step drives iterative delivery by helping to identify three categories of item contained in the epic being broken down:

  1. Clearly essential for the feature to make sense. Everyone agrees that you cannot deliver this feature without these items, so they must be in scope.
  2. Clearly "nice to have" embellishments. It is easy to agree that these slices of functionality should be taken out to form a new, lower priority epic and this should be done.
  3. Considered essential for the feature to be "successful". This is a subtle but important distinction from the first category and these are where the debate around "minimum viable" scope usually centres. Although these may be considered essential for a "successful" product, if the product would at least basically function without them then they should be taken out to form a new, lower priority epic. This allows the delivery team to move on more quickly to items from other epics which are clearly essential for the product to make sense, and are therefore a higher priority for the product as a whole than these items. Doing this does not mean that these items will not be implemented before launch, but it does buy options.

Once the set of items in scope for the epic has been agreed upon, add more detail to each (as described in the following section) to get them ready for implementation.

Add detail to backlog items

For a backlog item to be ready for implementation it must have enough detail.

There should be enough detail to avoid items routinely becoming blocked because they are missing information that can only be provided by people who are unavailable (see Up front vs just in time).

On the other hand, to avoid waste, an item should not have any more detail than necessary:

  • Excessive detail risks presupposing decisions that are better made as part of the implementation. This can mean the effort taken to create that detail is wasted or (worse) that effective implementation is hampered by sticking with suboptimal decisions.
  • Any effort in adding detail is only useful if that item goes through implementation and is ultimately released to production. Despite best efforts to prioritise the right items, and to only add detail just in time before implementation, sometimes things change and items end up being de-prioritised. When this happens, the effort of adding detail is wasted.

User stories should include:

  • Title, of form "Role can action".
    • e.g. "Unregistered user can search by title".
  • Summary of what is wanted, by whom and why.
    • e.g. "Unregistered user can search by title to find items they know the name of".
    • The exact format does not matter so long as it includes an indication of who has the need, what they need to do and why.
  • Visual designs
    • Where visual designs are required, these should either be attached to, or referenced from, the item as part of adding detail or created as part of the implementation, depending on how the team works (see Up front vs just in time).
  • Acceptance criteria stated as clear boolean "tests".
    • e.g. "When I search for 'dog' or 'Dog' I am shown all items whose title contains the character sequence 'dog', regardless of capitalisation and where these characters appear in words".
    • e.g. "When I search for 'dog' I am shown items whose title contains the word 'dog' prioritised over instances where those characters appear as part of another word".
    • The exact format does not matter so long as it specifies the action and expected outcome, and if relevant a context (e.g. "Given I have searched for dog, when I select 'sort alphabetically'...").
  • Test approach as a set of bullet points.
    • This should not state the obvious, for example, it is assumed that all code will have appropriate unit tests so this does not need to be stated.
    • Where it is not obvious which technique to use then it can help to make a provisional decision before diving into implementation.
    • For example, checking that search is case-independent involves testing several variations and can be effectively and most cheaply done in code level (e.g. unit) tests. But testing that the selected sort order is respected may include (depending on implementation details) verifying correct integration from UI down through the stack and may be best tackled with an automated UI test.
  • Technical approach as a set of bullet points.
    • Again, there is no need to state the obvious but where there are decisions to make it can help to at least do so provisionally before implementation. This allows decisions to be validated by peers before implementation starts and effort has potentially been wasted, and also helps provide the information needed so that any team member may pick up the item for implementation.
    • For example, sorting of search results could be done in the database, in the back-end logic or the front end. There are different trade-offs for each and which to choose will depend on the situation. A decision like this is worth making before starting to code.
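
The two example acceptance criteria for search can be expressed directly as code-level tests. The following is a minimal sketch, assuming titles are held in a plain Python list. The function name `search_by_title` and the sample titles are hypothetical and are used only for illustration.

```python
import re


def search_by_title(titles, query):
    """Return titles containing `query`, ignoring capitalisation.

    Titles where the query appears as a whole word are listed first,
    followed by titles where it only appears inside another word.
    """
    needle = query.casefold()
    whole_word = re.compile(r"\b" + re.escape(needle) + r"\b")
    matches = [t for t in titles if needle in t.casefold()]
    # sorted() is stable, so titles keep their original order within each group.
    return sorted(matches, key=lambda t: whole_word.search(t.casefold()) is None)


# Hypothetical sample data.
TITLES = ["Hot Dog Stand", "Dogma", "Cat Tales", "The Underdog", "dog walking"]

# Criterion 1: 'dog' and 'Dog' give the same results.
assert search_by_title(TITLES, "dog") == search_by_title(TITLES, "Dog")

# Criterion 2: whole-word matches come before partial matches.
print(search_by_title(TITLES, "dog"))
# ['Hot Dog Stand', 'dog walking', 'Dogma', 'The Underdog']
```

Because each variation is a one-line assertion, covering more capitalisation cases at this level is cheap, which is the reasoning behind choosing code-level tests for this criterion.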

Bug reports should include:

  • Title, of form "Error when context".
    • e.g. "Confusing message when username already exists".
  • Severity
    • e.g. critical, major, minor, trivial.
    • Clearly define what is meant by each of these severity levels to ensure they are consistently understood.
  • Environment where the bug was found, e.g. dev, staging, production.
  • Version number/ID of the system where the bug was found.
  • Steps to reproduce
    • e.g. "Attempt to register with a username which is already in use".
  • Expected result
    • e.g. "There is a clear indication of what the problem is so the user knows what to do next."
  • Actual result
    • e.g. "The message '500 Internal Server Error' is shown."
  • Evidence e.g. screenshot.

Other items, such as technical or process improvements, should include:

  • Title, of form "Benefit" or "Benefit, e.g. by possible solution".
    • e.g. "Improve visibility of test failures" or "Improve visibility of test failures e.g. by slack alert on shared channel".
    • Focus on the benefit, but where adding a possible solution adds clarity it can be helpful to include that too.
  • Problem to be solved
    • e.g. "Test failures are not visible to team, meaning they are not detected promptly".
  • Acceptance criteria, as for user stories.
  • Test approach, as for user stories.
  • Technical approach, as for user stories.

Individually demo-able backlog items

Backlog items should be individually deliverable and should individually add value that can be demonstrated. This value may be directly visible to internal or external users or may be in the form of new learning for the organisation.

The mnemonic INVEST is used to express the properties backlog items must have to be ready for implementation:

  • Independent. As far as possible this item does not depend on other items, so that items could in principle be delivered in any order. This allows items to be prioritised independently and (along with Valuable) ensures items represent vertical slices of functionality. The main allowable exception is where one item establishes a steel thread that others rely on. In this case, it must be clear which the leading item is and which items depend on it. Usually, careful naming provides an adequate indicator.
  • Negotiable. The formation of backlog items is iterative. It starts when forming the epic roadmap, where a very rough (and disposable) sketch of the likely backlog items is used to break the delivery down into epics. During refinement, this activity is repeated as the first stage of slicing epics. As detail is added, negotiation over the exact requirements for each item continues. As more is learned during implementation, the requirements may be tweaked further, often by slicing an item into smaller items or delivering something equivalent (or better!) but easier to implement.
  • Valuable. Each item must add user value. It does not matter if by itself this item does not add much value that a user would recognise, but it should visibly move in that direction. For example, a backlog item to create a placeholder product detail page that shows only the product name would be a visible step toward a basic but usable page, and would be considered valuable. This is called a vertical slice because it involves adding a little to every layer of the stack, from backing storage through back-end logic and to the UI. By contrast, implementing only a horizontal slice such as the backing storage of product details but not surfacing that in an externally visible way would not be considered valuable because there is no value that can be perceived from the outside.
  • Estimable. Even though you may not explicitly estimate backlog items, each should be defined in enough detail so you could in principle decide an estimate. If you could not estimate it then the item is too poorly defined to be ready for implementation. If the reason is that the requirements are not well known then this is usually a sign that the item should be sliced. If it is due to technical uncertainties it is often best to introduce a spike to rapidly explore options before committing to implementing the item, which should not be brought into implementation until the spike has provided confidence in the implementation approach.
  • Small. Within reason, smaller items are better for several reasons. See Slice items.
  • Testable. Testing is an essential part of implementation and validation. An item must be specified in enough detail so that it is clear how it will be tested. If it is not clear then the item is not well enough understood to be ready for implementation.

Slice items

Smaller items are:

  • Quicker to implement, freeing team members to pick up the next item from the top of the backlog, thereby buying options.
  • Quicker and easier to test effectively.
  • Quicker and easier to review the code effectively.

There should not be a large overhead to splitting items into vertical slices and delivering them as several smaller, independent items. If there is, then it is usually due to insufficient automation or waste, which should be addressed (see continuous improvement). However, if splitting functionality out of an item will not make it quicker to implement then it is probably a sign that it has already been split as small as is useful.

For example:

Start with an item called "Customer can provide shipping details". Slice this into several items:

  • Customer can enter basic delivery details
    • Details specified in the backlog item: delivery name, whether items should be gift wrapped and any special instructions.
    • Since these are all simple text entry or checkbox fields we expect them to be quick and straightforward to implement and validate, and splitting this further is not beneficial. If during implementation we discover unexpected complexity in any of these areas then we have the option to split at that stage.
  • Customer can enter contact number
    • Input validation is needed which creates extra effort, as do edge cases which expand testing.
  • Customer can enter email address
    • Input validation is needed which creates extra effort, as do edge cases which expand testing.
  • Customer can enter postal address
    • Involves a postcode lookup, which requires an external integration.
    • This item may in fact be worth splitting further if this is the first integration with a postcode lookup service.
  • Customer can select delivery service
    • Requires a lookup of available services and associated costs based on postal address.

Estimating backlog items

Estimating epics is essential, as discussed in Planning. But whether and how to estimate individual backlog items is more controversial.

The recommended default is to work toward not explicitly assigning estimates to backlog items, instead focusing on splitting them into thin vertical slices. If items are split small enough then the remaining variation in size is not worth quantifying.

However, teams often struggle with slicing stories finely and it is a skill that takes practice to develop. A good way to develop this skill and to set expectations in the team is the Elephant Carpaccio exercise, and all teams are encouraged to use this.

Where estimation is required by the customer, use story points (see What Are Story Points? and Planning Poker). If you can, limit the allowed estimate values to at most 1, 2, 3, 5 and 8, forcing an item to be sliced if larger than that. Over time, aim to shift the focus from estimation to slicing items small enough that estimating becomes irrelevant. Try reducing the allowed estimate range every couple of iterations until all you have left is 1!
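
The idea of a capped estimate scale that shrinks over time can be sketched as follows. This is an illustration only: the helper names `needs_slicing` and `shrink_scale` are hypothetical, and a team would normally apply this rule by convention during refinement rather than in code.

```python
# The allowed story point values suggested above.
ALLOWED_ESTIMATES = [1, 2, 3, 5, 8]


def needs_slicing(estimate, allowed):
    """An item estimated above the largest allowed value should be sliced."""
    return estimate > max(allowed)


def shrink_scale(allowed):
    """Drop the largest allowed value, stopping once only one value is left."""
    return allowed[:-1] if len(allowed) > 1 else list(allowed)


scale = ALLOWED_ESTIMATES
print(needs_slicing(8, scale))   # False: 8 is still allowed
scale = shrink_scale(scale)      # a couple of iterations later: [1, 2, 3, 5]
print(needs_slicing(8, scale))   # True: the same item must now be sliced
```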

Any discussion of estimates should note some important caveats:

  • Estimates are a forecast of the most likely outcome, not a commitment by the team or a guide for team members on how long to spend on an item.
  • Estimates go stale. Any estimate attached to an item should be refreshed if it was made some time ago. Since epics should only be broken down into backlog items just in time, there should not be a large number of items with old estimates, but it can happen and is something to be wary of.

Prioritising the backlog

The product owner has sole authority and responsibility to order the backlog, though they should take input from the whole team and from outside the team in doing so. A structured approach to prioritisation can help. In abstract terms, the priority of an item depends on the value it will generate and the effort involved in implementing it. Another important aspect to consider is keeping the number of in-progress epics from the roadmap (the "Epic WIP") low. Aim to order the backlog to focus on finishing in-progress epics before starting new ones, but be ready to slice an in-progress epic if it is realised that some parts of it are lower priority than initially thought.
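
One possible structured approach is sketched below. It assumes each item carries a relative value score and a relative effort score, ranks items from in-progress epics first, and then orders by value per unit of effort. The `Item` fields, the scoring scales and the value/effort ratio are all assumptions made for illustration. The guidance above does not prescribe a formula, and the resulting order is an input to the product owner's decision, not a replacement for it.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    epic: str
    value: int   # relative value, higher is better (hypothetical scale)
    effort: int  # relative effort, must be greater than zero (hypothetical scale)


def order_backlog(items, in_progress_epics):
    """Put items from in-progress epics first, then order by value per unit effort."""
    return sorted(
        items,
        key=lambda i: (i.epic not in in_progress_epics, -i.value / i.effort),
    )


# Hypothetical backlog, with "Checkout" as the only in-progress epic.
backlog = [
    Item("Customer can select delivery service", "Checkout", value=5, effort=5),
    Item("Unregistered user can search by title", "Search", value=8, effort=2),
    Item("Customer can enter postal address", "Checkout", value=8, effort=4),
]

for item in order_backlog(backlog, in_progress_epics={"Checkout"}):
    print(item.title)
# Customer can enter postal address
# Customer can select delivery service
# Unregistered user can search by title
```

The search item has the best value-to-effort ratio, but it is ordered last because finishing the in-progress epic keeps Epic WIP low.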

Validate the process

Prioritisation is likely to be working well when:

  • It is straightforward to connect deliverables to business goals and strategy.
  • Stakeholders see value in the deliverables shown during iteration review.
  • Actionable metrics improve.

Watch out for

Ownership and prioritisation of the backlog are a frequent source of anti-patterns, as many organisations with traditional structures lack a good understanding of the Product Owner role. In particular:

  • Be wary if it appears your Product Owner lacks the required authority and autonomy and is acting as a "proxy product owner". This disempowerment often originates outside the team, because decision making sits with a different stakeholder, but it can come from within the team if decisions are effectively made by a different team member. Either way, if you see this happening it is important to understand the problems it causes and to work to resolve it.
  • Make sure the focus stays on outcomes, not outputs. That is, ensure prioritisation is based on generating valuable deliverables, not on keeping people busy.

Buying options

Being ruthless when slicing epics, slicing items and prioritising allows you to "buy options".

A laser-sharp focus on doing just enough to make a feature work and then moving on to do the same for the next feature is the quickest route toward a basic but fully functioning system. As soon as you get there you can start to get truly valid feedback from users on where to invest more effort. It might be in exactly what you originally thought were the highest value areas, but it might be in areas you initially assumed were of lower value or even areas you had not thought of. By racing toward this truly minimum viable product you have accelerated your ability to act on feedback.

Covering the basics first also means you are in a better position to respond to unexpected changes such as moving deadlines or changing team size. Whether the idea appeals to you or not, you do at least have the option to allow users to start using a basic but complete system. By contrast, a shopping site with a perfect search function but no way to check out is not viable, and if you run out of time you have no product.