Techniques for Accurate(ish) User Story Estimates
TL;DR Accurate user story estimation is hard, but teams that do it well become consistent and predictable, which leads to more successful sprints and better long-term planning. Use techniques such as planning poker, reference user stories, and hyper-splitting to improve the accuracy of your team’s effort estimates, or don’t estimate at all (#noestimates, within reasonable bounds).
Imagine for a moment that someone asked you to estimate the number of skittles contained in the jar pictured above. How would you answer? To provide an accurate estimate, you’d probably want to know details like:
- How big is the jar (diameter and height)?
- How big is each skittle?
- How should I account for the gaps between skittles?
- Is there anything hidden inside the jar that I can’t see?
We’ve all played this game before — probably back in grade school when our number sense was a little wonky and we gave answers like “six” or “a bajillion”. This popular children’s game, known as the estimation jar, offers lessons about the difficulty of accurate estimation and establishes techniques for compartmentalizing the problem to achieve good enough results.
With a bit of imagination, some loose parallels can be drawn between this silly game and work as a developer within the Agile Scrum framework.
- The Jar = Team Velocity: Teams have a fixed capacity, the amount of work they can complete within an ideal two-week sprint. This capacity is based on factors like the number of developers, their experience level and familiarity with the technology, their understanding of the business context, the corporate and social environments in which they operate, and the processes that govern their work. Capacity can change over time as teams mature their knowledge and processes and as team members come and go, but it is assumed to be constant and reasonably known when planning an upcoming sprint.
- Skittles = Work Items: During Sprint Planning, teams fill their jar with user stories, discovery spikes, bugs, tasks, etc. until they’ve reached their capacity. In the development world, the skittles will vary in size — it’s the job of the Product Owner to optimize the mix of work to deliver maximum value in the sprint while staying within the team’s velocity bounds.
- Other Items = Unknowns: What if, buried within the skittles, lie other items that can’t be seen without dumping out the jar’s contents? These items represent the uncertainty inherent in everything we do. Some stories end up taking more time than originally anticipated, bugs arise that require immediate attention, dependencies fall through, we realize more discovery work is needed to learn about a particular topic, and in general things just never go as planned. Some of these scenarios bloat the size of skittles in the jar while others surface as unforeseen risks and blockers, each of which impacts the amount of work a team is able to ship in a sprint and can cause volatility in their velocity trend.
Estimating complex, uncertain processes is hard. Understanding the motivation behind estimates, what influences them, and techniques for improving them can help move teams along the path toward achieving more consistent results.
Why estimate user stories anyway?
Assigning an effort estimate to user stories helps teams measure and maintain a constant, sustainable development pace indefinitely (Agile Principle #8). Good estimates enable teams to:
- Determine a Team Velocity: Tracking the number of story points shipped each sprint allows teams to gauge their improvement over time as they analyze the peaks and valleys during Sprint Retrospective ceremonies and identify ways to optimize their processes.
- Plan Better Sprints: With a stable team velocity established, teams can confidently plan sprints based on a known capacity rather than relying on gut feel to decide when enough work has been pulled in. Good user story estimates and a reasonably accurate velocity help teams avoid the trap of overcommitting each sprint, which, if done consistently, can damage morale and erode trust both within and outside of the team.
- Predict Future Performance: Teams are often asked to look beyond the next sprint to plan their work on a monthly, quarterly, or yearly basis. An estimate of the work effort, combined with a stable team velocity, helps provide the long-term clarity leadership often strives for.
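To make the velocity and sprint planning ideas above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the three-sprint averaging window, and the sample backlog are hypothetical, and your team may prefer a different averaging rule or a different way of handling a story that doesn’t fit.

```python
from statistics import mean


def average_velocity(points_per_sprint, window=3):
    """Mean story points shipped over the most recent `window` sprints."""
    recent = points_per_sprint[-window:]
    return mean(recent) if recent else 0


def plan_sprint(backlog, capacity):
    """Pull stories in priority order until the next one would exceed capacity."""
    planned, committed = [], 0
    for title, points in backlog:
        if committed + points > capacity:
            break  # stop rather than skip ahead, preserving priority order
        planned.append(title)
        committed += points
    return planned, committed


# Hypothetical history and prioritized backlog, for illustration only.
history = [18, 22, 20, 24, 19]
backlog = [
    ("Login form", 5),
    ("Password reset", 3),
    ("Audit log", 8),
    ("Export CSV", 8),
    ("Dark mode", 3),
]

capacity = average_velocity(history)  # mean of 20, 24, 19
planned, committed = plan_sprint(backlog, capacity)
print(capacity, planned, committed)
```

Stopping at the first story that doesn’t fit is a deliberate simplification. In practice the Product Owner might swap in a smaller story to use the remaining capacity.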
What impacts estimates?
A myriad of factors affect user story effort estimates, including familiarity with the technology, external dependencies, and the specifics of a team’s Definition of Done. Given the incredibly complex work environment in which teams operate, it’s easy to see why providing accurate estimates is so difficult.
What is “good-enough”?
Understanding the level of accuracy an estimate should provide is critical to gauging what “good-enough” means in the environment in which your team operates. A team held to a tight deadline, perhaps due to impending legislation, contractual requirements, or business seasonality, may require more refined estimates than a team for which long-term planning accuracy is less valued, perhaps in favor of fast iterative development or because the end goal isn’t as clearly defined.
Establish this understanding with your Product Manager and Product Owner to ensure your team isn’t wasting time chasing estimation accuracy that isn’t needed. Try attaching confidence levels to your estimates by giving a range of delivery dates (e.g., P50 and P90 estimates) to indicate the uncertainty inherent in them, especially when estimating large work items like epics and initiatives or work that is slated to be completed far into the future (3+ sprints out).
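One possible way to produce P50 and P90 figures is to resample your own velocity history. The sketch below is an assumption rather than a prescribed method: it uses a simple bootstrap simulation, and the function name, trial count, seed, and sample numbers are all hypothetical. It returns sprint counts, which you can convert to delivery dates using your sprint length.

```python
import math
import random


def forecast_sprints(velocities, backlog_points, trials=10_000, seed=42):
    """Bootstrap past velocities to estimate sprints needed for a backlog.

    Returns (p50, p90): the sprint counts within which 50% and 90%
    of the simulated futures finished the backlog.
    """
    if backlog_points <= 0:
        return 0, 0
    if not velocities or min(velocities) <= 0:
        raise ValueError("need at least one sprint, all with positive velocity")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outcomes = []
    for _ in range(trials):
        remaining, sprints = backlog_points, 0
        while remaining > 0:
            remaining -= rng.choice(velocities)  # resample one past sprint
            sprints += 1
        outcomes.append(sprints)
    outcomes.sort()

    def percentile(p):
        # Nearest-rank percentile of the sorted simulation outcomes.
        rank = max(1, math.ceil(p / 100 * len(outcomes)))
        return outcomes[rank - 1]

    return percentile(50), percentile(90)


history = [21, 18, 25, 19, 23, 16]  # hypothetical points shipped per sprint
p50, p90 = forecast_sprints(history, backlog_points=120)
print(f"P50: {p50} sprints, P90: {p90} sprints")
```

The gap between the two numbers is the useful part: a wide gap signals a volatile velocity or a backlog that is too coarse to forecast with much confidence.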
When are user stories estimated?
Effort estimation is the final step in the user story creation process. It takes place after a detailed description, acceptance criteria, and test cases (if subscribing to Test Driven Development) have been documented and a team discussion has been held either during Backlog Refinement (preferred) or Sprint Planning. Teams should reach consensus regarding their understanding and acceptance of the deliverables of each user story before engaging in an estimation exercise — if there are obvious opportunities to split the user story into smaller chunks, do so now. Establishing this mutual understanding helps ensure that each team member’s estimate accounts for the same set of complexities.
Improving User Story Estimation Accuracy
Try the following techniques with your team to achieve more accurate user story estimates and establish a more stable, consistent velocity:
- Let Developers Do the Estimating: Unless you are a Product Owner or Scrum Master who also works on user stories, refrain from vocalizing an estimate. Instead, allow each team member to independently provide their estimate, and if their responses surprise you, seek to understand why. Is there something the team is not accounting for in the acceptance criteria or technical complexity? Is the system being over-designed? Avoid phrases like “I thought it would have been 2 story points”, as these can encourage developers to change their estimates to accommodate your beliefs, often to a lower value in favor of pulling more work into the sprint.
- Play Planning Poker: A game designed to facilitate blind estimates, Planning Poker discourages a Product Owner or Scrum Master from pushing estimates onto the development team. Each estimate stays hidden until all team members have voted. Once responses are revealed, if there is a large discrepancy between team members’ estimates, a discussion is held to understand everyone’s viewpoint. The team then re-estimates and repeats this process until consensus is achieved. (Note: if individual estimates differ by a single Fibonacci sequence value, try accepting the highest value and moving on rather than spending time debating small details.) Some free resources to help facilitate games include planITpoker and Scrumpoker Online.
- Use T-Shirt Sizing: Take the numbers out of estimation by assigning a visual scale to each story point value in the Fibonacci sequence. This helps dramatize the increasing scale between sizes and can help teams perform better comparative estimation. If t-shirts bore you, there are many other scales teams can use, or you can establish your own. Regardless of the scale you choose, document it somewhere and make it visible during estimation exercises. After a size has been chosen, convert it to the equivalent story point value.
- Use Reference Stories: Pick a couple of user stories the team has completed in the past that everyone agrees represent the complexity of a 2-point and a 5-point story. During Sprint Planning, reference these and ask whether the user story in question is a lower, equivalent, or greater effort than the reference. This form of comparative estimation is a powerful tool to quickly arrive at an accurate-enough estimate based on historical work the team has done. Further, using two user stories spaced by at least one number in the Fibonacci sequence helps to triangulate a reasonable estimate more quickly. For more complex systems, try identifying reference stories for each component that more accurately reflect the effort for that specific piece of the system. During your Sprint Retrospective, regularly review the usefulness of your reference stories and update them as necessary to reflect the current state.
- Hyper-split User Stories: It is not uncommon for the scope of a user story to be too large to reasonably understand all of the complexities involved with getting it to a “done” state and provide a reasonably accurate estimate. Hyper-splitting of stories refers to the practice of splitting all user stories until they are small enough to be completed in a few days or less. Smaller user stories help reduce cognitive load and, as a result, are easier to accurately estimate. They also tend to move through each phase of the software development lifecycle faster than larger user stories, resulting in reduced sprint risk through delivery of smaller chunks of functionality more frequently.
- Review Inaccurate Estimates: During the Sprint Retrospective, discuss any estimates that were significant departures from the team’s original prediction. Determine if a better reference story should be established and other reasons the initial estimate may have been inaccurate. Brainstorm a few ways the team can improve and pick one to focus on in the next sprint.
- Don’t Estimate At All: Yup, I said it. You won’t find it in many books, but there’s a faction within the Agile community that subscribes to the #noestimates belief that estimation is unnecessary. Teams often spend a lot of time debating user story estimates, time that could have been spent working on the user stories themselves. Rather than tracking story points, #noestimates teams track the number of user stories completed in a sprint and place an emphasis on always tackling the most important remaining user story next. This operating model is typically reserved for mature teams that are extremely adept at splitting user stories into very small chunks and have achieved a consistent, stable output. They still hold themselves accountable for getting work done and measure what they ship over time; it’s just done in a different way than on traditional Scrum teams.
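The Planning Poker rule from the list above (accept the highest value when votes sit within one Fibonacci step of each other, otherwise discuss and vote again) and the T-shirt conversion can be sketched in a few lines of Python. The scale, the size-to-points mapping, and the function name are assumptions for illustration; substitute whatever scale your team has documented.

```python
# Hypothetical story point scale; many teams use a similar modified Fibonacci set.
FIBONACCI = [1, 2, 3, 5, 8, 13, 21]

# Hypothetical T-shirt scale mapped to story points.
TSHIRT_POINTS = {"XS": 1, "S": 2, "M": 3, "L": 5, "XL": 8, "XXL": 13}


def resolve_round(votes):
    """Return the agreed estimate, or None if the team should discuss and re-vote."""
    if not votes:
        raise ValueError("no votes cast")
    # list.index raises ValueError for any vote that is not on the scale.
    positions = sorted(FIBONACCI.index(v) for v in votes)
    if positions[-1] - positions[0] <= 1:
        # Votes are identical or one step apart: take the highest and move on.
        return FIBONACCI[positions[-1]]
    return None  # large discrepancy: talk it through, then vote again


print(resolve_round([3, 5, 5, 3]))  # 5
print(resolve_round([2, 8, 3]))     # None
print(TSHIRT_POINTS["L"])           # 5
```

Returning None instead of an average is intentional in this sketch, since the point of the game is the conversation that a wide spread triggers.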
Each team’s goal should be to discover and deliver the best user experience through constant iterative development. It’s up to them to decide how to realize that vision — some teams prefer to plan their work by assigning story points while others opt to split user stories until they’re “small enough” and tackle as many as they can each sprint. Regardless of how the work is planned, teams should never be judged by the number of story points or user stories completed in a sprint; instead, they should be held accountable for the value delivered and happiness of their users. User stories and story points are not a representation of value — the proper alignment of the right work at the right time with user needs is.
ANSWER: The actual number of skittles in the jar was 5,192. How’d you do?
Try these techniques out during your next Backlog Refinement or Sprint Planning ceremony and let me know what you think. I’d love to hear your experiences (good or bad) in the responses!