Managing and Refining a Product Backlog for Agile Project Management

This skill teaches you how to build, prioritize, groom, and maintain a living product backlog so your agile team always has a clear, ordered queue of work ready for sprint planning.

Start by capturing every feature, bug, and improvement as a backlog item written in a clear user story format. Prioritize items by business value, urgency, and dependencies. Hold regular refinement sessions where the team breaks large items into smaller stories, writes acceptance criteria, and adds effort estimates. Keep the top 2-3 sprints' worth of items refined and ready, while leaving lower-priority items loosely defined until they rise in rank.

Outcome: You produce a continuously ordered, well-groomed backlog where the top items have clear user stories, acceptance criteria, and effort estimates, enabling your team to pull work into any sprint with zero ambiguity.

Jun 1, 2026

Synthesized from public framework references and reviewed for accuracy.

Product | Intermediate | 2-4 hours for initial backlog creation, then 1-2 hours per week for ongoing refinement

Prerequisites

Basic understanding of agile principles and sprint-based delivery
Familiarity with user story format (As a [user], I want [goal], so that [benefit])
Access to a backlog management tool (Jira, Linear, Shortcut, Notion, or even a spreadsheet)
Awareness of your product's strategic goals and target users

Overview

A product backlog is the single, ordered list of everything a team might build, fix, or improve. It is the backbone of any Agile delivery process. Without a healthy backlog, sprint planning becomes a negotiation session, developers start work without clear definitions of done, and stakeholders lose confidence that their priorities are being addressed. Managing a backlog well means the difference between a team that ships predictably and one that thrashes between half-finished initiatives.

This skill covers the full lifecycle of backlog management. You will learn how to capture raw ideas and requests, translate them into well-structured user stories with acceptance criteria, estimate the effort required, and maintain a living priority order that reflects your current business reality. The concrete artifact you produce is a prioritized backlog where the top section (roughly 2-3 sprints of work) contains items that are refined, estimated, and immediately actionable, while the middle and bottom sections hold progressively rougher items awaiting their turn.

Backlog management is not a one-time activity. It is a continuous discipline that connects product strategy to daily execution. A well-managed backlog absorbs change gracefully: when a new opportunity or urgent bug appears, you slot it into the right position rather than blowing up the current sprint. When stakeholders ask "when will feature X ship," you can point to its position in the backlog and the team's velocity to give a data-informed answer. This skill sits upstream of sprint planning and downstream of strategic planning, serving as the translation layer between what the business wants and what the team builds next.

The most common failure mode is neglect. Teams create a backlog at the start of a project and then let it rot. Items pile up without prioritization, stories lack acceptance criteria, estimates go stale, and the backlog becomes a graveyard of good intentions. Mastering this skill means committing to the rhythm of refinement and treating the backlog as a living document that gets better every week.

How It Works

A product backlog works on the principle of progressive elaboration. Items at the top are small, detailed, and ready for immediate development. Items in the middle are moderately defined. Items at the bottom are large, vague, and speculative. This gradient exists because investing time in detailing items that may never get built is waste, while pulling under-defined items into a sprint creates confusion and rework.

The ordering of a backlog is not a simple ranking by business value. It is a composite judgment that weighs four factors: the value the item delivers to users or the business, the urgency or time-sensitivity of the item, the dependencies and technical prerequisites that constrain sequencing, and the risk or learning the item unlocks. A high-value item may sit below a lower-value item if the lower-value item unblocks three other pieces of work. A time-sensitive regulatory change may leap above a high-value feature because missing the deadline has severe consequences. The product owner holds accountability for this ordering, but effective product owners make these trade-offs transparently with input from engineering, design, and stakeholders.

Refinement (sometimes called grooming) is the recurring ceremony where the team inspects upcoming backlog items and makes them sprint-ready. During refinement, the team does four things: they break large items (epics) into smaller stories that can be completed within a single sprint, they write or sharpen acceptance criteria so the definition of done is unambiguous, they estimate effort using a relative sizing method like story points or t-shirt sizes, and they surface dependencies or risks that might block execution. Refinement is not planning. The goal is not to assign work or commit to timelines. The goal is to ensure that when sprint planning arrives, the team can pull from a queue of well-understood items rather than spending planning time debating scope.

The backlog also functions as a communication tool within the broader Agile framework. Stakeholders can see where their requests sit in the order. The team can see how much refined work is ahead. Leadership can track themes and initiatives across multiple items. When the backlog is healthy, it creates alignment without requiring constant meetings. When the backlog is unhealthy, usually because it is bloated, poorly ordered, or full of ambiguous items, it generates confusion and erodes trust between product and engineering.

One mental model that helps: think of the backlog as an iceberg. The tip above the waterline (top 15-20 items) is crisp and clear. The mass below the waterline is intentionally blurry. Your job is to keep melting items upward through refinement at the same rate the team consumes them through sprints. If refinement falls behind consumption, the team starves for ready work. If refinement races too far ahead, you are over-investing in items that may change before they are built.

Step-by-Step

Step 1: Capture everything into a single list
Gather all known feature requests, bugs, technical debt items, experiments, and improvements into one backlog. Pull from every source: stakeholder emails, customer support tickets, sales feedback, engineering wish lists, and strategic planning documents. Each item gets a short title and a one-sentence description of what it is. Do not filter or judge at this stage.

The goal is completeness, not quality. You want to drain every channel so that nothing lives in someone's head or a side spreadsheet. Once captured, tag each item with a rough category (feature, bug, tech debt, experiment) and the requesting source so you can trace its origin later.
Tip: Set up an intake channel, like a shared Slack channel, a form, or a dedicated email alias, so new requests flow into the backlog automatically rather than accumulating in scattered places.
Step 2: Write user stories for each item
Convert each raw item into a user story using the format: 'As a [type of user], I want [action or capability], so that [benefit or outcome].' This format forces you to identify who benefits, what they need, and why it matters. For bugs, the format adapts: 'As a [user], I expect [correct behavior], but currently [broken behavior], which causes [impact].' For technical debt, describe the engineering user: 'As a developer, I want [refactor or improvement], so that [reduced complexity, faster deployments, fewer incidents].' Not every item will fit neatly into this format, but forcing the attempt surfaces ambiguity early. If you cannot articulate the user or the benefit, the item needs more discovery before it earns backlog real estate.
Tip: Write stories from the user's perspective, not the system's perspective. 'The system sends a notification' is a task. 'As a buyer, I want to receive a confirmation email so that I know my order was placed' is a story that a designer, developer, and tester can all understand.
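The story formats above can be captured as small templates. What follows is a minimal Python sketch, not a prescribed tool: the class names `Story` and `BugStory` and the helper `missing_fields` are hypothetical, introduced only to show how the template exposes a missing user or benefit.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """A feature story in the 'As a / I want / so that' format."""
    user: str
    goal: str
    benefit: str

    def render(self) -> str:
        return f"As a {self.user}, I want {self.goal}, so that {self.benefit}."


@dataclass
class BugStory:
    """A bug written in the adapted format: expected vs. current behavior."""
    user: str
    expected: str
    actual: str
    impact: str

    def render(self) -> str:
        return (
            f"As a {self.user}, I expect {self.expected}, "
            f"but currently {self.actual}, which causes {self.impact}."
        )


def missing_fields(story) -> list[str]:
    """Names of blank fields; a non-empty result suggests more discovery is needed."""
    return [name for name, value in vars(story).items() if not value.strip()]


# Illustrative usage with the confirmation-email story from the tip above.
story = Story("buyer", "to receive a confirmation email", "I know my order was placed")
print(story.render())
print(missing_fields(Story("buyer", "to export invoices", "")))  # the benefit is blank
```

If `missing_fields` returns anything, the item has not yet named who benefits or why, which is the signal the step describes for sending it back to discovery.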
Step 3: Add acceptance criteria to the top items
For every item that could plausibly enter a sprint within the next 3-4 weeks, write acceptance criteria. These are the specific, testable conditions that must be true for the story to be considered done. Include edge cases, error states, and any non-functional requirements like performance thresholds or accessibility standards. Acceptance criteria serve three audiences: they tell developers what to build, testers what to verify, and the product owner what to accept.

If a story has more than 8-10 acceptance criteria, it is probably too large and should be split.
Tip: Involve a developer and a QA person when writing acceptance criteria. Product owners tend to describe the happy path. Developers think about error handling. Testers think about edge cases. The three perspectives together produce criteria that prevent surprises mid-sprint.
Step 4: Estimate effort using relative sizing
Assign effort estimates to refined items using a relative scale, most commonly the Fibonacci sequence (1, 2, 3, 5, 8, 13) for story points or t-shirt sizes (XS, S, M, L, XL). The key principle is that estimates are relative, not absolute. You are not predicting hours. You are comparing items against each other.

Pick a well-understood, small story as your reference 'one-pointer,' and size everything relative to it. Use planning poker or async estimation: each team member privately selects an estimate, then all reveal simultaneously. When estimates diverge by more than two Fibonacci numbers, discuss the outlier perspectives. Often, the person with the highest estimate sees a risk or complexity the others missed. After discussion, re-estimate.

Items estimated at 13 or higher should be flagged for splitting before they enter a sprint.
Tip: Track your team's actual velocity (total story points completed per sprint) over 3-5 sprints. Use the rolling average to forecast how much backlog the team can consume. This converts abstract story points into concrete sprint capacity.
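The estimation rules in this step can be sketched in a few lines of Python. This is an illustration under stated assumptions: "more than two Fibonacci numbers apart" is interpreted here as more than two positions apart on the scale, and the function names are hypothetical.

```python
FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13]
SPLIT_THRESHOLD = 13  # from the text: 13 or higher is flagged for splitting


def needs_discussion(votes: list[int], max_gap: int = 2) -> bool:
    """True when the lowest and highest votes are more than max_gap scale steps apart."""
    positions = [FIBONACCI_SCALE.index(vote) for vote in votes]
    return max(positions) - min(positions) > max_gap


def needs_split(estimate: int) -> bool:
    """True when an agreed estimate is large enough to split before a sprint."""
    return estimate >= SPLIT_THRESHOLD


def rolling_velocity(completed_points: list[int], window: int = 3) -> float:
    """Average points completed over the most recent `window` sprints."""
    recent = completed_points[-window:]
    return sum(recent) / len(recent)


# Votes of 2, 3, and 13 span four steps on the scale, so the outliers talk it through.
print(needs_discussion([2, 3, 13]))          # True
print(needs_discussion([2, 3, 5]))           # False
print(rolling_velocity([30, 34, 32, 36]))    # 34.0 (last three sprints)
```

The window of three sprints is only a default; the tip above suggests anywhere from three to five.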
Step 5: Prioritize by value, urgency, dependencies, and risk
Order the entire backlog from top to bottom. This is the product owner's primary responsibility, though it should be informed by input from the team and stakeholders. For each item, assess four dimensions:

Value: how much does this item matter to users or the business?
Urgency: is there a deadline, a market window, or a cost of delay?
Dependencies: does this item unblock other work, or is it blocked by something else?
Risk: does this item reduce uncertainty, test a hypothesis, or address a technical risk?

Items that score high on multiple dimensions rise to the top. Items that score low on all four sink to the bottom or get removed entirely. Avoid the trap of ordering purely by value. A medium-value item with high urgency and dependency-unlocking power may deserve a higher position than a high-value item with no time pressure.
Tip: Use a lightweight scoring method like WSJF (Weighted Shortest Job First), where you divide the combined score of value, urgency, and risk reduction by the item's effort estimate. This naturally prioritizes high-value, low-effort items and deprioritizes low-value, high-effort ones.
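The WSJF calculation described in the tip can be sketched as follows. This is a minimal illustration, assuming each dimension is scored on a 1-10 scale and effort is a relative estimate; the `Item` class and the three sample items are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    value: int           # assumed 1-10 scale
    urgency: int         # assumed 1-10 scale
    risk_reduction: int  # assumed 1-10 scale
    effort: int          # relative estimate, must be greater than zero


def wsjf(item: Item) -> float:
    """Combined score of value, urgency, and risk reduction divided by effort."""
    return (item.value + item.urgency + item.risk_reduction) / item.effort


def order_by_wsjf(items: list[Item]) -> list[Item]:
    """Highest WSJF first."""
    return sorted(items, key=wsjf, reverse=True)


# Hypothetical sample data.
backlog = [
    Item("New dashboard", value=9, urgency=3, risk_reduction=2, effort=13),
    Item("Regulatory export", value=5, urgency=9, risk_reduction=3, effort=5),
    Item("Fix login bug", value=6, urgency=7, risk_reduction=4, effort=2),
]

for item in order_by_wsjf(backlog):
    print(f"{wsjf(item):5.2f}  {item.title}")
```

Note that the score does not encode dependencies. An item that unblocks other work may still need to be moved up by hand after sorting, which is consistent with treating the score as an input to the product owner's judgment rather than a replacement for it.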
Step 6: Hold regular refinement sessions
Schedule a recurring refinement session, typically once per week for 60-90 minutes, or twice per week for 30-45 minutes each. The attending group includes the product owner, the development team (or a subset), and optionally a designer. The session agenda follows a consistent pattern: review any newly added items and write stories for them, refine acceptance criteria on items approaching the top of the backlog, estimate any unestimated items in the upcoming sprint window, split any items that are too large, and re-order if priorities have shifted since the last session.

Refinement is not a status meeting. Come prepared with items queued for discussion, and leave with items that are measurably closer to sprint-ready. At any given time, 1.5-2 sprints' worth of backlog items should be fully refined, estimated, and ready to pull.
Tip: Timebox ruthlessly. If the team spends more than 10 minutes on a single item without resolution, it usually means the item needs more product discovery (customer research, design exploration, technical spike) before it can be refined. Park it, assign the discovery work, and revisit next session.
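One way to check whether refinement is keeping pace is to measure the "ready runway": refined points at the top of the backlog divided by velocity. The sketch below is an assumption-laden illustration. The `BacklogItem` fields, the readiness rule (story, acceptance criteria, and estimate all present), and the choice to stop counting at the first unready item are all hypothetical design choices, not part of any framework.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BacklogItem:
    title: str
    story: str = ""
    acceptance_criteria: list[str] = field(default_factory=list)
    estimate: Optional[int] = None  # story points; None means unestimated


def is_ready(item: BacklogItem) -> bool:
    """Assumed readiness rule: story, acceptance criteria, and estimate are all present."""
    return bool(item.story) and bool(item.acceptance_criteria) and item.estimate is not None


def ready_runway(ordered_backlog: list[BacklogItem], velocity: float) -> float:
    """Sprints of ready work at the top of the backlog, counted until the first unready item."""
    points = 0
    for item in ordered_backlog:
        if not is_ready(item):
            break
        points += item.estimate
    return points / velocity


def runway_status(runway: float, low: float = 1.5, high: float = 2.0) -> str:
    """Compare the runway with the 1.5-2 sprint target."""
    if runway < low:
        return "behind"
    if runway > high:
        return "ahead"
    return "on target"


# Hypothetical backlog: two ready items, then an unrefined one that ends the count.
backlog = [
    BacklogItem("Export invoice as PDF", "As a freelancer, ...", ["PDF matches preview"], 5),
    BacklogItem("Payment reminders", "As a freelancer, ...", ["Reminder sent after 7 days"], 8),
    BacklogItem("Reporting overhaul"),
    BacklogItem("Currency settings", "As a freelancer, ...", ["Default currency saved"], 3),
]
runway = ready_runway(backlog, velocity=10)
print(runway, runway_status(runway))
```

Stopping at the first unready item is deliberately strict: a ready item that sits below an unready one cannot be pulled in order without skipping ahead.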
Step 7: Prune and maintain the backlog continuously
A healthy backlog has 40-80 items for a typical team. If your backlog has 200+ items, it is a dumping ground, not a planning tool. At least once per quarter, conduct a backlog pruning session. Review every item below the top 30 and ask: is this still relevant? Has the context changed? If an item has sat untouched in the backlog for 6 months, it is almost certainly not important enough to keep. Close it with a note explaining why, and reassure stakeholders that closed does not mean forgotten. If conditions change, the item can be re-created with fresh context.

Beyond pruning, watch for duplicates that accumulate as different people submit similar requests. Merge them, preserve the context from each, and link to the canonical story.
Tip: Create a 'parking lot' or 'icebox' status for items that are interesting but not actionable in the next quarter. This keeps the active backlog lean while preserving ideas that might become relevant later.
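The pruning rule in this step can be expressed as a simple filter. This is a sketch under assumptions: the `Item` fields are hypothetical, "6 months" is approximated as 183 days, and the output is a list of candidates for human review rather than an automatic close.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Item:
    title: str
    rank: int           # 1 = top of the backlog
    last_touched: date  # last edit, comment, or re-ranking


def prune_candidates(items, today, keep_top=30, stale_after_days=183):
    """Items below the top `keep_top` that have not been touched for about 6 months."""
    cutoff = today - timedelta(days=stale_after_days)
    return [item for item in items if item.rank > keep_top and item.last_touched < cutoff]


# Hypothetical sample data.
items = [
    Item("Core checkout fix", rank=5, last_touched=date(2025, 1, 1)),   # top 30: always kept
    Item("Old idea", rank=45, last_touched=date(2025, 10, 1)),          # stale
    Item("Recent request", rank=60, last_touched=date(2026, 4, 15)),    # still fresh
]

for item in prune_candidates(items, today=date(2026, 6, 1)):
    print("Review for closing:", item.title)
```

The same filter can feed an icebox status instead of a close, as described in the tip above.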
Step 8: Communicate backlog status to stakeholders
The backlog is a communication artifact, not just a development queue. Share a summary with stakeholders at a regular cadence, typically biweekly or monthly. The summary should answer three questions: what is the team working on now, what is coming next, and what has been deprioritized or removed. Use themes or epics to group items into business-meaningful categories rather than listing individual stories.

Include velocity data and a rough forecast (not a commitment) of when key themes might be addressed based on current ordering and throughput. This transparency builds trust, reduces the volume of one-off status requests, and gives stakeholders a structured channel to challenge priorities rather than lobbying the product owner in hallway conversations.
Tip: Use a simple roadmap view (Now / Next / Later) mapped to your backlog ordering. Stakeholders care about themes and timelines, not individual story points. Translate backlog positions into language they understand.
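A Now / Next / Later view can be derived from backlog order and velocity. The sketch below is one possible mapping, not a standard: it assumes "Now" means work that finishes within the first sprint, "Next" within the following two, and "Later" everything after. The function name, the horizon parameter, and the sample themes are hypothetical.

```python
import math


def roadmap_buckets(ordered_items, velocity, next_horizon=3):
    """Group themes into Now / Next / Later.

    ordered_items: list of (theme, points) tuples in backlog order.
    An item's bucket comes from the sprint in which it would finish,
    based on cumulative points and the team's velocity.
    """
    buckets = {"Now": [], "Next": [], "Later": []}
    cumulative = 0
    for theme, points in ordered_items:
        cumulative += points
        finish_sprint = math.ceil(cumulative / velocity)
        if finish_sprint <= 1:
            bucket = "Now"
        elif finish_sprint <= next_horizon:
            bucket = "Next"
        else:
            bucket = "Later"
        if theme not in buckets[bucket]:
            buckets[bucket].append(theme)
    return buckets


# Hypothetical ordered backlog, grouped by theme.
ordered = [
    ("Onboarding", 13),
    ("Onboarding", 8),
    ("Payments", 13),
    ("Reporting", 21),
    ("Integrations", 40),
]
print(roadmap_buckets(ordered, velocity=30))
```

Because the result is driven by ordering and throughput, it should be presented as a forecast, not a commitment, in line with the guidance above.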

Examples

Example: Early-stage B2B SaaS startup with a 5-person product team

A seed-stage startup building an invoicing tool for freelancers has a 4-person engineering team plus a designer. The CEO acts as product owner. They have 150 items scattered across a Notion doc, a Slack channel, and the CEO's notebook. There is no formal backlog, no estimates, and sprint planning is a 3-hour debate every two weeks.

The team consolidates all 150 items into a single Linear board in one 2-hour session. Each item gets a one-line title and a source tag (customer request, internal idea, bug report). The CEO then spends 90 minutes sorting the list into rough priority order, grouping items under 5 epics: onboarding, invoice creation, payment tracking, reporting, and integrations. In the first refinement session, the team picks the top 15 items and writes user stories with acceptance criteria for each.

They use planning poker to estimate the top 15, using a completed 'create invoice' story as their reference 1-pointer. Seven items score 8 or higher and are split into smaller stories. After splitting, the top of the backlog contains 22 refined, estimated items representing about 3 sprints of work at the team's estimated capacity of 35 points per sprint. The next sprint planning session takes 45 minutes instead of 3 hours because the team simply pulls from the refined queue.

After 3 sprints, the CEO prunes the backlog from 150 items to 65, closing 85 items that were duplicates, outdated, or clearly below the priority line for the next 6 months.

Example: Mid-size e-commerce company with multiple stakeholder groups

An e-commerce company with 200 employees has three product teams (buyer experience, seller tools, and platform infrastructure). The buyer experience team has a backlog of 300+ items with competing requests from marketing, merchandising, customer support, and the VP of Product. Priority conflicts are escalated to the VP weekly, consuming hours of leadership time.

The product manager implements a WSJF scoring model. Each item receives scores from 1-10 on three dimensions: business value (based on revenue impact and customer satisfaction data), time criticality (deadline-driven or competitive pressure), and risk reduction (does it reduce uncertainty or address a known technical risk). The effort estimate, already captured in t-shirt sizes, is converted to a numeric scale (S=2, M=5, L=8, XL=13). WSJF is calculated as (value + urgency + risk) / effort.

The product manager publishes the scored and ordered backlog to all stakeholders via a shared dashboard. When two requests compete for the same position, the numbers make the trade-off transparent without requiring VP intervention. The team holds two 45-minute refinement sessions per week, refining 6-8 items per session. Over two months, priority escalations to the VP drop from 4 per week to 1 per month because stakeholders can see the reasoning behind the ordering.

Example: B2C mobile app team managing a mix of features, bugs, and tech debt

A fitness app with 500,000 monthly active users has a 7-person mobile team. Their backlog has 90 items: 40 feature requests, 35 bugs, and 15 tech debt items. The engineering lead complains that tech debt never gets addressed because features always win. Bugs accumulate and user ratings are dropping.

The product owner restructures the single backlog by applying consistent prioritization criteria across all item types, with bugs and tech debt items written as stories that state their impact. This framing makes the user impact of bugs visible alongside feature requests. The team agrees to a sustainable allocation: roughly 60% features, 20% bugs, 20% tech debt each sprint. During refinement, the product owner interleaves bug and tech debt items into the ordered backlog at positions that reflect their actual impact.

The GPS migration (tech debt) and the route-saving bug both rank in the top 10 because they directly cause the 1-star reviews that threaten growth. The engineering lead reports that the codebase improvements are making feature delivery faster.

Example: Large enterprise team adopting agile with legacy constraints

A financial services company is transitioning a 15-person team from waterfall to agile. They have a 200-page requirements document that was the basis for a 12-month project plan. Leadership expects the same scope delivered but in an agile fashion. The team has never written a user story or estimated in story points.

The product owner and a scrum master decompose the requirements document into epics and stories over a one-week workshop. The 200-page document yields 12 epics and roughly 180 stories. Rather than refining all 180, they focus only on the first 2 epics (40 stories) that represent the foundational capabilities. The team runs a calibration session where they estimate 10 well-understood stories from the first epic to establish a shared baseline for relative sizing.

With an average of 5 points per story and 40 stories in the first 2 epics, they estimate 200 points of work. After the first 3 two-week sprints, the 15-person team's velocity settles at 45 points per sprint. The product owner uses this data to show leadership that the original 12-month scope maps to roughly 800 story points, which at current velocity requires approximately 18 sprints (9 months) if scope remains fixed. This transparent data enables a productive conversation about scope trade-offs rather than a political argument about timelines.

The team refines only 2-3 sprints ahead, treating the remaining 140 stories as rough placeholders that will be refined as they approach the top of the backlog.
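The forecast arithmetic in this example can be reproduced with a short sketch. The figures (800 points of scope, 200 points in the first two epics, velocity of 45, two-week sprints) come from the example; the function name is hypothetical.

```python
import math


def sprints_needed(scope_points: int, velocity: float) -> int:
    """Whole sprints required to deliver the scope at a steady velocity."""
    return math.ceil(scope_points / velocity)


SPRINT_WEEKS = 2
VELOCITY = 45

full_scope = sprints_needed(800, VELOCITY)   # 800 / 45 = 17.8, rounded up to 18
first_epics = sprints_needed(200, VELOCITY)  # 200 / 45 = 4.4, rounded up to 5

print(f"Full scope: {full_scope} sprints, {full_scope * SPRINT_WEEKS} weeks")
print(f"First two epics: {first_epics} sprints, {first_epics * SPRINT_WEEKS} weeks")
```

Eighteen two-week sprints is 36 weeks, which the example rounds to 9 months. The forecast assumes velocity stays flat and scope stays fixed, which is why it is best presented as a basis for the trade-off conversation and not as a delivery date.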

Best Practices

Keep the backlog ordered, not just prioritized into tiers. A strict top-to-bottom ordering forces real trade-off decisions. Tier-based systems ('high / medium / low') let everything accumulate in the 'high' bucket, which defeats the purpose of prioritization. When the product owner can point to position #14 versus position #15, the ordering carries real meaning and enables clearer conversations with stakeholders about what comes first.
Refine items to different levels of detail based on their position. The top 15-20 items should have complete user stories, acceptance criteria, and estimates. Items in the middle third need a clear story and rough sizing but can skip detailed acceptance criteria. Items in the bottom third only need a title and a sentence describing the intent. This gradient prevents wasted effort on items that may never be built while keeping the top of the backlog sprint-ready at all times.
Write acceptance criteria as testable conditions, not vague descriptions. 'The search should be fast' is not testable. 'Search results return within 200ms for queries under 50 characters on a dataset of 100,000 records' is testable. Vague criteria create arguments during sprint review about whether a story is actually done, which erodes team morale and product owner credibility.
Limit work-in-progress at the backlog level by capping how many epics are active simultaneously. If your team is working across 8 different epics, context-switching is high and none of them are progressing quickly. Aim for 2-3 active epics at a time. Finish one before starting another. This constraint is uncomfortable for stakeholders who want everything in progress, but it dramatically improves throughput and reduces the average age of backlog items.
Use spikes (timeboxed research tasks) for items where the team cannot estimate confidently. A spike is a story whose deliverable is information, not code. 'Spend 4 hours investigating whether our database can support real-time search at 10x current volume, and document findings.' The spike output informs whether the original story is feasible, how large it is, and whether it should be split. Without spikes, teams either refuse to estimate (stalling refinement) or guess wildly (undermining velocity data).
Track and visualize backlog health metrics. Monitor the number of items in the backlog over time (should be stable or slowly growing, not exploding), the percentage of items in the top 20 that are fully refined, the average age of items, and the ratio of new items added per sprint versus items completed. These metrics surface problems early. A backlog that grows 30% faster than the team consumes it will eventually collapse under its own weight.
Separate discovery work from delivery work in the backlog. Items that need customer research, design exploration, or technical investigation should be tracked as discovery items with their own completion criteria. Mixing discovery and delivery in the same backlog creates confusion about what 'ready' means and leads to stories entering sprints before they are truly understood. Some teams use a dual-track approach with a discovery backlog feeding into the delivery backlog.
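The backlog health metrics listed among the practices above can be computed from a handful of fields. This is a minimal sketch: the `Item` fields, the function name, and the sample data are hypothetical, and the function assumes a non-empty backlog.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    created: date
    refined: bool  # has story, acceptance criteria, and estimate


def backlog_health(ordered_items, added, completed, today, top_n=20):
    """Summary metrics for a non-empty, ordered backlog.

    added / completed: items added and items completed during the last sprint.
    """
    top = ordered_items[:top_n]
    ages = [(today - item.created).days for item in ordered_items]
    return {
        "total_items": len(ordered_items),
        "top_refined_pct": 100 * sum(item.refined for item in top) / len(top),
        "average_age_days": sum(ages) / len(ages),
        # Above 1.0 means the backlog is growing faster than the team consumes it.
        "intake_ratio": added / completed if completed else float("inf"),
    }


# Hypothetical sample data.
items = [
    Item(date(2026, 5, 22), refined=True),
    Item(date(2026, 5, 12), refined=True),
    Item(date(2026, 5, 2), refined=False),
    Item(date(2026, 4, 22), refined=False),
]
print(backlog_health(items, added=13, completed=10, today=date(2026, 6, 1), top_n=2))
```

An intake ratio of 1.3 corresponds to the "grows 30% faster than the team consumes it" warning sign, so tracking it sprint over sprint surfaces the problem before the backlog becomes unmanageable.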

Common Mistakes

Treating the backlog as a feature wish list that only grows

Correction

When every request is added and nothing is ever removed, the backlog balloons to hundreds of items, most of which will never be built. The signal to watch for is a backlog where the bottom 50% of items have not been touched in 3+ months. Fix this by establishing a quarterly pruning cadence. Close items that have been inactive for two quarters. Reassure stakeholders that closing an item does not delete the idea, and it can be re-opened with fresh context if circumstances change.

Writing stories that are too large to complete in a single sprint

Correction

Teams often write stories at the epic level ('As a user, I want a dashboard') because it feels faster than decomposing. Large stories hide complexity, make estimation unreliable, and prevent the team from demonstrating incremental progress. During refinement, apply the INVEST criteria: if a story cannot be completed and demonstrated within one sprint, split it. Split by user workflow, by data scope, by platform, or by happy path versus edge cases. Each split story should deliver independently testable value.

Letting stakeholders dictate priority order without trade-off conversations

Correction

When every stakeholder's request is marked 'urgent' and placed at the top, the backlog becomes a political document rather than a planning tool. This happens when the product owner tries to keep everyone happy instead of making hard ordering decisions. The tell is a backlog where the top 10 items come from 10 different stakeholders with no thematic coherence. Fix this by making trade-offs explicit: 'If we move feature X to position #3, feature Y drops to position #8.' Forcing transparency into priority decisions shifts the conversation from lobbying to reasoning.

Skipping refinement sessions when the team feels busy

Correction

Skipping refinement creates a vicious cycle: the next sprint planning session takes twice as long because nothing is refined, the team pulls in poorly understood stories, quality drops, and the following sprint is even more pressured. The warning sign is sprint planning sessions that exceed 2 hours or stories that require mid-sprint scope clarification from the product owner. Protect refinement as a non-negotiable calendar hold. If the full team cannot attend, send a smaller group. The cost of one hour of refinement is far less than the cost of a sprint spent building the wrong thing.

Estimating in hours or calendar days instead of relative points

Correction

Absolute time estimates feel intuitive but are consistently unreliable because they ignore variation in individual speed, interrupt load, and context-switching costs. Teams that estimate in hours tend to treat estimates as commitments, which creates pressure to cut corners or hide overruns. The fix is to adopt relative sizing (story points or t-shirt sizes) and track velocity over time. Velocity naturally accounts for all the real-world factors that make hour-based estimates inaccurate. When stakeholders ask 'how many hours,' translate through velocity: 'This is a 5-point story, and we complete about 30 points per sprint, so it represents roughly one-sixth of a sprint's capacity.'

Creating separate backlogs for different types of work (features, bugs, tech debt)

Correction

Splitting work into multiple backlogs makes it impossible to make informed trade-offs between a new feature, a critical bug, and a necessary refactor. The product owner cannot weigh a performance improvement against a new integration if they live in different lists. The result is that tech debt and bugs get systematically deprioritized because they are invisible to stakeholders reviewing the 'feature backlog.' Maintain one ordered backlog per team. Tag items by type for filtering and reporting, but the ordering must reflect the true priority across all types of work.

Frequently Asked Questions

How many items should a healthy product backlog contain?

For a single team, aim for 40-80 items total. This is enough to provide several sprints of runway without becoming overwhelming. If your backlog exceeds 150 items, schedule a pruning session. Items that have sat untouched for 6+ months are almost certainly not important enough to maintain. A bloated backlog creates cognitive overhead, makes prioritization harder, and gives the false impression that every idea is being tracked and will eventually be built.

How often should backlog refinement happen?

Most teams benefit from one or two refinement sessions per week, totaling 60-90 minutes. The goal is to keep 1.5-2 sprints' worth of items fully refined at all times. If your sprint planning sessions frequently devolve into scope discussions and story-writing, you are not refining enough. If your team is spending more than 10% of its sprint capacity on refinement, you may be over-refining or refining items too far down in the backlog.

Should I manage the backlog before or after sprint planning?

Refinement should happen continuously between sprint planning sessions, not during them. Sprint planning assumes the top of the backlog is already refined and estimated. The planning session is for selecting which refined items to commit to and discussing implementation approach, not for writing stories or debating acceptance criteria. If you find yourself doing refinement work during planning, shift that work to dedicated refinement sessions earlier in the sprint. See [running sprint planning](/skills/running-sprint-planning-and-execution) for how these ceremonies connect.

How do I handle conflicting priorities from multiple stakeholders?

Use a transparent scoring model like WSJF (Weighted Shortest Job First) that evaluates each item on objective dimensions: business value, time criticality, risk reduction, and effort. Publish the scores and the resulting order so stakeholders can see the reasoning behind priority decisions. When a stakeholder disagrees with an item's position, the conversation shifts from 'my feature is more important' to 'I believe the business value score should be higher because of these specific reasons.' This does not eliminate conflict, but it moves it from political lobbying to evidence-based discussion.

Why does my backlog keep growing faster than the team can deliver?

This is almost always an intake problem, not a delivery problem. Every stakeholder, customer, and team member can add items, but only the development team can remove them through completion. Implement intake discipline: require every new item to include a user story, a rough value assessment, and a requesting source before it enters the backlog. Hold a weekly triage where the product owner reviews new additions and either accepts them into the ordered backlog or parks them in an icebox for quarterly review. Also verify that you are pruning regularly. Closing stale items is not admitting defeat. It is maintaining a usable planning tool.

How do I split a user story that feels too big but I cannot see how to break it down?

Apply one of these splitting patterns: split by workflow step (registration vs. login vs. password reset), split by data type or input method (manual entry vs. CSV upload vs. API sync), split by user role (admin view vs. end-user view), split by business rule (basic pricing vs. discount pricing vs. bulk pricing), or split by happy path vs. edge cases (build the core flow first, handle errors in a follow-up story). The test for a good split is that each resulting story delivers independently demonstrable value and can be shipped without the other stories being done.

Should I use story points, t-shirt sizes, or something else for estimation?

Both story points and t-shirt sizes work well because they enforce relative sizing rather than absolute time predictions. Story points (Fibonacci: 1, 2, 3, 5, 8, 13) provide more granularity and make velocity calculation straightforward. T-shirt sizes (XS, S, M, L, XL) are simpler for teams new to estimation and reduce arguments about the difference between a 3 and a 5. The method matters less than consistency. Pick one approach, use it for at least 5 sprints to establish a baseline velocity, and do not switch unless you have a clear reason. The worst approach is estimating in hours, because it creates a false sense of precision and invites micromanagement.
