Designing Assumption Tests and Experiments for Solutions: A Product Manager Certification Skill

This skill teaches you how to surface the riskiest assumptions behind each proposed solution on your Opportunity Solution Tree and design lightweight experiments—prototypes, fake doors, or concierge tests—to validate them quickly before committing engineering effort.

Start by listing every assumption behind a proposed solution, then rank each by risk—how uncertain it is and how catastrophic failure would be. For the riskiest assumptions, design the smallest possible experiment (prototype, fake door, concierge test, or data analysis) that produces a clear pass/fail signal. Define your success criteria before running the test, timebox it, and use the evidence to decide whether to proceed, pivot, or kill the solution.

Outcome: You will be able to systematically de-risk any proposed solution by identifying its most dangerous assumptions and designing fast, cheap experiments that produce actionable evidence—reducing wasted engineering effort and increasing your team's confidence in what to build.

Synthesized from public framework references and reviewed for accuracy.

Product · Intermediate · 60-90 minutes per solution

Prerequisites

  • Generating Multiple Solutions for Each Opportunity
  • Basic understanding of the Opportunity Solution Tree framework
  • Familiarity with common experiment types (prototypes, fake doors, concierge tests, Wizard of Oz)
  • Access to customer research or user feedback channels

Overview

Every solution your team proposes rests on a stack of assumptions: assumptions about customer desirability, technical feasibility, business viability, and usability. Most teams skip straight from idea to backlog, only discovering broken assumptions after weeks or months of build time. Designing assumption tests is the discipline that prevents this waste.

Within the Opportunity Solution Tree framework, assumption testing sits at the bottom of the tree—directly beneath each candidate solution. It is the mechanism that turns discovery from opinion-driven debate into evidence-driven decision-making. Teams pursuing a product manager certification will find this skill essential because it operationalizes the core discovery loop: generate solutions, test assumptions, learn, and iterate.

This skill is not about running A/B tests on shipped features. It is about designing the smallest, fastest experiments that tell you whether a solution is worth building at all. When done well, you can test multiple solutions in parallel for a fraction of the cost of building even one of them.

How It Works

The core logic is simple: every solution is a bet, and every bet has assumptions. Some assumptions are low-risk (you already have strong evidence they're true), while others are high-risk (they're uncertain and would invalidate the entire solution if wrong). The goal is to find and test the high-risk assumptions first.

Assumptions generally fall into four categories:

  • Desirability assumptions: Will customers want this? Will they switch from their current behavior?
  • Usability assumptions: Can customers figure out how to use this? Will they complete the key workflow?
  • Feasibility assumptions: Can we actually build this with our current technology, data, and team?
  • Viability assumptions: Does this work for our business model? Can we afford the support costs? Does it comply with regulations?

Once you've mapped assumptions, you rank them on a 2×2 of uncertainty (how little evidence you have) and impact (how catastrophic it would be if the assumption is wrong). The riskiest assumptions—high uncertainty, high impact—get tested first.

For each risky assumption, you then select the lightest-weight experiment type that can produce a clear signal. A fake-door test might take hours to set up; a concierge test might take a day. The key constraint is that every experiment must have a pre-defined success criterion so the team knows exactly how to interpret the results.

Step-by-Step

  1. Step 1: Select a solution from your Opportunity Solution Tree

    Choose one candidate solution that your team has generated for a prioritized opportunity. If you've followed the Generating Multiple Solutions for Each Opportunity skill, you should have at least three solutions to consider. Pick the one the team is most excited about or the one with the highest expected impact—you'll test its riskiest assumptions before investing build effort.

    Write a one-sentence description of the solution that is specific enough for anyone on the team to understand what you'd actually build. Vague descriptions like 'improve onboarding' produce vague assumptions. Specific descriptions like 'add a 3-step interactive checklist to the first-run experience that guides users through connecting their data source, creating their first report, and inviting a teammate' produce testable assumptions.

    Tip: If you can't describe the solution in one specific sentence, you probably need to break it into smaller solution ideas first.

  2. Step 2: List every assumption behind the solution

    Gather your product trio (product manager, designer, engineer) and spend 10-15 minutes brainstorming every assumption the solution rests on. Use the four-category framework—desirability, usability, feasibility, viability—as prompts.

    For each category, ask: 'What would have to be true for this solution to succeed?' Write each assumption as a declarative statement, e.g., 'Users will notice the checklist in the first 30 seconds,' or 'Our current API can return the required data in under 200ms.' Aim for 8-20 assumptions per solution. If you have fewer than 8, you're probably not thinking critically enough; if you have more than 20, some are likely duplicates or trivially true.

    Don't filter or judge assumptions during brainstorming—capture them all. You'll prioritize in the next step.

    Tip: Engineers often surface feasibility assumptions that PMs and designers miss, and designers often catch usability assumptions that engineers overlook. This is why the product trio is essential for this exercise.

  3. Step 3: Map assumptions on a risk matrix

    Draw a 2×2 matrix with 'Uncertainty' on the x-axis (low to high) and 'Impact if Wrong' on the y-axis (low to high). Place each assumption on the matrix based on a quick team discussion.

    High-uncertainty assumptions are those where you have little or no evidence—you're guessing. High-impact assumptions are those that would kill the solution or require a fundamental redesign if they turn out to be false.

    The top-right quadrant (high uncertainty + high impact) contains your riskiest assumptions. These are the ones you must test before building anything. The bottom-left quadrant (low uncertainty + low impact) can be safely ignored for now. Focus your testing energy on the 2-4 riskiest assumptions; a scoring sketch follows this step.

    Tip: If the team can't agree on where an assumption falls, that disagreement is itself evidence of high uncertainty—move it to the right side of the matrix.
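
A lightweight way to capture the matrix is to score each assumption and sort. Below is a minimal sketch in Python, assuming a simple 1-5 scale on both axes (the scale and the example assumptions are illustrative, not part of the framework):

```python
from dataclasses import dataclass

@dataclass
class Assumption:
    statement: str
    category: str      # desirability | usability | feasibility | viability
    uncertainty: int   # 1 (strong evidence already) to 5 (pure guess)
    impact: int        # 1 (cosmetic if wrong) to 5 (kills the solution)

    @property
    def risk(self) -> int:
        # Product of the two scores: the top-right quadrant scores highest.
        return self.uncertainty * self.impact

assumptions = [
    Assumption("Users will notice the checklist in the first 30 seconds",
               "usability", uncertainty=4, impact=5),
    Assumption("Our current API can return the required data in under 200ms",
               "feasibility", uncertainty=2, impact=5),
    Assumption("Users will invite a teammate during onboarding",
               "desirability", uncertainty=5, impact=4),
]

# Test the riskiest assumptions first; cap the list at 2-4 per this step.
for a in sorted(assumptions, key=lambda a: a.risk, reverse=True)[:4]:
    print(f"[risk {a.risk:2d}] ({a.category}) {a.statement}")
```

The product heuristic is one option; sorting by the tuple (uncertainty, impact) works just as well, and the team discussion the scoring forces is more valuable than the numbers themselves.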

  4. Step 4: Choose the right experiment type for each risky assumption

    For each of the 2-4 riskiest assumptions, select the lightest-weight experiment that can produce a clear signal. Common experiment types include:

    • Fake-door test: Show users a button, link, or feature announcement for something that doesn't exist yet. Measure click-through or sign-up rate. Best for desirability assumptions.
    • Prototype test: Build a clickable mockup (Figma, paper, etc.) and run 5-8 moderated usability sessions. Best for usability and desirability assumptions.
    • Concierge test: Manually deliver the value the solution would automate. Best for desirability and viability assumptions (you learn both whether users want it and what it actually costs to deliver).
    • Wizard of Oz test: Present an automated-looking interface, but handle requests manually behind the scenes. Best for desirability + feasibility assumptions simultaneously.
    • Data analysis / spike: Query existing product data or build a quick technical proof-of-concept. Best for feasibility assumptions.
    • Survey or one-question poll: Ask a targeted question to existing users. Best for validating desirability assumptions when you need quantitative signal quickly.

    Match the experiment type to the assumption category. Don't use a prototype to test a feasibility assumption, and don't use a technical spike to test desirability.

    Tip: Default to the experiment that takes the least time and money. If a 2-hour fake-door test can answer the question, don't build a 2-week prototype. The sketch below makes this cheapest-first ordering concrete.
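
One way to make the cheapest-first rule concrete is a lookup table of candidate experiments per assumption category, ordered by rough cost. A sketch; the effort estimates are ballpark assumptions for illustration only:

```python
# Candidate experiments per assumption category, cheapest first.
# Effort estimates are rough assumptions, not fixed figures.
EXPERIMENTS = {
    "desirability": [("fake-door test", "hours"),
                     ("one-question poll", "hours"),
                     ("concierge test", "about a day"),
                     ("Wizard of Oz test", "days")],
    "usability":    [("prototype test", "1-2 days")],
    "feasibility":  [("data analysis / spike", "hours to days"),
                     ("Wizard of Oz test", "days")],
    "viability":    [("concierge test", "about a day"),
                     ("data analysis / spike", "hours to days")],
}

def cheapest_experiment(category: str) -> str:
    name, effort = EXPERIMENTS[category][0]  # lists are ordered cheapest first
    return f"{name} (~{effort})"

print(cheapest_experiment("desirability"))  # fake-door test (~hours)
```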

  5. Step 5: Define success criteria before running the experiment

    For each experiment, write down—before you start—what result would make you confident the assumption is true, what result would make you confident it's false, and what result would be ambiguous.

    Be specific. Instead of 'Users are interested,' write 'At least 15% of users who see the fake door click through.' Instead of 'The prototype is usable,' write 'At least 4 out of 6 participants complete the task without assistance.'

    Also define the timeline. Every experiment should have a timebox—typically 1-5 days for lightweight tests. If your experiment takes longer than a sprint, it's probably too heavy.

    Tip: Write your success criteria in a shared document and have the full product trio sign off before running the experiment. This prevents post-hoc rationalization where the team reinterprets ambiguous results as a pass. The sketch after this step shows one way to structure such a record.
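
Pre-registered criteria can be as simple as a frozen record with an explicit pass threshold, an explicit fail threshold, and an ambiguous zone in between. A minimal sketch; the field names and the 5% fail threshold are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: criteria must not change after sign-off
class SuccessCriteria:
    metric: str
    pass_at: float      # at or above this value, the assumption passes
    fail_below: float   # below this value, it clearly fails
    timebox_days: int   # every experiment gets a timebox

# The fake-door example from this step: pass at 15%, ambiguous between 5% and 15%.
fake_door = SuccessCriteria(
    metric="click-through rate on the fake-door banner",
    pass_at=0.15,
    fail_below=0.05,
    timebox_days=5,
)
```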

  6. Step 6: Run the experiment and collect evidence

    Execute the experiment according to your plan. During execution, resist the urge to change the experiment mid-stream unless you discover a critical flaw in the setup. Changing variables mid-experiment invalidates your results.

    Document everything: raw data, participant quotes, click counts, technical findings. You want your evidence to be reviewable by anyone on the team, not just the person who ran the test.

    If you're running moderated sessions (e.g., prototype tests), record them with participant consent. If you're running quantitative tests (e.g., fake doors), screenshot the analytics dashboard at the end of the timebox.

    Tip: Assign one person to be the 'experiment owner' responsible for setup, execution, and documentation. Shared ownership often means no one actually runs the test.

  7. Step 7: Interpret results and make a decision

    Compare your results against the pre-defined success criteria. There are three possible outcomes:

    1. Assumption validated: The evidence supports the assumption. Move on to the next riskiest assumption or, if all critical assumptions are validated, move the solution into delivery.
    2. Assumption invalidated: The evidence contradicts the assumption. Either kill the solution, pivot it to address the failed assumption, or return to the Opportunity Solution Tree and explore alternative solutions for the same opportunity.
    3. Results ambiguous: The data doesn't clearly support or refute the assumption. Design a different, more targeted experiment, or accept the uncertainty and move to the next assumption.

    Update your Opportunity Solution Tree with the results. Each experiment node should show what was tested, the result, and the decision made. This creates an auditable trail of learning that informs future discovery work and is invaluable when pursuing a product manager certification that emphasizes evidence-based practice.

    Tip: Ambiguous results are the most dangerous outcome because teams tend to interpret them optimistically. If results are ambiguous, default to 'not yet validated' rather than 'good enough.' The sketch below encodes this default.
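
The three outcomes map directly onto a small decision function. A sketch under the same pre-registration idea as the Step 5 record (the thresholds and recommended actions are illustrative):

```python
def interpret(observed: float, pass_at: float, fail_below: float) -> tuple[str, str]:
    """Compare an observed metric against pre-registered thresholds."""
    if observed >= pass_at:
        return ("validated", "test the next riskiest assumption or move to delivery")
    if observed < fail_below:
        return ("invalidated", "kill, pivot, or explore alternative solutions")
    # The dangerous middle: default to 'not yet validated', per the tip above.
    return ("ambiguous", "design a more targeted follow-up experiment")

# An 11.2% result against a 15% pass / 5% fail pre-registration (illustrative):
print(interpret(0.112, pass_at=0.15, fail_below=0.05))
# ('ambiguous', 'design a more targeted follow-up experiment')
```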

Examples

Example: Testing a Smart Notification Feature for a Project Management Tool

A product trio at a SaaS project management company has identified an opportunity: 'Project managers miss critical updates because they're buried in a noisy notification feed.' They've generated a solution: 'AI-powered smart notifications that surface only high-priority updates based on the user's role and recent activity.' The team needs to validate this before committing a full quarter of engineering time.

The team lists 12 assumptions. After mapping them on the risk matrix, four stand out as high-risk:

  1. Desirability: Project managers will trust an AI to filter their notifications (high uncertainty—users have shown distrust of automated filtering in past research).
  2. Feasibility: The existing activity data is sufficient to build a useful priority model (uncertain—the data team hasn't audited this).
  3. Usability: Users will understand why certain notifications were surfaced and others weren't (uncertain—transparency in AI filtering is notoriously hard).
  4. Viability: The ML infrastructure costs will stay within the per-user margin (uncertain—no cost estimate exists).

For assumption #1, they run a concierge test: a designer manually curates a 'smart' notification digest for 8 power users over 5 days and collects daily feedback. Success criterion: at least 6 of 8 users rate the curated digest as more useful than the current feed.

For assumption #2, an engineer runs a data spike: query the activity logs for 100 projects and assess whether role-based activity patterns are detectable. Success criterion: the engineer can manually classify at least 80% of notifications as high/low priority using available data fields.

Results after one week: The concierge test passes (7/8 users prefer the curated digest), but the data spike reveals that role data is missing for 40% of accounts. The team pivots the solution to use explicit user preferences instead of role inference, then designs a follow-up prototype test for the updated approach. The Opportunity Solution Tree is updated with experiment nodes showing what was learned.
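
The data spike in this example reduces to a coverage query. A hypothetical sketch of what the engineer might run; the database file, table, and column names (activity_log, projects, actor_role) are invented for illustration:

```python
import sqlite3

# Hypothetical schema: activity_log(project_id, actor_role, event_type, ...)
conn = sqlite3.connect("analytics.db")
total, with_role = conn.execute("""
    SELECT COUNT(*),
           SUM(CASE WHEN actor_role IS NOT NULL THEN 1 ELSE 0 END)
    FROM activity_log
    WHERE project_id IN (SELECT id FROM projects ORDER BY RANDOM() LIMIT 100)
""").fetchone()

print(f"Role data present for {with_role / total:.0%} of events")
# Pre-registered criterion: >= 80% of notifications classifiable. The spike
# here found role data missing for 40% of accounts (~60% coverage), a fail.
```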

Example: Fake-Door Test for a Collaborative Budgeting Feature

A fintech product team has an opportunity: 'Couples struggle to align on monthly spending priorities.' One proposed solution is a shared budgeting workspace where both partners can set and negotiate spending limits per category. The team suspects desirability is the riskiest assumption—couples may not want to share financial details within the app.

The team designs a fake-door test. They add a banner to the dashboard of existing users who have linked accounts: 'New: Share your budget with your partner. Get started →'. Clicking the banner leads to a waitlist page that says 'Coming soon—join the waitlist to be the first to try Shared Budgets.'

Success criterion (defined before launch): At least 8% of users who see the banner click through, and at least 30% of those who click submit their email for the waitlist.

After 5 days and 2,400 banner impressions: 11.2% clicked through and 42% submitted an email. The desirability assumption is validated. The team moves to the next riskiest assumption—usability—and designs a prototype test of the budget negotiation workflow using Figma. This disciplined, step-by-step approach to assumption testing is a hallmark of the product manager certification skill set and keeps the team anchored to the Opportunity Solution Tree throughout discovery.
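
It's worth checking how strong that signal is. A quick sketch using the numbers reported above, with a normal-approximation confidence interval; the raw counts are back-calculated from the reported rates, so they're approximate:

```python
import math

impressions = 2400
clicks = round(0.112 * impressions)   # ~269 clicks, implied by the 11.2% CTR
emails = round(0.42 * clicks)         # ~113 signups, implied by the 42% rate

def ci95(successes: int, n: int) -> tuple[float, float]:
    """95% normal-approximation interval for a proportion."""
    p = successes / n
    half = 1.96 * math.sqrt(p * (1 - p) / n)
    return (p - half, p + half)

low, high = ci95(clicks, impressions)
print(f"CTR: {clicks / impressions:.1%}, 95% CI [{low:.1%}, {high:.1%}]")
# Even the low end of the interval (~9.9%) clears the 8% pass criterion,
# so the desirability assumption passes with room to spare.
```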

Best Practices

  • Test assumptions in parallel when possible: run a fake-door test for desirability while an engineer runs a feasibility spike, so you compress the learning timeline from weeks to days.

  • Keep experiments atomic—test one assumption per experiment. Bundling multiple assumptions into one test makes it impossible to know which assumption failed if results are negative.

  • Always define success criteria before running the experiment and document them in a shared space visible to the entire product trio and stakeholders.

  • Timebox every experiment. If a test can't produce a signal within one week, redesign it to be lighter-weight or split the assumption into smaller, more testable sub-assumptions.

  • Create an assumption testing log (spreadsheet or Notion table) that tracks solution, assumption, experiment type, success criteria, result, and decision. This becomes institutional knowledge for the team (see the sketch after this list).

  • Review failed experiments with curiosity, not blame. A failed assumption is a successful experiment—it saved weeks or months of wasted build time.
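
If a spreadsheet feels too loose, the same log can live in version control as structured data. A minimal sketch; the columns mirror the bullet above, and the example row is hypothetical:

```python
import csv

FIELDS = ["solution", "assumption", "experiment_type",
          "success_criteria", "result", "decision"]

row = {
    "solution": "smart notification digest",
    "assumption": "PMs will trust AI-filtered notifications",
    "experiment_type": "concierge test",
    "success_criteria": ">= 6 of 8 users prefer the curated digest",
    "result": "7 of 8 preferred it",
    "decision": "validated; proceed to feasibility spike",
}

with open("assumption_log.csv", "a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    if f.tell() == 0:  # new file: write the header row once
        writer.writeheader()
    writer.writerow(row)
```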

Common Mistakes

Testing the easiest assumptions instead of the riskiest ones

Correction

Always use the uncertainty × impact matrix to prioritize. The goal is to de-risk the solution as fast as possible, which means tackling the scariest assumptions first—even if they're harder to test.

Building an over-engineered experiment that takes as long as building the actual feature

Correction

Constantly ask: 'What is the smallest thing I can do to learn whether this assumption is true?' A 5-user prototype test or a 2-hour fake-door test almost always produces enough signal. Save high-fidelity experiments for later validation stages.

Not defining success criteria before running the experiment, then rationalizing ambiguous results as a pass

Correction

Write specific, quantitative (or clearly observable) success criteria before the experiment begins, and have at least two team members agree on them. This prevents confirmation bias from turning a 6% click rate into 'users are excited.'

Only testing desirability assumptions and ignoring feasibility and viability

Correction

Include your engineer and business stakeholders in the assumption mapping exercise. Technical constraints and business model risks kill solutions just as often as lack of customer demand.

Treating assumption testing as a one-time gate instead of a continuous practice

Correction

New assumptions surface as you learn. After each experiment, check whether the results have introduced new risks. Embed assumption testing into your weekly discovery cadence, not just your project kickoff.

Frequently Asked Questions

How many assumptions should I test before committing to building a solution?

Focus on the 2-4 riskiest assumptions—those that are both highly uncertain and would invalidate the solution if wrong. You don't need to test every assumption, just the ones that represent existential risk to the solution's success.

What's the difference between a fake-door test and a Wizard of Oz test?

A fake-door test measures interest by showing a feature entry point that doesn't lead to a real feature (e.g., a button that goes to a waitlist). A Wizard of Oz test actually delivers the experience, but a human performs the work behind the scenes instead of technology. Fake doors test desirability; Wizard of Oz tests desirability and feasibility together.

How does assumption testing fit into the Opportunity Solution Tree?

In the Opportunity Solution Tree framework, experiments are the leaf nodes beneath each candidate solution. They represent the evidence layer that tells you whether a solution is worth building. Without assumption tests, the tree is just a hypothesis map—with them, it becomes an evidence-based decision tool.

Can assumption testing replace A/B testing after launch?

No. Assumption tests happen before you build, using lightweight experiments to validate whether a solution is worth engineering effort. A/B tests happen after you build, measuring the impact of a shipped feature on real user behavior at scale. Both are valuable at different stages of the product lifecycle.

Is designing assumption tests a skill covered in product manager certification programs?

Yes. Most reputable product manager certification programs—including those focused on continuous discovery—teach assumption mapping and experiment design as core competencies. This skill is central to evidence-based product management and is increasingly expected in PM interviews and portfolio reviews.

How do I convince stakeholders that assumption testing is worth the time?

Frame it as risk reduction, not delay. Show the cost of building a feature that fails (typically weeks or months of engineering time) versus the cost of running a 3-5 day experiment. Most stakeholders respond well to the argument: 'We can spend 3 days learning whether this will work, or 3 months building something that might not.'