LAUNCH EXPERIMENT PROGRAM — $7,997 · 6-week engagement
Jake McMahon · ProductQuant
A 6-week engagement that takes your team from 0–2 tests per quarter to a running experiment program — with the metrics architecture, infrastructure, hypothesis backlog, and Notion OS that keeps it running after the engagement ends.
Running experiment program with at least one test live at the end · full refund guarantee
WHAT YOU HAVE AT THE END
$7,997 · fixed price · 6-week engagement
What happens when experimentation runs on enthusiasm instead of infrastructure
The roadmap is driven by the loudest voice in the room, the most recent customer complaint, or the founder’s intuition. Everyone agrees that experiments would help. Nobody has built the infrastructure to run them systematically. The gap between “wanting to experiment” and “running experiments” stays open for quarters.
“We keep saying we’ll be more data-driven this quarter. We keep making product decisions the same way we always have.”
Tests get called early when interim data looks promising. Results get challenged because the primary metric was never agreed upfront. Post-test debate — where everyone cites a different metric that supports their pre-existing view — absorbs more time than the test itself. The experiment program dies because it produces controversy, not decisions.
“We ran the test. It ‘won’ on the metric we tracked. Engineering shipped it. Retention dropped. Nobody can explain why.”
The team has the instinct to test but not the framework. What is the right sample size? How long should the test run? How do we score hypotheses against each other? What happens if the test reaches significance early? Every experiment starts from scratch because there is no repeatable process — and the answers require a data scientist who is not in the room.
“We have good ideas. We just can’t run them rigorously enough for the results to mean anything.”
A previous growth PM set up a testing process. It lived in their head and one spreadsheet. They left. The spreadsheet stopped being updated. Within a quarter, the team was back to shipping features based on intuition, and nobody could find the experiment log. Programs that live in one person’s head are not programs — they are single points of failure.
“We had a testing process. Then our growth lead left. Now we have a spreadsheet nobody updates.”
WHY THIS IS DIFFERENT
Most experiment programs die because they are built on enthusiasm, not infrastructure.
A typical approach to starting an experiment program: a growth PM builds a spreadsheet, writes some hypothesis templates, and runs the first two tests. The first quarter looks promising. By quarter three, the program has stalled — because the primary metric was never agreed, the hypothesis scoring was informal, and the process lived in one person’s head rather than a system the whole team can use.
The engagement builds the infrastructure that keeps the program running after it ends. The metrics hierarchy is agreed in writing before any test launches. The sample size calculator removes the “how long do we run this?” question permanently. The Notion OS is institutional — owned by the team, not by the person who built it. The hypothesis backlog gives you 20+ scored, structured test ideas, so the program is never stalled for lack of something to test.
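The “how long do we run this?” arithmetic behind a sample size calculator is standard two-proportion power analysis. A minimal sketch — not the engagement’s actual calculator; the function name, defaults, and example numbers are illustrative:

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.8):
    """Approximate users needed per variant for a two-proportion test.

    baseline: current conversion rate (e.g. 0.10 for 10%)
    mde: minimum detectable effect, absolute (0.02 means 10% -> 12%)
    """
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    pooled = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * pooled * (1 - pooled))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)
```

For example, detecting a lift from a 10% baseline to 12% at α = 0.05 and 80% power needs roughly 3,800–3,900 users per arm — which, divided by weekly traffic to the tested flow, is the expected runtime. Fixing that number before launch is what removes the temptation to call the test early.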
Six weeks from now, the program is running. One test is live. The team knows how to spec the next one without Jake in the room. The HiPPO no longer wins product debates by default — because there is a process for running the test that would settle it.
WHAT YOU GET
The North Star metric, leading indicators, and guardrail metrics agreed in writing before any test runs. The document that ends post-test debates — because every metric that matters was defined before the experiment launched, not after the results came in.
Your analytics stack assessed against what the experiment program needs. Which events are reliable, which are broken, and which critical behaviours have no tracking at all — before experiments are designed around data that cannot support them.
The first five experiment designs, fully specced and ready to run — not a list of ideas but structured test designs with the hypothesis, primary metric, guardrail metrics, sample size, expected runtime, and ship/no-ship criteria already defined.
The Notion workspace built, populated, and handed over as the institutional home for the experiment program. Not a template — a pre-populated workspace with the real experiment data from the engagement already inside it.
20+ scored, structured test ideas ready to queue — plus the prioritisation framework for evaluating new ideas as they come in so the backlog never runs dry and the HiPPO dynamic is replaced by evidence-based scoring.
A 2-hour working session with your product and growth team to walk through the metrics hierarchy, the experiment design process, the Notion OS, and the hypothesis scoring framework. The session is designed so anyone on the team can run the next test without Jake in the room.
A sequenced 6-month experiment calendar built from the hypothesis backlog — which tests to run, in what order, and why. Sequenced so each test builds on the last: early tests answer foundational questions, later tests optimise based on what the program learned.
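The scored backlog above implies a scoring model. The page does not name the exact framework, so as an assumption here is a sketch using ICE (Impact × Confidence × Ease), one common choice for evidence-based prioritisation — the hypothesis names and scores are illustrative:

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    impact: int      # 1-10: expected effect on the North Star metric
    confidence: int  # 1-10: strength of the supporting evidence
    ease: int        # 1-10: inverse of build-and-run effort

    @property
    def ice(self) -> float:
        # Multiplicative score: a zero on any axis sinks the idea
        return (self.impact * self.confidence * self.ease) / 10


backlog = [
    Hypothesis("Shorter onboarding flow", impact=8, confidence=6, ease=5),
    Hypothesis("Pricing page social proof", impact=5, confidence=7, ease=9),
]

# Queue order falls out of the scores, not out of seniority
for h in sorted(backlog, key=lambda h: h.ice, reverse=True):
    print(f"{h.ice:5.1f}  {h.name}")
```

The point of any such framework is less the formula than the ritual: every new idea gets scored on the same axes before it enters the queue, so the debate moves from opinions to inputs.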
THE TIMELINE
Instrumentation audit completed and scored. Metrics hierarchy documented — North Star, driver metrics, and guardrails agreed with leadership sign-off. The pre-work that removes every post-test debate before the first test launches.
First 5 experiment designs fully specced. First test launched and running. Notion OS built and populated with the first experiments, results library, weekly review agenda, and decision record. The institutional home for the program is live and owned by your team.
20+ hypothesis backlog delivered and scored. 6-month experiment calendar built from the backlog. 2-hour team training session completed — walkthrough of the metrics hierarchy, experiment design process, hypothesis scoring, and Notion OS. The team leaves the session able to run the next test independently.
WHO THIS IS FOR
Not sure if your analytics are ready to support a testing program? The instrumentation audit in Week 1 will tell you what gaps need addressing before the first test runs — including whether those gaps are blocking or just advisory.
WHO’S DOING THE WORK
Jake McMahon — ProductQuant
I run this engagement myself. Eight years as a product and growth lead inside B2B SaaS, watching smart teams make the same mistake: good tools, good instincts, no system. Experiment programs that live in one spreadsheet and one person’s head are not programs — they are single points of failure. The Notion OS and the metrics hierarchy are designed specifically to remove that dependency.
The most common place experiment programs break is the metric agreement step. Everyone has a view on the primary metric. Getting leadership sign-off on one number before the test runs — and keeping that number fixed when the results come in — is the constraint the engagement is built around. That problem is pre-hypothesis and pre-infrastructure. Fixing it first is what makes everything downstream work.
Teams Jake has worked with
PRICING
Guarantee: A running experiment program with at least one test live at the end of the engagement, or a full refund. The program exists and is operating — or you don’t pay.
Six weeks from now your team has the metrics architecture, the infrastructure, and the hypothesis backlog to run experiments systematically — and one test already live to prove it.