Conversion & funnels

Designing an experiment hypothesis

Before running an A/B test you write a hypothesis: a falsifiable statement linking a specific change to an expected effect on a named metric, for a defined audience, with a rationale. A good hypothesis fixes the success metric in advance, which prevents post-hoc metric shopping. This page covers the structure of a hypothesis and the reasoning behind it.

Partially verified

What a hypothesis contains

A complete experiment hypothesis has four parts: the evidence or rationale that motivates it, the specific change being made, the expected directional effect on one primary metric, and the audience it applies to. Stating the direction (increase or decrease) in advance is what makes the test falsifiable.

Without a committed metric, a flat overall result invites cherry-picking a segment that happened to move — which is how false findings enter a roadmap.

Rationale: what evidence motivates the change
Change: the precise variant being shipped
Primary metric and expected direction
Audience: who is eligible for the test

Why fixing the metric matters

Choosing the primary metric before data arrives is the core discipline. If you decide what counts as success only after seeing results, every test 'wins' on something, because random variation guarantees some metric moved. Pre-registration of the metric and direction is the standard guard against this.

Connecting to design choices

The hypothesis drives downstream design: it determines the minimum detectable effect worth chasing, the sample size, and the guardrail metrics you watch for harm. A hypothesis that targets a tiny effect on a low-traffic page may be impractical to test at all — better to learn that before launching.

How it appears in analytics and logs

A vague hypothesis ('let's try a new button') gives no way to interpret a flat or negative result. A specific one names the metric and direction in advance, so the readout is unambiguous.

Diagnostic use case

Write each experiment as 'because [evidence], changing [X] will [increase/decrease] [metric] for [audience]' so the test can falsify a claim, not just observe.

What WebmasterID can help detect

WebmasterID's first-party events let you define the success metric for a hypothesis from real funnel steps, so the metric you commit to is one you can actually measure cleanly.

Common mistakes

Writing 'let's try X' with no metric or expected direction.
Choosing the success metric after seeing the results.
Stating a hypothesis you cannot power given available traffic.

Privacy and accuracy notes

Hypotheses describe aggregate behaviour, not individuals. Designing a test needs no personal data — only the metric definition and the audience segment.

↑ All conversion topics in Conversion & funnels

Sources and verification notes

Wikipedia — Statistical hypothesis testingFalsifiable hypotheses and pre-specified metrics.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.