Sample size in experiments

Sample size is the number of subjects per arm an experiment needs to detect a chosen effect with acceptable error rates. It is computed in advance from the baseline rate, the minimum effect worth detecting, and the false-positive and false-negative rates you accept. A sample that is too small misses real effects, while running until the result 'looks good' inflates false positives.

Verified against primary sources

What this means

Sample size is how many subjects per variant you need before the experiment can reliably tell signal from noise. It depends on four things: the baseline conversion rate, the smallest effect you care about (the minimum detectable effect), the significance level (false-positive rate), and the power (one minus the false-negative rate). Fix those and the required size follows.
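To make the four inputs concrete, here is a minimal sketch using a common normal-approximation formula for a two-sided comparison of two proportions. The formula choice, the function name `sample_size_per_arm`, and the default values of 0.05 for significance and 0.80 for power are assumptions for illustration; the source does not prescribe a specific method, and dedicated power-analysis tools may give slightly different numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided test of two proportions.

    baseline: control conversion rate, e.g. 0.10 (assumed input)
    mde_abs:  minimum detectable effect as an absolute lift, e.g. 0.02
    alpha:    significance level (accepted false-positive rate)
    power:    one minus the accepted false-negative rate
    """
    p1 = baseline
    p2 = baseline + mde_abs
    if mde_abs == 0 or not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must lie strictly between 0 and 1, and the effect must be non-zero")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-arm Bernoulli variances

    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline, 2-point absolute lift.
    print(sample_size_per_arm(0.10, 0.02))
```

Fixing all four inputs yields a single number per arm, which is the sense in which the required size "follows" from them.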

Why plan it first

If the sample is too small, the test is underpowered: a real effect can be present yet go undetected, and a null result means little. If instead you run with no target and stop when the numbers look good, you are peeking, which inflates false positives. Computing the size up front gives a clear stopping point and an honest interpretation either way.
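The effect of peeking can be illustrated with a small A/A simulation, where both arms share the same true rate and every "significant" result is therefore a false positive. This is a sketch under stated assumptions: the rate, sample size, number of looks, and the pooled two-proportion z-test are all illustrative choices, not details from the source.

```python
import random
from statistics import NormalDist


def z_stat(conv_a, conv_b, n):
    """Pooled two-proportion z statistic for two arms of equal size n."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:
        return 0.0
    return (conv_b - conv_a) / n / se


def false_positive_rates(rate=0.2, n_per_arm=1000, looks=10, sims=1000,
                         alpha=0.05, seed=1):
    """Return (peeking_rate, fixed_rate) for an A/A test with no true effect.

    peeking_rate: share of runs that look significant at ANY interim look
    fixed_rate:   share of runs that are significant at the planned end only
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    step = n_per_arm // looks  # assumes n_per_arm is divisible by looks
    peeking_hits = fixed_hits = 0

    for _ in range(sims):
        conv_a = conv_b = 0
        any_significant = False
        final_significant = False
        for i in range(1, n_per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if i % step == 0:
                significant = abs(z_stat(conv_a, conv_b, i)) > z_crit
                any_significant = any_significant or significant
                if i == n_per_arm:
                    final_significant = significant
        peeking_hits += any_significant
        fixed_hits += final_significant

    return peeking_hits / sims, fixed_hits / sims


if __name__ == "__main__":
    peeking, fixed = false_positive_rates()
    print(f"stop at first significant look: {peeking:.3f}")
    print(f"single look at planned size:    {fixed:.3f}")
```

Because the final look is one of the looks, the peeking rate can never be lower than the fixed-size rate, and with several looks it is typically well above the nominal significance level.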

Smaller effects need larger samples. The required size grows roughly with the inverse square of the effect, so halving the minimum detectable effect roughly quadruples the traffic needed.
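One way to see why small effects are expensive is a common normal-approximation formula for comparing two proportions. It is an assumption here, since the source does not commit to a specific formula. The symbols are: p_1 for the baseline rate, p_2 for the rate under the variant, δ for the minimum detectable effect, α for the significance level, and 1 − β for the power.

```latex
n \approx \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\,\bigl[p_1(1-p_1) + p_2(1-p_2)\bigr]}{\delta^{2}},
\qquad \delta = p_2 - p_1
```

The effect appears squared in the denominator. As an illustration with hypothetical inputs (10% baseline, α = 0.05, power 0.80), an absolute lift of 2 points needs about 3,840 subjects per arm under this formula, while a lift of 1 point needs about 14,750.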

Sample size depends on baseline rate, minimum detectable effect, significance level, and power.
Underpowered tests miss real effects.
A pre-set sample size guards against result-peeking.

How it appears in analytics and logs

A sample-size calculation tells you the traffic an experiment needs to reliably detect the effect you care about. Falling short means a 'no difference' result may simply be an underpowered test.

Diagnostic use case

Compute the required sample size before launching so you know how long to run and avoid both underpowered tests and result-peeking.
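Turning a required sample size into a run length is simple arithmetic, sketched below. The function name `days_to_run` and the traffic figures are hypothetical, and the sketch assumes traffic is split evenly across arms and stays roughly constant from day to day.

```python
from math import ceil


def days_to_run(n_per_arm, arms, daily_eligible_visitors):
    """Whole days needed to reach the required sample in every arm.

    Assumes an even split across arms and steady daily traffic.
    """
    if n_per_arm <= 0 or arms <= 0 or daily_eligible_visitors <= 0:
        raise ValueError("all inputs must be positive")
    return ceil(n_per_arm * arms / daily_eligible_visitors)


if __name__ == "__main__":
    # Hypothetical: 3,839 subjects per arm, 2 arms, 500 eligible visitors per day.
    print(days_to_run(3839, 2, 500))  # 7,678 subjects at 500 per day -> 16
```

Knowing the run length before launch gives a fixed stopping point, which is what keeps the test from being cut short or stopped on a good-looking interim result.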

What WebmasterID can help detect

WebmasterID's first-party conversion counts give you the baseline rate a sample-size calculation starts from.
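As a sketch of how aggregate counts become a baseline rate, the snippet below divides conversions by visitors and adds a Wilson score interval to show how uncertain that baseline is. The counts are hypothetical, the function is not part of any WebmasterID API, and the Wilson interval is an illustrative choice rather than something the source specifies.

```python
from statistics import NormalDist


def baseline_with_interval(conversions, visitors, confidence=0.95):
    """Return (rate, low, high): observed rate plus a Wilson score interval."""
    if visitors <= 0 or not 0 <= conversions <= visitors:
        raise ValueError("need visitors > 0 and 0 <= conversions <= visitors")

    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = conversions / visitors
    denom = 1 + z ** 2 / visitors
    center = (p + z ** 2 / (2 * visitors)) / denom
    half = z * (p * (1 - p) / visitors + z ** 2 / (4 * visitors ** 2)) ** 0.5 / denom
    return p, center - half, center + half


if __name__ == "__main__":
    # Hypothetical aggregate counts for the period before the experiment.
    rate, low, high = baseline_with_interval(conversions=120, visitors=2400)
    print(f"baseline {rate:.3f}, plausible range {low:.3f} to {high:.3f}")
```

A wide interval suggests the baseline itself is uncertain, in which case it may be worth checking the sample-size calculation at both ends of the range.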

Common mistakes

Launching without computing the required sample size.
Stopping at an arbitrary point because the result looks good.
Expecting to detect tiny effects with little traffic.

Privacy and accuracy notes

Sample-size planning uses aggregate rates, not personal data. WebmasterID supplies the first-party counts the calculation needs.

Sources and verification notes

NIST/SEMATECH e-Handbook — Sample sizes for hypothesis tests

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.