Conversion & funnels

The peeking problem in A/B tests

The peeking problem is checking an experiment over and over and stopping the moment it crosses significance. Because each look is another chance for noise to cross the threshold, repeated peeking inflates the false-positive rate well above the nominal level. The fixes are a pre-set sample size or a sequential method designed for continuous monitoring.

Verified against primary sources

What this means

Classical significance tests assume you decide the sample size in advance and look once at the end. Peeking breaks that assumption: every time you check and reserve the right to stop, you give noise another opportunity to wander across the threshold. Do this many times and the probability of a false 'win' climbs far above the 5% you thought you were controlling.
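The inflation is easy to see in a simulation. The sketch below is a minimal illustration, not a production tool: it runs A/A experiments (both arms share the same true conversion rate, so every "win" is a false positive) and compares a single look at the end against stopping at the first significant result among ten evenly spaced looks. The function names, the 10% conversion rate, the sample size, and the number of trials are all assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or only conversions) so far: nothing to test
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(looks, n_per_arm=2000, rate=0.10, alpha=0.05,
                        trials=1000, seed=0):
    """Share of A/A experiments declared significant at any of `looks` checks."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    hits = 0
    for _ in range(trials):
        conv_a = conv_b = 0
        for look in range(1, looks + 1):
            # Add one more batch of visitors per arm; both arms have the same true rate.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            n = look * step
            if z_test_p_value(conv_a, n, conv_b, n) < alpha:
                hits += 1
                break  # the experimenter stops at the first "win"
    return hits / trials


single_look = false_positive_rate(looks=1)
ten_looks = false_positive_rate(looks=10)
print(f"one look:  {single_look:.3f}")
print(f"ten looks: {ten_looks:.3f}")
```

With one look the rate should land near the nominal 5%; with ten looks and early stopping it should come out several times higher, even though nothing about the two arms differs.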

How to avoid it

The simplest remedy is to set the sample size up front and draw a conclusion only once you reach it. If you genuinely need to monitor continuously, for example to stop a harmful change early, use a sequential testing method (such as alpha-spending or always-valid inference) that is mathematically designed to allow repeated looks without inflating the error rate. What you must not do is run an ordinary fixed-horizon test and stop at the first green result.
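Deciding the sample size in advance usually means a power calculation. The sketch below uses the standard normal-approximation formula for a two-sided two-proportion z-test; the function name `sample_size_per_arm`, the 10% baseline, the 2-point absolute lift, and the 80% power target are illustrative assumptions, not recommendations.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, lift_abs, alpha=0.05, power=0.80):
    """Visitors needed in each arm to detect an absolute lift of `lift_abs`."""
    p1 = baseline
    p2 = baseline + lift_abs
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


n = sample_size_per_arm(baseline=0.10, lift_abs=0.02)
print(f"run until each arm has {n} visitors, then test once")
```

The smaller the lift you want to detect, the larger the required sample, so the minimum effect of interest has to be chosen before the test starts rather than after a promising peek.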

Peeking is one of the most common reasons A/B 'wins' fail to replicate.

Each extra look adds a chance for noise to cross
Fixed-horizon tests assume one look at the end
Sequential methods are built for continuous monitoring

How it appears in analytics and logs

If a 'significant' result came from peeking, its significance is overstated: the reported p-value understates the real chance of a false positive, so the win may not be real.

Diagnostic use case

Do not stop a fixed-horizon test early because one of several repeated checks happened to look significant; either run to the planned sample or use a method built for sequential looks.
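As one example of a method built for sequential looks, the sketch below computes per-look p-value thresholds from an O'Brien-Fleming-type alpha-spending function. It is a deliberately conservative simplification: each look is allowed only the alpha newly spent since the previous look, which keeps the overall false-positive rate at or below alpha by the union bound but is stricter than the exact boundaries a dedicated group-sequential package would compute. The function names and the four-look plan are assumptions made for illustration.

```python
from statistics import NormalDist

_norm = NormalDist()


def obf_alpha_spent(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (0 < t <= 1),
    using an O'Brien-Fleming-type spending function."""
    z = _norm.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _norm.cdf(z / t ** 0.5))


def per_look_thresholds(fractions, alpha=0.05):
    """Nominal p-value threshold for each planned look.

    Each look gets only the alpha spent since the previous look, so the
    thresholds sum to alpha when the last fraction is 1.0.
    """
    thresholds = []
    spent_before = 0.0
    for t in fractions:
        spent_now = obf_alpha_spent(t, alpha)
        thresholds.append(spent_now - spent_before)
        spent_before = spent_now
    return thresholds


looks = [0.25, 0.50, 0.75, 1.00]  # hypothetical plan: four evenly spaced looks
thresholds = per_look_thresholds(looks)
for t, thr in zip(looks, thresholds):
    print(f"at {t:.0%} of planned sample: stop only if p < {thr:.5f}")
```

Early looks receive very little alpha, so only an extreme early result can stop the test, while most of the error budget is kept for the final analysis. The looks and their timing must still be planned before the experiment starts.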

What WebmasterID can help detect

WebmasterID reports the first-party conversion counts you monitor; the discipline of deciding when to stop the test stays with you.

Common mistakes

Stopping a fixed-horizon test at the first significant peek.
Treating a peeked p-value as if it controlled error correctly.
Monitoring continuously without a sequential method.

Privacy and accuracy notes

Peeking is a procedural pitfall over aggregate counts; it involves no personal data. WebmasterID supplies the first-party counts being monitored.


Sources and verification notes

NIST/SEMATECH e-Handbook: Sequential and repeated testing concepts. Background on why repeated monitoring requires adjusted procedures.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.