Confidence intervals for conversion metrics
A confidence interval gives a range of plausible values for a metric rather than a single point. A 95% confidence interval is constructed so that, over many repeats, the procedure captures the true value 95% of the time. Reporting an interval communicates uncertainty honestly: a 4% conversion rate with a wide interval is a very different claim from the same rate with a narrow one.
What this means
A confidence interval is a range around an estimate that reflects sampling uncertainty. With a small sample the interval is wide; with a large sample it narrows. A 95% confidence level refers to the procedure: if you repeated the sampling many times, about 95% of the intervals built this way would contain the true value.
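As a minimal sketch of how sample size drives interval width, the snippet below computes a Wilson score interval (one common choice for proportions, not the only one) for the same observed 4% rate at three sample sizes. The function name `wilson_interval` and the counts are illustrative assumptions, not part of any WebmasterID API.

```python
import math


def wilson_interval(conversions, trials, z=1.96):
    """Wilson score interval for a proportion; z=1.96 is roughly a 95% level."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = conversions / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


# Hypothetical counts: the same 4% observed rate at three sample sizes.
for conversions, trials in [(4, 100), (40, 1000), (400, 10000)]:
    low, high = wilson_interval(conversions, trials)
    print(f"{conversions}/{trials}: {low:.4f} to {high:.4f} (width {high - low:.4f})")
```

The width shrinks as the sample grows even though the point estimate stays at 4%.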
Reading it correctly
A single conversion percentage hides how much it could move on a different sample; the interval makes that visible. It is wrong to say 'there is a 95% probability the true value is in this particular interval' — the level describes the long-run behaviour of the method, not one interval. When two variants' intervals overlap heavily, be cautious about declaring a winner.
Intervals and significance tests are two views of the same uncertainty; report the interval so the size of the effect is visible, not just whether it crossed a threshold.
- Wider interval = more uncertainty (often a smaller sample)
- 95% describes the procedure, not one interval
- Heavily overlapping intervals warn against calling a winner
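The point that 95% describes the procedure can be illustrated with a small simulation. This is a sketch under stated assumptions: a true rate of 4% that we pretend to know, 2,000 visitors per sample, and a simple normal-approximation (Wald) interval. The names `wald_interval` and `simulate_coverage` are hypothetical helpers written for this example.

```python
import math
import random


def wald_interval(conversions, trials, z=1.96):
    """Normal-approximation interval for a proportion (assumed method)."""
    p = conversions / trials
    half = z * math.sqrt(p * (1 - p) / trials)
    return p - half, p + half


def simulate_coverage(true_rate=0.04, trials=2000, repeats=1000, seed=1):
    """Share of simulated intervals that contain the known true rate."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Draw one sample: each visitor converts with probability true_rate.
        conversions = sum(rng.random() < true_rate for _ in range(trials))
        low, high = wald_interval(conversions, trials)
        if low <= true_rate <= high:
            hits += 1
    return hits / repeats


coverage = simulate_coverage()
print(f"share of intervals containing the true rate: {coverage:.3f}")
```

Each individual interval either contains the true rate or does not; the share that do, across many repeats, should land near 0.95 (approximately, since the normal approximation is not exact).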
How it appears in analytics and logs
A wide interval means the estimate is uncertain; a narrow one means it is precisely estimated. Overlapping intervals between two variants warn that an apparent difference may be noise.
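Checking whether two separate intervals overlap is a rough guide. One common alternative is to put an interval around the difference between the two rates and see whether it includes zero. The sketch below assumes a normal-approximation interval and uses made-up counts for variants A and B; `difference_interval` is a hypothetical helper, not a library function.

```python
import math


def difference_interval(conv_a, n_a, conv_b, n_b, z=1.96):
    """Normal-approximation interval for (rate of B minus rate of A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


# Hypothetical counts: 4.0% for A and 4.6% for B, 5,000 visitors each.
low, high = difference_interval(conv_a=200, n_a=5000, conv_b=230, n_b=5000)
print(f"difference (B - A): {low:.4f} to {high:.4f}")
if low <= 0 <= high:
    print("interval includes zero: the apparent lift may be noise")
else:
    print("interval excludes zero: the difference is unlikely to be noise alone")
```

With these counts the interval spans zero, so the 0.6-point lift should not be declared a win; the same rates on a much larger sample could give an interval that excludes zero.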
Diagnostic use case
Report a confidence interval alongside any conversion estimate so readers see the uncertainty, especially when samples are small.
What WebmasterID can help detect
WebmasterID's first-party conversion counts are the raw input from which you can compute an interval around any rate.
Common mistakes
- Reporting a point estimate with no interval.
- Saying the true value has a 95% chance of being in one interval.
- Declaring a winner when intervals overlap heavily.
Privacy and accuracy notes
Confidence intervals are computed from aggregate counts, not from individual identities. WebmasterID supplies the first-party counts they summarise.
Related pages
- Statistical significance and p-values
A result is 'statistically significant' when it would be unlikely if there were really no effect. The p-value is the probability of seeing data at least as extreme as yours assuming the null hypothesis is true — it is not the probability the variant is better, and not a measure of how big the effect is. Significance and practical importance are different questions.
- Sample size in experiments
Sample size is the number of subjects per arm an experiment needs to detect a chosen effect with acceptable error rates. It is computed in advance from the baseline rate, the minimum effect worth detecting, and the false-positive and false-negative rates you accept. Too small and you miss real effects; running until 'it looks good' inflates false positives.
- Conversion rate: definition and denominators
Conversion rate is the share of some base that converted. The trap is the denominator: conversions per session, per user, and per unique visitor give different numbers and mean different things. Without stating the base, a conversion rate is ambiguous — and comparing rates with different bases is meaningless.
- WebmasterID docs
Compute intervals from first-party counts.
Sources and verification notes
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.