Debugging a sample ratio mismatch
A sample ratio mismatch (SRM) — observed variant counts that differ from the intended split by more than chance — invalidates a test, because whatever broke the ratio likely biased the metrics too. Debugging SRM is a systematic hunt: check the assignment mechanism, redirect and timing effects, bot filtering, logging gaps, and analysis filters that drop one arm unevenly. This entry is the troubleshooting procedure, not the definition.
Confirm it is a real SRM
First test the split itself: a chi-square (or binomial) test on the observed counts against the intended ratio tells you whether the deviation exceeds chance. A tiny imbalance in a huge sample can be significant yet immaterial, and a large relative gap in a small sample may be noise, so check both the significance of the deviation and the configured allocation (confirm the split you are testing against, for example 50/50, is the one that was actually intended). Only a confirmed SRM warrants halting interpretation.
- Chi-square the counts against the intended split
- Distinguish a real SRM from sampling noise
- Confirmed SRM ⇒ stop trusting the metrics
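The check above can be sketched in a few lines. This is a minimal illustration for a two-arm test, not a definitive implementation: the function name `srm_check`, its parameters, and the `alpha = 0.001` threshold are assumptions made for the example (the threshold is a strict cut-off often chosen for SRM alerts, not something this page prescribes). It uses only the standard library, relying on the fact that a chi-square statistic with one degree of freedom has the tail probability erfc(sqrt(chi2 / 2)).

```python
import math


def srm_check(control: int, treatment: int,
              expected_control_share: float = 0.5,
              alpha: float = 0.001) -> dict:
    """Chi-square goodness-of-fit test for a two-arm split (1 degree of freedom).

    `alpha` is an assumed alert threshold; pick one that suits your programme.
    """
    if not 0.0 < expected_control_share < 1.0:
        raise ValueError("expected_control_share must be between 0 and 1")
    total = control + treatment
    if total <= 0:
        raise ValueError("need at least one assigned unit")

    # Expected counts under the intended split.
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    # Pearson chi-square statistic over the two arms.
    chi2 = ((control - expected_control) ** 2 / expected_control
            + (treatment - expected_treatment) ** 2 / expected_treatment)

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "chi2": chi2,
        "p_value": p_value,
        "observed_control_share": control / total,
        "srm": p_value < alpha,
    }
```

Calling `srm_check(50_000, 48_000)` flags the split, while `srm_check(510, 490)` does not: the same kind of gap is well within chance at the smaller sample size. For tests with more than two arms the same statistic applies with more degrees of freedom, which needs a general chi-square tail function rather than the one-line shortcut used here.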
Trace the cause
Walk the pipeline: (1) assignment — is bucketing deterministic and unbiased, or does a redirect drop users before they are counted? (2) timing — does one variant load slower and lose impatient users before logging? (3) bots — is automated traffic filtered consistently across arms? (4) logging — are events lost for one variant on some browsers? (5) analysis filters — does a downstream filter remove one arm unevenly? Fix the cause, then rerun; do not 'correct' the ratio after the fact.
SRM is the canary; the underlying bug usually also biased the conversion numbers.
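One way to work through the pipeline walk above is to repeat the ratio check inside each segment (browser, device, traffic source, day) and see where the imbalance concentrates. The sketch below is an assumption-laden illustration: `srm_by_segment`, the segment keys, and the `(control, treatment)` tuple layout are hypothetical, and it assumes a two-arm test so that the one-degree-of-freedom shortcut for the p-value applies.

```python
import math
from typing import Dict, List, Tuple


def srm_by_segment(counts: Dict[str, Tuple[int, int]],
                   expected_control_share: float = 0.5) -> List[dict]:
    """Run a two-arm chi-square check per segment, worst offender first.

    `counts` maps a segment label to (control_count, treatment_count).
    """
    rows = []
    for segment, (control, treatment) in counts.items():
        total = control + treatment
        if total == 0:
            continue  # nothing assigned in this segment, nothing to test

        expected_control = total * expected_control_share
        expected_treatment = total - expected_control
        chi2 = ((control - expected_control) ** 2 / expected_control
                + (treatment - expected_treatment) ** 2 / expected_treatment)

        rows.append({
            "segment": segment,
            "control": control,
            "treatment": treatment,
            "chi2": chi2,
            # 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2)).
            "p_value": math.erfc(math.sqrt(chi2 / 2)),
        })

    # Largest statistic first, so the segment driving the mismatch is on top.
    rows.sort(key=lambda row: row["chi2"], reverse=True)
    return rows
```

If a single segment, say one browser, carries nearly all of the deviation, that points toward logging loss or a load-time problem specific to it; an imbalance spread evenly across segments points back toward assignment or a global filter. Testing many segments raises the chance of a spurious hit, so treat a lone borderline p-value as a lead to investigate rather than a finding.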
How it appears in analytics and logs
A statistically significant deviation from the intended split means assignment, logging, or filtering is broken; treat the test as untrustworthy until the cause is found.
Diagnostic use case
Work through a fixed checklist to locate the cause of an SRM before trusting any result, since the imbalance usually signals a deeper bias.
What WebmasterID can help detect
WebmasterID's first-party per-arm counts and bot filtering help separate a real assignment bug from bot contamination.
Common mistakes
- Interpreting metrics from a test with an unresolved SRM.
- Re-weighting arms to 'fix' the ratio instead of finding the bug.
- Overlooking redirect or load-time differences that drop one arm.
Privacy and accuracy notes
SRM debugging uses aggregate counts per arm; no personal data is needed to detect or trace the imbalance.
Related pages
- Sample ratio mismatch (SRM)
Sample ratio mismatch (SRM) is when the observed allocation of users to experiment arms diverges from the planned ratio by more than chance allows — for example a 50/50 test that lands far from 50/50. It signals a bug in assignment, logging, or filtering, and a test with SRM should not be trusted regardless of how good the headline result looks.
- Randomization unit
The randomization unit is the thing you randomly assign to control or treatment: a user, a session, a device, a cookie, or a cluster. The choice must match how you analyse and how users experience the change. Mismatches cause two classic failures — a user flipping variants between sessions (inconsistent experience) and analysing at a finer grain than you assigned (understated variance, false significance).
- Bot traffic in analytics: filtering it out
Bots — crawlers, scrapers, monitors, scanners — generate requests that, unfiltered, inflate pageviews and distort every metric. Client-side analytics often misses bots (many do not run JavaScript) or miscounts the ones that do. Server-side classification at ingest is the reliable way to keep bot traffic out of human reports.
- Bot intelligence
Separate bot contamination from a real assignment bug.
Sources and verification notes
- Fabijan et al., Diagnosing Sample Ratio Mismatch in Online Controlled Experiments (KDD). Peer-reviewed SRM diagnosis taxonomy.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.