Randomization unit
The randomization unit is the thing you randomly assign to control or treatment: a user, a session, a device, a cookie, or a cluster. The choice must match how you analyse and how users experience the change. Mismatches cause two classic failures — a user flipping variants between sessions (inconsistent experience) and analysing at a finer grain than you assigned (understated variance, false significance).
Unit must match experience and analysis
If you randomise by session but a user returns across sessions, they may see control today and treatment tomorrow — an inconsistent experience that muddies the effect. So the unit should usually be the most stable identifier that still gives each user one experience: a logged-in user or account ID where available, a first-party cookie otherwise. Whatever you choose, every observation from that unit stays in one arm.
- User/account: consistent experience across sessions
- Session/cookie: simpler, but a session ID can flip a returning user's variant, and a cookie is only stable within one browser until it is cleared
- Cluster: groups whole communities to limit spillover
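One common way to keep every observation from a unit in one arm is to derive the arm deterministically from the unit's ID instead of drawing a fresh random number on each visit. The sketch below is a minimal illustration under assumptions: the function name `assign_variant`, the experiment name, and the salting scheme are hypothetical and not taken from any particular platform.

```python
import hashlib


def assign_variant(unit_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a randomisation unit (e.g. a user ID) to an arm.

    Hypothetical helper: the same unit_id and experiment always give the
    same arm, so a returning user cannot flip variants between sessions.
    """
    # Salt with the experiment name so bucketing differs between experiments.
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to one of 10,000 buckets in [0, 1).
    bucket = (int.from_bytes(digest[:8], "big") % 10_000) / 10_000
    return "treatment" if bucket < treatment_share else "control"


# Illustrative usage with made-up identifiers.
experiment_name = "checkout-button-test"
first_visit = assign_variant("user-123", experiment_name)
later_visit = assign_variant("user-123", experiment_name)
print(first_visit == later_visit)  # the arm is stable across calls
```

For a cluster design, the same idea applies with the cluster ID passed as `unit_id`, so that every member of a community lands in the same arm.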
Analyse at the unit you assigned
A common error is randomising by user but computing significance per pageview or per session. Multiple observations from the same user are correlated, not independent, so treating them as independent understates variance and inflates significance, which reliably produces false positives. Aggregate to the randomisation unit first, or use methods (such as clustered standard errors or the delta method) that account for the correlation.
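The "aggregate first" option can be sketched as follows, assuming pageview-level rows of the form `(user_id, arm, value)`. The helper names and the toy data are hypothetical. The test statistic is Welch's t on per-user totals, which is one reasonable choice and not the only one.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def aggregate_by_user(rows):
    """Collapse pageview-level rows (user_id, arm, value) to one total per user.

    Returns {arm: [per-user totals]}. Raises if a user appears in two arms,
    because every observation from a unit must stay in one arm.
    """
    totals = defaultdict(float)
    arm_of = {}
    for user_id, arm, value in rows:
        if arm_of.setdefault(user_id, arm) != arm:
            raise ValueError(f"user {user_id} appears in more than one arm")
        totals[user_id] += value
    per_arm = defaultdict(list)
    for user_id, total in totals.items():
        per_arm[arm_of[user_id]].append(total)
    return dict(per_arm)


def welch_t(control, treatment):
    """Welch's t statistic for the difference in means (treatment minus control)."""
    se = sqrt(variance(control) / len(control) + variance(treatment) / len(treatment))
    return (mean(treatment) - mean(control)) / se


# Toy data: 12 pageviews, but only 6 independent units (users).
rows = [
    ("u1", "control", 1), ("u1", "control", 0), ("u1", "control", 1),
    ("u2", "control", 0),
    ("u3", "control", 0), ("u3", "control", 1),
    ("u4", "treatment", 1), ("u4", "treatment", 1),
    ("u5", "treatment", 0), ("u5", "treatment", 1), ("u5", "treatment", 1),
    ("u6", "treatment", 1),
]
per_user = aggregate_by_user(rows)
t_stat = welch_t(per_user["control"], per_user["treatment"])
print(per_user, t_stat)
```

The sample size that enters the test is the number of users (three per arm here), not the number of pageviews.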
The unit also determines how well you can guard against interference: assigning whole clusters keeps users who influence each other in the same arm, which limits spillover between users.
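For ratio metrics such as clicks per pageview, where users are randomised but pageviews form the denominator, the delta method mentioned in this section can be sketched as below. This is a minimal illustration for a single arm under assumptions: the function names and the per-user counts are made up, and the naive formula shown for comparison treats every pageview as an independent Bernoulli trial.

```python
from math import sqrt
from statistics import mean, variance


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def delta_method_se(clicks, views):
    """Ratio sum(clicks) / sum(views) and its delta-method standard error.

    clicks[i] and views[i] are totals for user i, the randomisation unit.
    Var(R) ~= (var(Y) - 2*R*cov(X, Y) + R^2 * var(X)) / (n * mean(X)^2)
    """
    n = len(clicks)
    y_bar, x_bar = mean(clicks), mean(views)
    ratio = y_bar / x_bar
    var_ratio = (
        variance(clicks)
        - 2 * ratio * covariance(views, clicks)
        + ratio ** 2 * variance(views)
    ) / (n * x_bar ** 2)
    return ratio, sqrt(var_ratio)


def naive_se(clicks, views):
    """Standard error that (wrongly) treats each pageview as independent."""
    total_views = sum(views)
    p = sum(clicks) / total_views
    return sqrt(p * (1 - p) / total_views)


# Made-up per-user totals: heavy users click a lot, light users hardly at all.
clicks = [0, 0, 8, 1, 0, 9]
views = [2, 3, 10, 4, 1, 10]

ratio, se_delta = delta_method_se(clicks, views)
se_naive = naive_se(clicks, views)
print(ratio, se_delta, se_naive)
```

On this toy data the naive standard error comes out smaller than the delta-method one, which is the understated-variance problem in miniature.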
How it appears in analytics and logs
Analysing at a finer grain than the assignment unit understates variance and inflates significance; a unit that is too fine, such as a session, also lets the same user see both variants.
Diagnostic use case
Pick the unit that gives each user a consistent experience and matches your analysis grain — usually a stable user or account ID, not the raw session.
What WebmasterID can help detect
WebmasterID's first-party identifiers let you assign and analyse on the same stable unit so variance is not understated.
Common mistakes
- Randomising by session, letting a returning user flip variants.
- Analysing per pageview when you assigned per user (understated variance).
- Building a 'stable' unit via fingerprinting.
Privacy and accuracy notes
Prefer first-party, privacy-safe identifiers for the unit; avoid fingerprinting to construct a stable ID.
Related pages
- Delta method for ratio metrics
Many experiment metrics are ratios where the denominator is itself random — clicks per session, revenue per user, pages per visit. When the randomisation unit is coarser than the denominator unit, the numerator and denominator are correlated, so naive variance formulas are wrong. The delta method uses a first-order Taylor expansion to approximate the variance of the ratio correctly, fixing confidence intervals.
- Network effects in experiments
Standard A/B tests assume each user's outcome depends only on their own assigned variant — the no-interference (SUTVA) assumption. Network effects break it: in social products, marketplaces, or anything with sharing, a treated user changes the experience of untreated users, so control is 'contaminated' and the measured effect is biased. Cluster, switchback, or ego-network designs reduce the leakage.
- Sample ratio mismatch (SRM)
Sample ratio mismatch (SRM) is when the observed allocation of users to experiment arms diverges from the planned ratio by more than chance allows — for example a 50/50 test that lands far from 50/50. It signals a bug in assignment, logging, or filtering, and a test with SRM should not be trusted regardless of how good the headline result looks.
- Privacy-first analytics
First-party identifiers without fingerprinting.
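As a rough illustration of the sample ratio mismatch check described in the list above, the sketch below runs a chi-square goodness-of-fit test on the observed arm counts. The function name, the counts, and any alert threshold you would apply to the p-value are assumptions, not values from a specific tool.

```python
from math import erfc, sqrt


def srm_p_value(n_control: int, n_treatment: int, expected_treatment_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-arm allocation.

    Counts should be numbers of randomisation units (e.g. users), not pageviews.
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # With one degree of freedom, the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


# Hypothetical counts for a planned 50/50 split.
p_small_gap = srm_p_value(5000, 5050)
p_large_gap = srm_p_value(5000, 5300)
print(p_small_gap, p_large_gap)
```

A very small p-value suggests the split is unlikely to be chance, so the assignment, logging, and filtering steps deserve a look before the headline result is trusted.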
Sources and verification notes
- Kohavi, Tang, Xu: Trustworthy Online Controlled Experiments (book site). Standard reference on randomization unit and analysis-grain pitfalls.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.