Traffic allocation in experiments
Traffic allocation decides what fraction of eligible users enter an experiment and how that fraction divides among variants. A 50/50 split between two arms maximises statistical power for a fixed sample; ramping exposure limits blast radius. Allocation is a deliberate trade-off between speed, risk, and the number of variants. This page explains the levers.
Two separate decisions
Allocation has two layers. First, what share of eligible traffic enters the experiment at all (you might expose only 20% while you watch for problems). Second, how that exposed traffic divides across the control and variants. Both affect how fast you can read a result.
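The two layers multiply, which is easy to overlook when estimating how long a test will take. The sketch below is illustrative only: the function names, the weight format, and the traffic figures are assumptions, not part of any particular tool.

```python
import math


def arm_shares(exposure, split):
    """Return each arm's share of ALL eligible traffic.

    exposure: fraction of eligible traffic that enters the experiment (0..1].
    split: mapping of arm name -> relative weight among exposed traffic.
    """
    if not 0 < exposure <= 1:
        raise ValueError("exposure must be in (0, 1]")
    total_weight = sum(split.values())
    if total_weight <= 0:
        raise ValueError("split weights must sum to a positive number")
    # Layer 1 (exposure) times layer 2 (share of exposed traffic).
    return {arm: exposure * weight / total_weight for arm, weight in split.items()}


def days_to_sample(n_per_arm, daily_eligible, arm_share):
    """Days needed for one arm to reach n_per_arm at a steady daily volume."""
    return math.ceil(n_per_arm / (daily_eligible * arm_share))


# Hypothetical numbers: 20% exposure, even split, 5,000 eligible users per day.
shares = arm_shares(0.20, {"control": 1, "variant": 1})
for arm, share in shares.items():
    print(arm, round(share, 3), days_to_sample(10_000, 5_000, share), "days")
```

With 20% exposure and an even split, each arm sees only a tenth of eligible traffic, so raising exposure or dropping an arm both shorten the wait.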
Power and the even split
For two arms and a fixed total sample, an equal 50/50 split gives the most statistical power (assuming similar variance in both arms), because the precision of the comparison is limited by the smaller arm. Skewing traffic toward the variant does not speed up a clean A/B test: it starves the control and slows detection. Uneven splits make sense for other reasons, such as limiting risk, not for power.
- Equal split maximises power for two arms
- More variants means each arm gets less traffic
- Low total exposure trades speed for safety
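A rough calculation makes the effect of skewing visible. The sketch below uses a normal approximation for a two-sided test on two conversion rates; the baseline rate, lift, and sample size are made-up inputs, and `approx_power` is a hypothetical helper rather than a library function.

```python
from statistics import NormalDist


def approx_power(total_n, variant_share, baseline, lift, alpha=0.05):
    """Approximate power of a two-sided z-test on two proportions.

    total_n: total users across both arms (fixed).
    variant_share: fraction of total_n assigned to the variant.
    baseline: control conversion rate; lift: absolute change in the variant.
    """
    n_variant = total_n * variant_share
    n_control = total_n - n_variant
    p_control = baseline
    p_variant = baseline + lift
    # Standard error of the difference (unpooled normal approximation).
    se = (p_control * (1 - p_control) / n_control
          + p_variant * (1 - p_variant) / n_variant) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(lift) / se
    return NormalDist().cdf(z_effect - z_crit) + NormalDist().cdf(-z_effect - z_crit)


# Same total sample, increasingly skewed toward the variant.
for share in (0.5, 0.7, 0.9):
    print(f"variant share {share:.0%}: power ~ {approx_power(10_000, share, 0.10, 0.02):.2f}")
```

Under these assumptions, power falls as the split moves away from 50/50, even though the variant itself receives more users.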
Checking the realised split
The intended allocation and the observed allocation can diverge. A large gap between expected and actual arm sizes is the signature of sample ratio mismatch — often a redirect, caching, or bot-filtering bug that quietly biases the result. Always verify the counts before trusting the readout.
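One common way to check the counts is a chi-square goodness-of-fit test against the planned ratio. The sketch below handles the two-arm case with the standard library only; the function name and the 0.001 alert threshold are assumptions (a strict threshold is a common convention, not a rule from this page).

```python
import math


def srm_p_value(control_count, variant_count, expected_control_share=0.5):
    """P-value of a chi-square test (1 degree of freedom) that the observed
    two-arm counts match the planned split."""
    total = control_count + variant_count
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_count - expected_control) ** 2 / expected_control
            + (variant_count - expected_variant) ** 2 / expected_variant)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts from a test planned as 50/50.
p = srm_p_value(5_000, 5_500)
if p < 0.001:  # assumed alert threshold
    print(f"possible sample ratio mismatch (p = {p:.2g}); check assignment and logging")
else:
    print(f"counts are consistent with the planned split (p = {p:.2g})")
```

A very small p-value means the gap is unlikely to be chance, so the assignment, logging, and filtering path should be investigated before the result is read.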
How it appears in analytics and logs
In reports, an uneven split or a tiny allocation shows up as arm counts that accumulate slowly, which delays detection. If arm counts are far from the intended split, suspect a sample-ratio-mismatch bug rather than chance.
Diagnostic use case
Choose an allocation that balances power and risk: even splits detect effects fastest for a given total, while small initial exposure limits damage from a bad variant.
What WebmasterID can help detect
WebmasterID's first-party event stream lets you verify how many visitors actually landed in each arm, so you can confirm the realised allocation matches the intended split.
Common mistakes
- Skewing traffic to the variant expecting faster results.
- Splitting across many arms without enough traffic to power each.
- Ignoring a gap between intended and observed arm sizes.
Privacy and accuracy notes
Allocation assigns visitors to arms, typically by a hashed bucket. It can be done without storing personal identifiers, using a first-party bucketing key.
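A minimal sketch of hashed bucketing follows, assuming a first-party bucketing key (for example a random ID in a first-party cookie) and SHA-256 as the hash. The function names, the salt format, and the experiment ID are hypothetical; real systems differ in detail.

```python
import hashlib


def _unit_interval(key, salt):
    """Map (salt, key) deterministically to a float in [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
    # Keep 53 bits so the result is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def assign(bucketing_key, experiment_id, exposure, arms):
    """Return the arm name for this key, or None if the key is not exposed.

    exposure: fraction of eligible traffic entering the experiment.
    arms: list of (name, weight) pairs, e.g. [("control", 1), ("variant", 1)].
    """
    # Layer 1: is this visitor in the experiment at all?
    if _unit_interval(bucketing_key, f"{experiment_id}:exposure") >= exposure:
        return None
    # Layer 2: which arm? A separate salt keeps the two decisions independent.
    point = _unit_interval(bucketing_key, f"{experiment_id}:arm")
    total_weight = sum(weight for _, weight in arms)
    cumulative = 0.0
    for name, weight in arms:
        cumulative += weight / total_weight
        if point < cumulative:
            return name
    return arms[-1][0]  # guard against floating-point rounding


arms = [("control", 1), ("variant", 1)]
print(assign("visitor-key-123", "checkout-test", 0.20, arms))
```

Because the result depends only on the key and the experiment ID, the same visitor gets the same answer on every request without any stored assignment, and raising `exposure` later adds visitors without moving anyone already in an arm.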
Related pages
- Ramp-up and staged rollout
Ramping is the practice of increasing a variant's exposure in stages (say 1%, then 5%, 20%, 50%), pausing at each step to check guardrail metrics for harm. It separates risk control (the ramp) from measurement (the experiment). A ramp limits blast radius, but the early, small stages are not powered to measure the effect precisely. The linked page explains the trade-off.
- Sample ratio mismatch (SRM)
Sample ratio mismatch (SRM) is when the observed allocation of users to experiment arms diverges from the planned ratio by more than chance allows — for example a 50/50 test that lands far from 50/50. It signals a bug in assignment, logging, or filtering, and a test with SRM should not be trusted regardless of how good the headline result looks.
- Sample size in experiments
Sample size is the number of subjects per arm an experiment needs to detect a chosen effect with acceptable error rates. It is computed in advance from the baseline rate, the minimum effect worth detecting, and the false-positive and false-negative rates you accept. Too small and you miss real effects; running until 'it looks good' inflates false positives.
- How long to run an A/B test
An A/B test runs until it has collected the sample size its design requires, derived from the baseline rate, the minimum detectable effect, and the chosen power. Duration also has to span full business cycles (weekday/weekend) to avoid day-of-week bias. Stopping the moment a result looks significant inflates false positives. The linked page explains how duration is set honestly.
Sources and verification notes
- Wikipedia: Statistical power. Power depends on the smaller arm; an even split is efficient.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.