Bayesian A/B testing
Bayesian A/B testing treats the conversion rate of each arm as an unknown quantity described by a probability distribution. It combines a prior belief with observed data to produce a posterior, from which you can state the probability that B beats A and quantify the expected loss of choosing wrong. It is an alternative to the frequentist p-value that rests on different assumptions; it does not guarantee more accurate conclusions.
What this means
In the Bayesian framing, each arm's true conversion rate is unknown and described by a probability distribution. You start with a prior (a belief before data, often deliberately weak), update it with the observed conversions and exposures, and get a posterior distribution per arm. From the posteriors you can compute the probability that one arm exceeds another and the expected loss of picking the apparent winner if it turns out to be wrong.
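The update described above can be sketched with a Beta-Binomial model, which is a common modelling choice for conversion rates but an assumption here, not something this page prescribes. The function names, the uniform Beta(1, 1) prior, and the counts are all illustrative. The probability that B beats A is estimated by drawing from both posteriors and counting how often B's draw is higher.

```python
import random


def posterior_params(conversions, exposures, prior_alpha=1.0, prior_beta=1.0):
    """Update a Beta(prior_alpha, prior_beta) prior with binomial counts.

    Returns the (alpha, beta) parameters of the Beta posterior.
    """
    failures = exposures - conversions
    return prior_alpha + conversions, prior_beta + failures


def prob_b_beats_a(post_a, post_b, draws=50_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under the two posteriors."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        if rng.betavariate(*post_b) > rng.betavariate(*post_a):
            wins += 1
    return wins / draws


# Illustrative counts only, not real data.
post_a = posterior_params(conversions=120, exposures=2400)
post_b = posterior_params(conversions=150, exposures=2400)
p_b_better = prob_b_beats_a(post_a, post_b)
print(f"Posterior A: Beta{post_a}, posterior B: Beta{post_b}")
print(f"Estimated P(B beats A): {p_b_better:.3f}")
```

The estimate carries Monte Carlo noise, so the number of draws matters when you read the third decimal place.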
How it differs from frequentist tests
A frequentist test asks how surprising the data would be if there were no difference, and returns a p-value and confidence interval. A Bayesian test asks what the probability of each hypothesis is, given the data and a prior. The Bayesian answer is the one most people intuitively want, but it requires choosing a prior, and a strong prior can dominate small samples.
Neither framing removes the need for an adequate sample size or for stable, well-defined metrics. Both can mislead if you stop the test the moment a threshold is crossed.
- Prior + data → posterior distribution per arm
- Reports P(B beats A) and expected loss directly
- Sensitive to the prior, especially on small samples
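To illustrate the prior sensitivity noted in the list above, the sketch below compares the posterior mean under a weak and a strong prior, first on a small sample and then on a large one. It assumes a Beta prior with binomial data; the specific priors and counts are made up for the example.

```python
def posterior_mean(conversions, exposures, prior_alpha, prior_beta):
    """Mean of the Beta posterior after updating a Beta prior with counts."""
    return (prior_alpha + conversions) / (prior_alpha + prior_beta + exposures)


# Hypothetical priors: a uniform one, and a strong one centred on 5%
# that carries the weight of 1,000 pseudo-observations.
WEAK_PRIOR = (1, 1)
STRONG_PRIOR = (50, 950)

# Small sample: 3 conversions in 20 exposures (observed rate 15%).
small_weak = posterior_mean(3, 20, *WEAK_PRIOR)
small_strong = posterior_mean(3, 20, *STRONG_PRIOR)

# Large sample with the same observed rate: 1,500 in 10,000.
large_weak = posterior_mean(1500, 10_000, *WEAK_PRIOR)
large_strong = posterior_mean(1500, 10_000, *STRONG_PRIOR)

print(f"small sample: weak={small_weak:.4f} strong={small_strong:.4f}")
print(f"large sample: weak={large_weak:.4f} strong={large_strong:.4f}")
```

On the small sample the strong prior pulls the estimate most of the way back toward 5%; on the large sample the two posteriors are much closer, because the data outweighs the prior.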
How it appears in analytics and logs
A posterior probability that B beats A, or an expected-loss figure, expresses how confident the data and prior make you; it is not a binary significant/not-significant verdict. It is only as trustworthy as the prior and the model behind it.
Diagnostic use case
Use Bayesian analysis when you want a directly interpretable probability that a variant is better, and an expected-loss estimate to bound the risk of the decision.
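One way to compute an expected-loss estimate is sketched below. Here expected loss is taken to mean the average conversion rate you give up by choosing an arm in the cases where the other arm is actually better, which is a common definition but an assumption in this example. The Beta posteriors, the `LOSS_TOLERANCE` threshold, and the decision labels are hypothetical.

```python
import random


def expected_loss(post_chosen, post_other, draws=50_000, seed=0):
    """Monte Carlo estimate of E[max(rate_other - rate_chosen, 0)].

    Each posterior is an (alpha, beta) pair for a Beta distribution.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible estimate
    total = 0.0
    for _ in range(draws):
        chosen = rng.betavariate(*post_chosen)
        other = rng.betavariate(*post_other)
        total += max(other - chosen, 0.0)
    return total / draws


# Hypothetical posteriors: uniform prior plus 120/2400 (A) and 150/2400 (B).
post_a = (121, 2281)
post_b = (151, 2251)

loss_if_b = expected_loss(post_b, post_a)
loss_if_a = expected_loss(post_a, post_b)

# Hypothetical tolerance: accept at most 0.1 percentage points of expected loss.
LOSS_TOLERANCE = 0.001
decision = "ship B" if loss_if_b < LOSS_TOLERANCE else "keep collecting data"

print(f"Expected loss if you choose B: {loss_if_b:.5f}")
print(f"Expected loss if you choose A: {loss_if_a:.5f}")
print(f"Decision under this tolerance: {decision}")
```

The tolerance is a business choice made before the test, not a statistical constant; a threshold chosen after seeing the results undermines the bound it is supposed to provide.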
What WebmasterID can help detect
WebmasterID measures the first-party conversion and exposure events that feed either a Bayesian or a frequentist analysis; the method is a choice you make on top of the same counts.
Common mistakes
- Assuming Bayesian results are immune to underpowered samples.
- Using a strong prior and not disclosing its influence.
- Treating 'probability B beats A' as a guaranteed business outcome.
Privacy and accuracy notes
Bayesian testing operates on aggregate counts of conversions and exposures, not personal profiles. This page is educational, not statistical consulting.
Related pages
- Frequentist vs Bayesian experiment analysis
Frequentist and Bayesian are two coherent ways to analyse the same experiment data. Frequentist methods ask how likely the observed data is under a null hypothesis and report p-values and confidence intervals. Bayesian methods combine a prior with the data to report posterior probabilities and credible intervals. Each has assumptions and failure modes; neither is universally 'correct'.
- A/B testing fundamentals
An A/B test randomly assigns visitors to a control (A) or a variant (B), shows each group one version, and compares a pre-chosen metric. Random assignment is what lets you attribute a difference to the change rather than to who happened to see it. The discipline is in deciding the metric and sample size before you start, not after you peek at the numbers.
- Statistical significance and p-values
A result is 'statistically significant' when it would be unlikely if there were really no effect. The p-value is the probability of seeing data at least as extreme as yours assuming the null hypothesis is true — it is not the probability the variant is better, and not a measure of how big the effect is. Significance and practical importance are different questions.
- Event Explorer
Inspect the conversion events behind each arm.
Sources and verification notes
- Wikipedia — Bayesian inference (method overview)
- NIST/SEMATECH e-Handbook — Bayesian methods (background on priors and posteriors)
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.