Interaction effects between changes
An interaction effect occurs when the combined impact of two changes is not simply the sum of their individual impacts — one change alters how the other performs. Interactions matter when several experiments run on the same page at once, and they are the core reason multivariate testing exists. This page explains interactions and how concurrent tests can collide.
When effects are not additive
Two changes are independent if each adds the same lift regardless of the other. They interact when one change's effect depends on the other's state — a new headline might help only when paired with a new hero image. The combined result then differs, sometimes sharply, from adding the two solo lifts.
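To make the additive-versus-interacting distinction concrete, here is a minimal sketch that compares the observed combined rate with what simple addition would predict. The function name and all conversion rates are hypothetical, chosen only for illustration.

```python
def interaction_effect(rate_control, rate_a_only, rate_b_only, rate_both):
    """Return how far the combined rate sits from the additive prediction.

    Zero means the two changes are additive; any other value is the
    interaction (a difference-in-differences on a 2x2 grid of rates).
    """
    lift_a = rate_a_only - rate_control
    lift_b = rate_b_only - rate_control
    additive_prediction = rate_control + lift_a + lift_b
    return rate_both - additive_prediction


# Hypothetical rates: control, headline only, hero image only, both.
additive = interaction_effect(0.100, 0.110, 0.105, 0.115)
interacting = interaction_effect(0.100, 0.101, 0.100, 0.125)
```

In the second call the headline barely moves the rate on its own, yet the pair together lands about 2.4 percentage points above the additive prediction, which is the headline-plus-hero-image pattern described above.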
Concurrent tests can collide
Running many A/B tests on the same surface at once is usually fine when changes are unrelated, because randomisation averages over the other tests. But when two tests touch interacting elements, each test's readout is an average over the other test's variants, so it can misstate the lift you get once one specific combination ships. The honest options are to isolate interacting tests or to fold them into one multivariate design.
- Additive: solo lifts simply sum
- Interacting: combined effect differs from the sum
- Multivariate testing measures interactions directly
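The collision can be sketched numerically. The example below assumes two independently randomised tests, A and B, and computes the lift a standalone readout of A would report while B is also running. The function name, the 50/50 split, and every rate are hypothetical.

```python
def marginal_lift_of_a(rates, share_b=0.5):
    """Lift of A as seen by a readout that ignores test B.

    rates maps (a_on, b_on) to a conversion rate; share_b is the
    fraction of traffic that has B switched on. Assumes B's split is
    the same in both arms of A (independent randomisation).
    """
    lift_without_b = rates[(True, False)] - rates[(False, False)]
    lift_with_b = rates[(True, True)] - rates[(False, True)]
    return (1 - share_b) * lift_without_b + share_b * lift_with_b


# Hypothetical interacting case: A helps almost only when B is on.
rates = {
    (False, False): 0.100,
    (True, False): 0.101,
    (False, True): 0.100,
    (True, True): 0.125,
}
marginal = marginal_lift_of_a(rates)
lift_if_b_ships = rates[(True, True)] - rates[(False, True)]
lift_if_b_dropped = rates[(True, False)] - rates[(False, False)]
```

Under these made-up numbers the readout for A sits between the two conditional lifts and matches neither, so the decision about A depends on what happens to B.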
Cost of measuring interactions
Estimating an interaction needs traffic in every combination of variants, and the number of combinations multiplies with each added factor: k two-variant factors produce 2^k cells, each of which needs its own adequate sample. That is why multivariate testing is data-hungry and why teams often test the highest-impact factors individually first, reserving full factorial designs for cases where interaction is genuinely suspected.
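The growth in traffic can be sketched with a little arithmetic. The helper names and the per-cell figure below are hypothetical; the real per-cell requirement comes from a sample size calculation, not from this snippet.

```python
from math import prod


def factorial_cells(levels_per_factor):
    """Number of variant combinations in a full factorial design."""
    return prod(levels_per_factor)


def total_sample(levels_per_factor, n_per_cell):
    """Total visitors needed if every cell must reach n_per_cell."""
    return factorial_cells(levels_per_factor) * n_per_cell


two_factors = factorial_cells([2, 2])        # headline x hero image
three_factors = factorial_cells([2, 2, 2])   # add a third two-variant factor
mixed = factorial_cells([3, 2, 2])           # one factor with three variants

# Assumed figure: 5,000 visitors per cell, purely for illustration.
visitors_needed = total_sample([2, 2, 2], 5000)
```

Each extra two-variant factor doubles the cell count, so the same per-cell target costs twice the traffic.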
How it appears in analytics and logs
If two changes that each won on their own underperform when combined, and the gap is larger than sampling noise would explain, an interaction is the likely cause: the effects are not additive, so the solo lifts cannot be summed.
Diagnostic use case
When two changes plausibly affect each other, test them together (or watch for interaction) rather than assuming their individual lifts add up cleanly.
What WebmasterID can help detect
WebmasterID's first-party events let you measure outcomes per combination of changes, so you can detect when concurrent experiments interact rather than assuming independence.
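Measuring outcomes per combination amounts to grouping exposures by the pair of variants each visit received. The sketch below assumes a hypothetical record shape with `test_a`, `test_b`, and `converted` fields; it is not WebmasterID's actual event schema or API.

```python
from collections import Counter


def rates_by_combination(records):
    """Return {combo: (conversions, exposures, rate)} per variant pair.

    Each record is assumed to describe one exposure and to carry the
    variant of both tests plus a boolean conversion flag.
    """
    exposed = Counter()
    converted = Counter()
    for record in records:
        combo = (record["test_a"], record["test_b"])
        exposed[combo] += 1
        if record["converted"]:
            converted[combo] += 1
    return {
        combo: (converted[combo], n, converted[combo] / n)
        for combo, n in exposed.items()
    }


# Tiny made-up sample, only to show the shape of the result.
records = [
    {"test_a": "new", "test_b": "new", "converted": True},
    {"test_a": "new", "test_b": "new", "converted": False},
    {"test_a": "new", "test_b": "old", "converted": False},
    {"test_a": "old", "test_b": "old", "converted": True},
]
rates = rates_by_combination(records)
```

A combination that never appears in the data is simply absent from the result, which is a useful warning that a cell has no traffic.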
Common mistakes
- Summing two solo lifts as if they always add.
- Running interacting tests in parallel without isolation.
- Launching a multivariate test without enough traffic for every cell.
Privacy and accuracy notes
Interactions are estimated from aggregate per-combination rates. Detecting them needs no personal data, only counts per variant combination.
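As one illustration of working from counts alone, the sketch below runs a simple Wald z-test on the difference-in-differences of four conversion rates. It is one reasonable approach under a normal approximation, not the only one, and the function name and all counts are hypothetical.

```python
from math import erf, sqrt


def interaction_z_test(counts):
    """Test whether a 2x2 interaction differs from zero.

    counts maps (a_on, b_on) to (conversions, visitors). Returns the
    difference-in-differences, its z score, and a two-sided p-value.
    Assumes large cells and rates strictly between 0 and 1.
    """
    rate = {k: c / n for k, (c, n) in counts.items()}
    variance = {k: rate[k] * (1 - rate[k]) / counts[k][1] for k in counts}
    did = (rate[(True, True)] - rate[(False, True)]) - (
        rate[(True, False)] - rate[(False, False)]
    )
    se = sqrt(sum(variance.values()))
    z = did / se
    # Standard normal CDF via erf, then a two-sided tail probability.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return did, z, p_value


# Hypothetical aggregate counts per variant combination.
counts = {
    (False, False): (1000, 10000),
    (True, False): (1010, 10000),
    (False, True): (1000, 10000),
    (True, True): (1250, 10000),
}
did, z, p_value = interaction_z_test(counts)
```

The interaction estimate sums the sampling variance of all four cells, so it is noisier than either solo lift measured on the same traffic.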
Related pages
- Multivariate testing
Multivariate testing (MVT) changes several elements simultaneously and tests their combinations, so it can reveal interactions between elements that separate A/B tests miss. The cost is traffic: the number of combinations grows quickly, so each gets a thin slice of visitors. MVT is worth it only when you have ample traffic and genuinely suspect interactions.
- Confounding variables in conversion
A confounding variable is a third factor that affects both the thing you changed and the outcome you measured, producing a spurious association. Confounders are why 'we shipped X and conversions rose' is weak evidence — a campaign, a season, or a price change could be the real cause. Randomised experiments neutralise confounders by design. This page explains the concept and the defence.
- Feature flags and experiments
A feature flag is a runtime switch that turns functionality on or off for chosen users without a new deploy. Flags power gradual rollouts, kill switches, and — when the audience is split randomly and outcomes are measured — controlled experiments. Understanding the overlap keeps you from confusing a rollout (operational) with an experiment (measured comparison).
- Sample size in experiments
Sample size is the number of subjects per arm an experiment needs to detect a chosen effect with acceptable error rates. It is computed in advance from the baseline rate, the minimum effect worth detecting, and the false-positive and false-negative rates you accept. Too small and you miss real effects; running until 'it looks good' inflates false positives.
Sources and verification notes
- Wikipedia: Interaction (statistics). Non-additive effects and factorial designs.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.