Feature flags and experiments

A feature flag is a runtime switch that turns functionality on or off for chosen users without a new deploy. Flags power gradual rollouts, kill switches, and — when the audience is split randomly and outcomes are measured — controlled experiments. Understanding the overlap keeps you from confusing a rollout (operational) with an experiment (measured comparison).

Verified against primary sources

What this means

A feature flag (feature toggle) decouples deploying code from releasing behaviour: the code ships dark and a flag decides who sees it at runtime. Flags serve several jobs — gradual percentage rollouts, instant kill switches, targeting specific segments, and experimentation. When the flag assigns users randomly and you compare a metric between the on and off groups, the flag is delivering an A/B test.
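
As a rough illustration of a flag deciding at runtime who sees a feature, the sketch below evaluates a flag with a kill switch, an allow-list, and a percentage rollout. The flag structure and the names `bucket`, `is_enabled`, `rollout_percent`, `allow_users`, and `kill_switch` are hypothetical and chosen for this example; they are not taken from any particular flag service. Hashing the user ID is one common way to keep each user's result stable between requests.

```python
import hashlib


def bucket(user_id: str, flag_name: str) -> float:
    """Map a user to a stable number in [0, 100) for this flag."""
    # Salting with the flag name gives each flag its own independent split.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000 * 100


def is_enabled(flag: dict, user_id: str) -> bool:
    """Decide at runtime whether this user sees the flagged behaviour."""
    if flag.get("kill_switch", False):
        return False  # instant off for everyone, no deploy needed
    if user_id in flag.get("allow_users", ()):
        return True  # targeted users (for example, internal testers)
    # Gradual rollout: users whose bucket falls below the percentage are in.
    return bucket(user_id, flag["name"]) < flag.get("rollout_percent", 0)


# Hypothetical flag definition: the code is deployed, but only 10% see it.
new_checkout = {
    "name": "new_checkout",
    "rollout_percent": 10,
    "allow_users": {"qa-1"},
    "kill_switch": False,
}

if is_enabled(new_checkout, "user-42"):
    pass  # serve the new behaviour
else:
    pass  # serve the existing behaviour
```

Because the bucket depends only on the flag name and the user ID, raising `rollout_percent` adds users without removing anyone who already had the feature.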

Rollout versus experiment

A rollout and an experiment can use identical flag plumbing but answer different questions. A rollout asks 'can we safely turn this on for everyone?' and ramps the percentage while watching for breakage. An experiment asks 'does this change the metric versus not having it?' and requires random assignment, a control group, a pre-declared metric, and enough sample for a valid comparison.

Conflating them is a common error: ramping a flag to 100% because nothing broke is not evidence the change improved anything. Only a measured comparison against a control group can show that.
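
One way to make the measured comparison concrete is a two-proportion z-test on conversion counts from the control (flag off) and treatment (flag on) groups. This is a minimal sketch under stated assumptions: the metric is a conversion rate, assignment was random, and the sample is large enough for the normal approximation. The function name and the counts used with it are illustrative, not results from any real experiment.

```python
import math


def two_proportion_z_test(conv_control: int, n_control: int,
                          conv_treat: int, n_treat: int):
    """Compare conversion rates; return (difference, z, two-sided p-value)."""
    p_control = conv_control / n_control
    p_treat = conv_treat / n_treat
    # Pooled rate under the null hypothesis that the flag changes nothing.
    pooled = (conv_control + conv_treat) / (n_control + n_treat)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treat))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_treat - p_control) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_treat - p_control, z, p_value


# Illustrative counts only: 10,000 users per group.
diff, z, p_value = two_proportion_z_test(500, 10_000, 600, 10_000)
print(f"difference={diff:.4f} z={z:.2f} p={p_value:.4f}")
```

A clean ramp to 100% supplies none of these inputs: without a held-back control group there is no second rate to compare against.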

How it appears in analytics and logs

A flag that is rolled out to a growing percentage of users is operational delivery. The same flag with random assignment and a measured outcome against a held-back group is a controlled experiment. The difference is the analysis, not the switch.

Diagnostic use case

Use flags to ship safely and to deliver experiment variants, but call it an experiment only when assignment is random and a metric is compared against a control.

What WebmasterID can help detect

WebmasterID measures the first-party events that show what each flagged cohort did, which is the data needed to evaluate an experiment built on flags.
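
To illustrate how per-cohort event data feeds an experiment readout, the sketch below rolls a list of events up into users, converters, and conversion rate per variant. The event shape (`user_id`, `variant`, `event`) and the `purchase` event name are assumptions made for this example; they do not describe WebmasterID's actual event schema or export format.

```python
from collections import defaultdict


def summarize(events, conversion_event="purchase"):
    """Aggregate events into per-variant user and converter counts."""
    exposed = defaultdict(set)    # variant -> users seen in that variant
    converted = defaultdict(set)  # variant -> users who converted
    for e in events:
        exposed[e["variant"]].add(e["user_id"])
        if e["event"] == conversion_event:
            converted[e["variant"]].add(e["user_id"])
    # Counting unique users keeps repeat events from inflating the rate.
    return {
        variant: {
            "users": len(users),
            "converters": len(converted[variant]),
            "rate": len(converted[variant]) / len(users),
        }
        for variant, users in exposed.items()
    }


# Hypothetical events for a flag with "on" and "off" cohorts.
sample_events = [
    {"user_id": "a", "variant": "off", "event": "page_view"},
    {"user_id": "b", "variant": "off", "event": "purchase"},
    {"user_id": "c", "variant": "on", "event": "page_view"},
    {"user_id": "c", "variant": "on", "event": "purchase"},
]
summary = summarize(sample_events)
```

The output is aggregate by cohort: it reports counts and rates per variant rather than a per-user history.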

Common mistakes

Privacy and accuracy notes

Flag assignment and experiment analysis rely on aggregate cohorts, not personal profiling. This page is educational.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.