
Differential privacy

Differential privacy is a mathematical framework that bounds how much any single person's data can affect a published result, by injecting carefully calibrated random noise. It lets you release useful aggregate statistics while provably limiting what can be learned about any individual. This page explains the core idea and where it appears in analytics.

Verified against primary sources

The core guarantee

A randomised analysis is differentially private if its output distribution barely changes whether or not any one individual's record is included. 'Barely' is quantified by a privacy-loss parameter, epsilon (ε): a smaller ε means stronger privacy and more noise; a larger ε means weaker privacy and more accuracy. The mechanism typically adds random noise scaled to the query's sensitivity, that is, the most that one individual's record can change the query's answer. Laplace noise is the usual choice for the pure ε guarantee; Gaussian noise is usually paired with a relaxed (ε, δ) variant that allows a small additional slack δ.
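The guarantee described above is usually written as an inequality. The block below is a sketch of the standard textbook form for the pure ε case; the symbols M (the mechanism), D and D' (neighbouring datasets), S (a set of possible outputs) and f (the query) are notation introduced here for illustration and do not come from this page.

```latex
% Pure epsilon-differential privacy for a randomised mechanism M:
% for all neighbouring datasets D, D' (differing in one individual's record)
% and every set of possible outputs S,
\[
  \Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,]
\]
% Sensitivity of a numeric query f, and the Laplace mechanism calibrated to it:
\[
  \Delta f = \max_{D \sim D'} \lvert f(D) - f(D') \rvert,
  \qquad
  M(D) = f(D) + \mathrm{Lap}\!\left(\tfrac{\Delta f}{\varepsilon}\right)
\]
```

For a simple count, adding or removing one person changes the answer by at most 1, so Δf = 1 and the Laplace noise has scale 1/ε.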

Because the guarantee holds regardless of an attacker's side knowledge, it is robust against many re-identification attacks that defeat ad-hoc anonymisation.

Where it shows up

Differential privacy underpins parts of several privacy-preserving systems and is used to publish statistics. For example, the US Census Bureau applied it to protect 2020 Census data products, and it appears in some browser and platform measurement features that report aggregates. In analytics, it lets you share counts and trends while bounding individual exposure, at the cost of added noise whose relative impact is largest for small or finely sliced segments.

Epsilon (ε) tunes the privacy-accuracy trade-off
Noise is calibrated to query sensitivity
Strong against side-knowledge re-identification attacks
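The second point above, noise calibrated to sensitivity, can be sketched for a simple count. This is a minimal illustration under stated assumptions, not a production mechanism: the function names `laplace_scale` and `noisy_count` are hypothetical, the count is assumed to have sensitivity 1, and naive floating-point noise sampling like this is known to be unsafe for real deployments.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale of Laplace noise for a given sensitivity and epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Return the count plus Laplace noise (illustrative only)."""
    scale = laplace_scale(sensitivity, epsilon)
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(0)  # seeded only so the demo is repeatable
print(noisy_count(1000, epsilon=0.5, rng=rng))
```

Halving ε doubles the noise scale, which is the privacy-accuracy trade-off from the first point. A real deployment would use a vetted differential privacy library rather than hand-rolled sampling.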

How it appears in analytics and logs

Differentially private outputs are intentionally approximate; large aggregates stay accurate while tiny segments can be noise-dominated, which is the privacy guarantee working as designed.
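A short worked calculation shows why tiny segments become noise-dominated. It assumes Laplace noise on a count with sensitivity 1, where the expected absolute noise equals the scale (sensitivity divided by ε); the helper name `expected_relative_error` and the example segment sizes are hypothetical.

```python
def expected_relative_error(true_count: int, epsilon: float,
                            sensitivity: float = 1.0) -> float:
    """Expected absolute Laplace noise divided by the true count."""
    if true_count <= 0 or epsilon <= 0 or sensitivity <= 0:
        raise ValueError("true_count, epsilon and sensitivity must be positive")
    # For Laplace noise, the expected absolute error equals the scale.
    scale = sensitivity / epsilon
    return scale / true_count


for segment_size in (10, 1_000, 100_000):
    rel = expected_relative_error(segment_size, epsilon=0.5)
    print(f"count={segment_size:>7}  expected relative error ~ {rel:.4%}")
```

With ε = 0.5 the expected absolute noise is 2 in every case. That is 20% of a segment of 10 but only 0.002% of a segment of 100,000: the noise itself does not grow for small segments, its share of the total does.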

Diagnostic use case

Recognise when aggregates are differentially private, and that noise weighs proportionally more on small or sliced segments, so you read noised counts appropriately.

What WebmasterID can help detect

The intuition behind differential privacy — protect individuals while publishing aggregates — mirrors WebmasterID's aggregate-first reporting posture.

Common mistakes

Reading noised small-segment counts as exact.
Assuming any noise added equals differential privacy.
Ignoring the cumulative privacy budget across many queries.
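The third mistake can be made concrete with a small budget tracker. This sketch assumes basic sequential composition, under which the ε values of successive queries on the same data simply add up; that is a conservative upper bound, and tighter accounting methods exist. The class name `PrivacyBudget` and its methods are hypothetical.

```python
class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float) -> None:
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> float:
        """Charge one query's epsilon; return the remaining budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total:
            # Refuse the query rather than silently weakening the guarantee.
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return self.total - self.spent


budget = PrivacyBudget(total_epsilon=1.0)
for query_epsilon in (0.25, 0.25, 0.5):
    remaining = budget.spend(query_epsilon)
    print(f"spent {query_epsilon}, remaining {remaining}")
```

Once the budget is used up, any further query on the same data would have to be refused or answered from already published results. Each extra slice or repeated query otherwise erodes the overall guarantee.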

Privacy and accuracy notes

Differential privacy is a privacy-protective technique. This page is educational; it explains the mechanism, not a guarantee that any given deployment is correctly configured.


Sources and verification notes

Harvard: Differential Privacy (privacytools.seas.harvard.edu). Accessible primer on the formal definition.
NIST: Differential Privacy (SP 800-226 / blog series). Standards-body explanation of the technique.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.