Data quality

BigQuery vs UI discrepancies

When GA4's BigQuery export and the reporting interface show different totals, it is usually not a bug. The UI applies sampling, data thresholds, (other) aggregation, and behavioral/conversion modeling on top of the raw event stream; BigQuery exports the unmodeled, unsampled events. Knowing which transformations the UI adds explains most gaps.


What this means

GA4 has two surfaces: the reporting UI (and Data API), which serves aggregated, sometimes sampled and modeled results, and the BigQuery export, which lands raw events. The two are derived from the same collection but pass through different processing, so identical questions can return different numbers.

Why the numbers diverge

Several UI-only transformations explain most gaps. Sampling kicks in on large, ad-hoc explorations. Data thresholding withholds rows that could identify individuals, lowering UI totals. High-cardinality dimensions collapse into an (other) row. Behavioral and conversion modeling, plus consent-mode estimation, add modeled users and events the raw export never contains.
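As a rough illustration of the thresholding and (other) effects described above, the sketch below applies both to a toy table. The page paths, counts, and the min_count cutoff are all hypothetical; GA4's real thresholds are system-defined and are not reproduced here.

```python
# Toy per-page counts standing in for rows computed from the raw export.
# All values are hypothetical.
raw_rows = {"/home": 1200, "/pricing": 340, "/blog/a": 7, "/blog/b": 4, "/blog/c": 2}


def apply_threshold(rows, min_count):
    """Drop rows below min_count (an assumed cutoff, not GA4's real rule)."""
    return {key: value for key, value in rows.items() if value >= min_count}


def collapse_other(rows, keep):
    """Keep the `keep` largest rows and roll the remainder into "(other)"."""
    ranked = sorted(rows.items(), key=lambda item: item[1], reverse=True)
    collapsed = dict(ranked[:keep])
    remainder = sum(value for _, value in ranked[keep:])
    if remainder:
        collapsed["(other)"] = remainder
    return collapsed


raw_total = sum(raw_rows.values())
thresholded = apply_threshold(raw_rows, min_count=10)
collapsed = collapse_other(raw_rows, keep=2)

print("raw total:", raw_total)
print("after thresholding:", sum(thresholded.values()))
print("after (other) collapse:", collapsed)
```

In this toy model, thresholding removes rows and so lowers the sum of visible rows, while the (other) collapse only regroups rows and leaves the sum unchanged.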

User counts are especially prone to differ: the UI estimates cardinality with HyperLogLog++, while COUNT(DISTINCT user_pseudo_id) in BigQuery is exact, so small gaps are expected even when both sides count the same users.
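To show why an approximate distinct count rarely equals the exact one, here is a minimal sketch of plain HyperLogLog, not the HyperLogLog++ variant and not GA4's actual implementation. The hash choice, the precision p=12, and the synthetic user IDs are assumptions made for illustration.

```python
import hashlib
import math


def _hash64(value: str) -> int:
    """Map a string to a 64-bit integer (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")


def hll_estimate(values, p=12):
    """Estimate the number of distinct values with basic HyperLogLog."""
    m = 1 << p                      # number of registers
    registers = [0] * m
    for value in values:
        h = _hash64(value)
        index = h >> (64 - p)       # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        registers[index] = max(registers[index], rank)
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw


# Synthetic IDs, each appearing twice, standing in for user_pseudo_id values.
ids = [f"user-{i}" for i in range(50_000)] * 2
exact = len(set(ids))               # what COUNT(DISTINCT ...) would return
approx = hll_estimate(ids)
rel_error = abs(approx - exact) / exact
print(f"exact={exact} approx={approx:.0f} relative error={rel_error:.2%}")
```

With 4,096 registers the expected standard error is roughly 1.6%, so a gap of a percent or two between an approximate and an exact user count is not, by itself, evidence of missing data.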

Sampling: UI explorations may sample; export does not
Thresholding: UI suppresses identifying rows; export keeps them
(other): high-cardinality dimensions collapse in the UI
Modeling/consent estimation: UI adds modeled events; export does not
User counts: UI approximates (HLL++); BigQuery COUNT(DISTINCT) is exact

How it appears in analytics and logs

A BigQuery total higher than the UI usually reflects thresholded rows the UI suppressed; a UI total higher than BigQuery often reflects modeled conversions or estimated users the raw export does not contain.
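The rule of thumb above can be written as a small helper. This is only a sketch: the function name and return shape are hypothetical, and the candidate explanations are the ones named in this section, not an exhaustive diagnosis.

```python
def describe_gap(ui_total: float, bq_total: float) -> dict:
    """Classify the direction of a UI vs BigQuery gap and list candidate causes."""
    delta = bq_total - ui_total
    # Relative gap measured against the BigQuery (raw) total.
    relative = delta / bq_total if bq_total else 0.0
    if delta > 0:
        direction = "bigquery_higher"
        candidates = ["thresholded rows the UI suppressed"]
    elif delta < 0:
        direction = "ui_higher"
        candidates = ["modeled conversions", "estimated users"]
    else:
        direction = "equal"
        candidates = []
    return {
        "direction": direction,
        "delta": delta,
        "relative": relative,
        "candidates": candidates,
    }


gap = describe_gap(ui_total=950, bq_total=1000)
print(gap["direction"], gap["candidates"])
```

The output is a starting list of things to check, not a verdict; each candidate still has to be confirmed against the report's own indicators.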

Diagnostic use case

Reconcile a number that disagrees between a Looker Studio/UI report and a SQL query over the BigQuery export by attributing the delta to a documented UI-only transformation.
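One way to produce the BigQuery side of that comparison is a query over the daily export tables. The sketch below only builds the SQL text and does not run it; the project and dataset names, the date range, and the build_export_query helper are placeholders, and it assumes the standard daily events_YYYYMMDD export tables.

```python
import re


def build_export_query(table_prefix: str, start: str, end: str, event_name: str) -> str:
    """Build a query counting events and exact distinct users for one event name.

    table_prefix is a hypothetical "project.dataset" pair. Inputs are validated
    because they are interpolated into the SQL text; with a real client library,
    query parameters would be the better choice.
    """
    for day in (start, end):
        if not re.fullmatch(r"\d{8}", day):
            raise ValueError(f"expected YYYYMMDD, got {day!r}")
    if not re.fullmatch(r"[A-Za-z0-9_]+", event_name):
        raise ValueError(f"unexpected event name {event_name!r}")
    if not re.fullmatch(r"[A-Za-z0-9_\-]+\.[A-Za-z0-9_]+", table_prefix):
        raise ValueError(f"expected project.dataset, got {table_prefix!r}")
    return (
        "SELECT\n"
        "  COUNT(*) AS event_count,\n"
        "  COUNT(DISTINCT user_pseudo_id) AS exact_users\n"
        f"FROM `{table_prefix}.events_*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'\n"
        f"  AND event_name = '{event_name}'"
    )


query = build_export_query(
    "my-project.analytics_123456",  # placeholder project and dataset
    "20240101",
    "20240131",
    "purchase",
)
print(query)
```

Run the same date range and event filter in the UI report, then attribute whatever gap remains to one of the transformations listed earlier.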

What WebmasterID can help detect

WebmasterID reports from its own first-party event store, so the figures you read are the recorded events themselves; there is no separate modeled layer to reconcile against a raw export.

Common mistakes

Assuming the UI and the BigQuery export must match row for row.
Comparing approximate UI user counts to exact COUNT(DISTINCT) in SQL.
Forgetting thresholding and the (other) row when a UI row or total is lower.

Privacy and accuracy notes

The BigQuery export contains event-level rows; treat it as sensitive and govern access. This page is educational, not legal advice on data residency or retention.


Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.