BigQuery user_id vs pseudo_id
In the GA4 BigQuery export, user_pseudo_id is the device/instance identifier and user_id is the optional ID you set for logged-in users. They count different things: user_pseudo_id changes whenever client storage is cleared, while user_id can unify a person across devices. Treating them interchangeably miscounts users. This page explains the two identifiers and how each affects user counts in the export.
Two identifiers, two meanings
user_pseudo_id is assigned per device/instance and lives in client storage, so clearing cookies, switching to a new browser, or cookie expiry under tracking prevention (such as Safari's ITP) produces a new one for the same person. user_id is a value you assign, usually after login, and it can persist across devices, letting the same person be recognized everywhere they sign in. Rows in the export generally carry a user_pseudo_id, but only rows recorded while a user_id was set carry a user_id.
Counting distinct pseudo_ids answers 'how many device instances'; counting distinct user_ids answers 'how many identified people'.
- user_pseudo_id: per-device, resets on storage clear
- user_id: optional, set for identified users, cross-device
- Not every row has a user_id
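The difference between the two counts can be sketched on a handful of made-up rows. This is a minimal illustration, not the export itself: an in-memory SQLite table named `events` stands in for the BigQuery export, it holds only the two identifier columns, and the `dev-*` and `u-*` values are hypothetical. The sample models two real people: one uses two devices and logs in, the other never logs in and clears cookies once.

```python
import sqlite3

# Hypothetical sample rows: (user_pseudo_id, user_id). None means no user_id was set.
rows = [
    ("dev-A", None),   # person 1, phone, before login
    ("dev-A", "u-1"),  # person 1, phone, after login
    ("dev-B", "u-1"),  # person 1, laptop, logged in
    ("dev-C", None),   # person 2, never logs in
    ("dev-D", None),   # person 2 again, after clearing cookies
]

# SQLite stands in for BigQuery here; the table name "events" is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_pseudo_id TEXT, user_id TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

device_instances, identified_people = conn.execute(
    """
    SELECT COUNT(DISTINCT user_pseudo_id),  -- device/browser instances
           COUNT(DISTINCT user_id)          -- NULL user_id values are ignored
    FROM events
    """
).fetchone()

print(device_instances, identified_people)  # prints: 4 1
```

Neither number equals the two real people in the sample: the user_pseudo_id count overstates them (4 instances), and the user_id count sees only the one who logged in.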
Counting without conflating
Pick one identifier for a given user metric and state which one you used. A union or fallback such as COALESCE(user_id, user_pseudo_id) mixes identified and pseudonymous populations and inflates distinct counts, because one person can contribute both a user_id (after login) and one or more user_pseudo_ids (before login). For cross-device analysis use user_id where present; for raw reach use user_pseudo_id and accept its resets. When reconciling against the GA4 UI, keep in mind that the UI applies its own identity space, so its totals will not necessarily match either count.
This is an identity-definition issue, separate from late data or schema drift.
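The inflation from a COALESCE fallback can be shown on a tiny example. As a sketch under stated assumptions: an in-memory SQLite table named `events` stands in for the export, and the identifier values are hypothetical. Two real people appear: one browses before logging in on the same device, the other is logged in throughout.

```python
import sqlite3

# Hypothetical sample rows: (user_pseudo_id, user_id).
rows = [
    ("dev-A", None),   # person 1 browsing before login
    ("dev-A", "u-1"),  # person 1 after login, same device
    ("dev-B", "u-2"),  # person 2, logged in throughout
]

# SQLite stands in for BigQuery here; the table name "events" is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_pseudo_id TEXT, user_id TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

by_pseudo, by_user, blended = conn.execute(
    """
    SELECT COUNT(DISTINCT user_pseudo_id),
           COUNT(DISTINCT user_id),
           COUNT(DISTINCT COALESCE(user_id, user_pseudo_id))
    FROM events
    """
).fetchone()

print(by_pseudo, by_user, blended)  # prints: 2 2 3
```

Each identifier on its own yields 2 for this sample, while the blended count yields 3, because person 1 is counted once as `dev-A` (pre-login rows) and once as `u-1` (post-login rows).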
How it appears in analytics and logs
User counts that swing when storage is cleared reflect user_pseudo_id resets; counts that unify across devices come from a set user_id.
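One way to see this pattern in the data is to count how many distinct user_pseudo_ids sit behind each user_id. The following is a rough diagnostic sketch, again using an in-memory SQLite table named `events` as a stand-in for the export, with hypothetical identifier values. It cannot tell a second device apart from a storage reset; it only shows that several instances map to one identified user.

```python
import sqlite3

# Hypothetical sample rows: (user_pseudo_id, user_id).
rows = [
    ("dev-A", "u-1"),  # u-1 on a phone
    ("dev-B", "u-1"),  # u-1 on a laptop
    ("dev-C", "u-1"),  # u-1 on the laptop again, after clearing cookies
    ("dev-D", "u-2"),  # u-2 on a single device
    ("dev-E", None),   # anonymous visitor, excluded below
]

# SQLite stands in for BigQuery here; the table name "events" is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_pseudo_id TEXT, user_id TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

per_user = conn.execute(
    """
    SELECT user_id, COUNT(DISTINCT user_pseudo_id) AS instances
    FROM events
    WHERE user_id IS NOT NULL
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

print(per_user)  # prints: [('u-1', 3), ('u-2', 1)]
```

A user_id with many instances is one person who would be counted several times in a user_pseudo_id-based metric.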
Diagnostic use case
Count users from the GA4 export correctly by choosing user_id or user_pseudo_id deliberately and not blending them.
What WebmasterID can help detect
WebmasterID's privacy-first model favors aggregate counts that do not depend on stitching identifiers across devices.
Common mistakes
- COALESCE-ing user_id and user_pseudo_id into one user count.
- Assuming user_pseudo_id is a stable person identifier.
- Comparing export user counts to the UI without aligning identity space.
Privacy and accuracy notes
user_id is typically a logged-in identifier with stronger privacy obligations than the pseudonymous id; handle each under its own rules. Educational, not legal advice.
Related pages
- New vs returning misclassification
New-vs-returning depends on recognising the same visitor across visits, which relies on a stored identifier. When that identifier is missing — cleared cookies, tracking prevention, a different device or browser, or declined consent — a returning visitor is recorded as new. The result over-states 'new' visitors and understates loyalty. This page explains the failure modes.
- BigQuery vs UI discrepancies
When GA4's BigQuery export and the reporting interface show different totals, it is usually not a bug. The UI applies sampling, data thresholds, (other) aggregation, and behavioral/conversion modeling on top of the raw event stream; BigQuery exports the unmodeled, unsampled events. Knowing which transformations the UI adds explains most gaps.
- User deletion and report effects
Honouring deletion requests and data-retention limits removes user-level data from analytics. Aggregate reports built on standard processing are largely unaffected, but user-scoped explorations, audiences, and the raw export can shrink as records are removed. Understanding what deletion touches prevents misreading a privacy action as a data fault. This page explains deletion's report effects. Educational, not legal advice.
- Privacy-first analytics
Count reach without cross-device identity stitching.
Sources and verification notes
- Google — [GA4] BigQuery Export schema (user fields)
- Google — [GA4] User-ID for cross-platform analysis
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.