Source dimension: the origin half of traffic attribution
Source is the dimension that names where a visit came from: a search engine (google), a referring domain, a named newsletter. It is the origin half of source/medium. Tools set source from the utm_source parameter or, lacking that, from the hostname of the referrer. When neither exists the source becomes '(direct)'. Source is high-cardinality, which has practical reporting consequences.
What this means
Source is the specific origin of a visit — the host name of the referrer (e.g. 'bing.com') or the value of utm_source (e.g. 'spring_newsletter'). It pairs with medium to form the canonical source/medium dimension; alone it answers 'who sent them', not 'what kind of channel'.
The reserved value '(direct)' covers visits with no referrer and no tag.
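The precedence described above (tag first, then referrer hostname, then '(direct)') can be sketched as a small function. This is a minimal illustration, not how any particular tool implements it: the name resolve_source and its two arguments are hypothetical, and real products may normalize hostnames or apply extra rules not covered here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

DIRECT = "(direct)"  # reserved value for visits with no referrer and no tag


def resolve_source(landing_url: str, referrer: Optional[str]) -> str:
    """Hypothetical helper: pick the source for a single visit."""
    # 1. A non-empty utm_source on the landing URL wins over everything else.
    query = parse_qs(urlsplit(landing_url).query)
    tagged = query.get("utm_source", [""])[0].strip()
    if tagged:
        return tagged
    # 2. Otherwise fall back to the hostname of the referrer, if there is one.
    host = urlsplit(referrer or "").hostname
    if host:
        return host
    # 3. No tag and no referrer: the visit is reported as '(direct)'.
    return DIRECT


tagged_visit = resolve_source(
    "https://example.com/?utm_source=spring_newsletter", "https://bing.com/"
)
referred_visit = resolve_source("https://example.com/", "https://bing.com/search?q=x")
direct_visit = resolve_source("https://example.com/", None)
```

Here tagged_visit resolves to the tag even though a referrer is present, which is the override behaviour the text describes.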
Cardinality and segmentation caveats
Source is one of the higher-cardinality dimensions: every referring host and every utm_source string is a distinct value. High cardinality can push long-tail values into an '(other)' bucket in some reports and slow ad-hoc exploration. Aggregating source up into channels keeps reports readable.
Referral spam inflates source cardinality with fake hostnames, so unexplained new sources deserve scrutiny before they are treated as real audiences.
- utm_source overrides the referrer hostname
- No referrer + no tag => '(direct)'
- High cardinality can trigger an '(other)' rollup
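To make the '(other)' rollup concrete, the sketch below keeps the largest sources and folds the long tail into a single bucket. The function name rollup_sources and the max_rows limit are assumptions chosen for illustration; actual reporting tools apply their own limits and rules, which are not specified here.

```python
from collections import Counter

OTHER = "(other)"


def rollup_sources(counts: dict[str, int], max_rows: int) -> dict[str, int]:
    """Hypothetical helper: keep the max_rows largest sources, fold the rest into '(other)'."""
    ranked = Counter(counts).most_common()  # sorted by session count, largest first
    kept = dict(ranked[:max_rows])
    tail = sum(n for _, n in ranked[max_rows:])
    if tail:
        kept[OTHER] = kept.get(OTHER, 0) + tail
    return kept


# Illustrative counts only, not real data.
session_counts = {"google": 50, "bing.com": 20, "a.example": 2, "b.example": 1}
rolled_up = rollup_sources(session_counts, max_rows=2)
```

With max_rows=2, the two small sources are no longer individually visible in rolled_up, which is why a high-cardinality source report can hide long-tail origins.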
How it appears in analytics and logs
A source value names the origin host or campaign source. A surge in '(direct)' source usually signals stripped referrers or untagged links, not loyalty; a flood of new sources can indicate referral spam.
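Both signals can be checked with very little code. The sketch below compares the current period against a baseline set of known sources and computes the share of '(direct)' sessions. The function names and the sample counts are hypothetical; what counts as a "surge" or a "flood" is a judgment call this example does not settle.

```python
DIRECT = "(direct)"


def new_sources(baseline: set[str], current: dict[str, int]) -> list[str]:
    """Sources present in the current period but absent from the baseline, largest first."""
    unseen = [s for s in current if s not in baseline and s != DIRECT]
    return sorted(unseen, key=lambda s: (-current[s], s))


def direct_share(counts: dict[str, int]) -> float:
    """Fraction of sessions whose source is '(direct)'; 0.0 when there are no sessions."""
    total = sum(counts.values())
    return counts.get(DIRECT, 0) / total if total else 0.0


# Illustrative values only, not real data.
known = {"google", "bing.com"}
this_week = {
    "google": 40,
    "(direct)": 30,
    "free-traffic.example": 25,
    "seo-offer.example": 5,
}
candidates = new_sources(known, this_week)
share = direct_share(this_week)
```

The candidates list is only a starting point for review: a new source may be a genuine referrer or referral spam, and the code cannot tell the two apart.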
Diagnostic use case
Use source to attribute traffic to specific origins, while pairing it with medium so 'google / organic' and 'google / cpc' are not collapsed together.
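The effect of dropping medium is easy to see on a handful of sessions. The following sketch counts the same sessions twice, once by source alone and once by the source/medium pair; the session tuples are made-up sample data and the function names are hypothetical.

```python
from collections import Counter

# Each session is a (source, medium) pair; values are illustrative only.
sessions = [
    ("google", "organic"),
    ("google", "cpc"),
    ("google", "organic"),
    ("partner.example", "referral"),
]


def count_by_source(rows: list[tuple[str, str]]) -> Counter:
    """Group by source only: paid and organic Google collapse into one row."""
    return Counter(source for source, _ in rows)


def count_by_source_medium(rows: list[tuple[str, str]]) -> Counter:
    """Group by the 'source / medium' pair so paid and organic stay separate."""
    return Counter(f"{source} / {medium}" for source, medium in rows)


by_source = count_by_source(sessions)
by_pair = count_by_source_medium(sessions)
```

In by_source, 'google' is a single row covering both paid and organic visits, while by_pair keeps 'google / organic' and 'google / cpc' apart.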
What WebmasterID can help detect
WebmasterID records the referring origin and the utm_source value as first-party data at ingest, so legitimate sources are attributed without cross-site identifiers.
Common mistakes
- Reading source without medium and merging paid and organic Google.
- Treating referral-spam hostnames as genuine sources.
- Ignoring the '(other)' rollup when source cardinality is high.
Privacy and accuracy notes
Source derives from the referrer hostname or a campaign tag, not from a person. WebmasterID reads it first-party and does not fingerprint to reconstruct missing sources.
Related pages
- Medium dimension: the channel-type half of origin
Medium is the dimension that records the general category of how a visit arrived: organic, cpc, referral, email, affiliate, and so on. It is the channel-type half of source/medium. In GA4 and earlier tools it is set by the utm_medium parameter or inferred from the referrer, and it feeds channel grouping. The distinction between an empty medium, '(none)', and '(not set)' trips up many reports.
- First user source dimension
The first user source dimension records the origin of a user's very first session — their acquisition source — and keeps it fixed for the user's lifetime. GA4 sets it from the referrer or campaign on the first visit. It is user-scoped, so it answers 'how did we acquire this person?' rather than 'where did this visit come from?', and confusing it with session source distorts attribution.
- Direct traffic: what it really means
Direct traffic is the bucket analytics uses when no referrer is available. It includes genuine type-ins and bookmarks, but also a large share of visits whose referrer was stripped — app opens, HTTPS-to-HTTP transitions, shorteners, and privacy settings. Treating 'direct' as a single intent is the classic analytics mistake.
- Attribution analytics
Attribute traffic to real origins across AI, search, and referral.
Sources and verification notes
- Google Analytics Help: [GA4] Dimensions and metrics. Defines source and the reserved '(direct)' value.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.