Referral spam and ghost traffic
Referral spam and ghost traffic are fake hits crafted to appear in your reports. Crawler spam loads pages to leave a referrer in your logs; ghost spam sends hits straight to a measurement endpoint without ever visiting your site. Both add phantom sessions with no engagement. This page explains the mechanics and the filtering that removes them.
Crawler spam vs ghost spam
Crawler spam is a bot that actually requests your pages, leaving a chosen referrer string in your server logs and any log-based report. Ghost spam never touches your site at all: it fires hits directly at a measurement endpoint using a guessed or scraped property ID, so a fake session appears even though no page was ever loaded.
Ghost spam is the easiest kind to spot in client-side tools: the hostname recorded in the hit often does not match your domain, because the spammer never loaded your site.
- Crawler spam: real request, fake referrer in logs
- Ghost spam: hit sent to the endpoint, no page load
- Hostname mismatch is a classic ghost-spam tell
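The two tells above can be sketched as a small check. This is a minimal illustration, not a definitive implementation: the `Hit` shape, the `classify_hit` function, and the idea of comparing hit paths against paths from your own access log are all assumptions made for the example, and exact path matching will be too crude wherever a cache or CDN answers requests before they reach your logs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One analytics hit (hypothetical shape, for illustration only)."""
    hostname: str  # hostname reported inside the hit
    path: str      # page path reported inside the hit
    referrer: str  # referrer reported inside the hit


def classify_hit(hit: Hit, own_hostnames: set[str], logged_paths: set[str]) -> str:
    """Label a hit using the tells described above.

    own_hostnames: lowercase hostnames you actually serve (assumed known).
    logged_paths:  paths seen in your server access log for the same period.
    """
    if hit.hostname.lower() not in own_hostnames:
        # The spammer never loaded your domain, so the hostname is foreign.
        return "ghost-suspect: hostname mismatch"
    if hit.path not in logged_paths:
        # No matching request in the logs suggests no page was ever loaded.
        return "ghost-suspect: no page load in logs"
    # A page was requested. This can still be crawler spam with a forged
    # referrer, so the label says nothing about whether a human was involved.
    return "page-loaded"
```

A `page-loaded` result only rules out ghost spam; crawler spam makes real requests and needs referrer or behaviour checks instead.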
How to keep it out
Filter by valid hostname, so that hits claiming to be your property but reporting a foreign hostname are dropped. Exclude known spam referrers, and treat any source with zero engagement and 100% single-page visits as suspect. Server-side classification removes the whole category earlier, because the measurement endpoint is not a public target that a spammer can send forged hits to.
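As a rough sketch of the first two filters, the function below keeps a hit only when its hostname is on an allowlist and its referrer is not on a blocklist. The hostnames, the blocklist entry, and the names `VALID_HOSTNAMES`, `SPAM_REFERRER_DOMAINS`, `referrer_domain`, and `keep_hit` are placeholders chosen for this example; substitute your own lists.

```python
from urllib.parse import urlparse

# Assumption: replace with the hostnames your site is really served from.
VALID_HOSTNAMES = {"example.com", "www.example.com"}
# Assumption: a blocklist you maintain yourself; this entry is a placeholder.
SPAM_REFERRER_DOMAINS = {"spam-referrer.invalid"}


def referrer_domain(referrer: str) -> str:
    """Return the lowercased host of a referrer, or '' when there is none."""
    # urlparse only recognises a host when a scheme or a leading '//' is present.
    parsed = urlparse(referrer if "//" in referrer else "//" + referrer)
    return (parsed.hostname or "").lower()


def keep_hit(hostname: str, referrer: str) -> bool:
    """Return True when the hit passes both the hostname and referrer filters."""
    if hostname.lower() not in VALID_HOSTNAMES:
        # Claims your property but reports a foreign hostname: drop it.
        return False
    domain = referrer_domain(referrer)
    # Match the blocked domain itself and any of its subdomains.
    return not any(
        domain == bad or domain.endswith("." + bad)
        for bad in SPAM_REFERRER_DOMAINS
    )
```

An allowlist of your own hostnames tends to age better than a referrer blocklist, since new spam domains appear constantly while your hostnames rarely change.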
How it appears in analytics and logs
A referrer you do not recognise with near-100% single-page visits and no conversions is almost always spam, not a new traffic source.
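The pattern above can be turned into a simple report check. The sketch below groups sessions by referrer and flags any source whose visits are almost all single-page with no conversions. The `Session` shape, the `suspect_referrers` name, and both thresholds (`min_sessions`, `single_page_threshold`) are assumptions for illustration and should be tuned to your own traffic.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Session(NamedTuple):
    """One session row (hypothetical shape, for illustration only)."""
    referrer: str
    pageviews: int
    converted: bool


def suspect_referrers(
    sessions: Iterable[Session],
    min_sessions: int = 20,               # assumption: ignore tiny samples
    single_page_threshold: float = 0.98,  # assumption: "near-100%"
) -> list:
    """Return referrers with near-100% single-page visits and no conversions."""
    # Per referrer: [total sessions, single-page sessions, conversions]
    totals = defaultdict(lambda: [0, 0, 0])
    for s in sessions:
        row = totals[s.referrer]
        row[0] += 1
        row[1] += 1 if s.pageviews <= 1 else 0
        row[2] += 1 if s.converted else 0
    return sorted(
        ref
        for ref, (n, single, conversions) in totals.items()
        if n >= min_sessions
        and conversions == 0
        and single / n >= single_page_threshold
    )
```

A flagged referrer is a candidate for review rather than proof of spam; a genuine but poorly matched source can show the same numbers.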
Diagnostic use case
Recognise and exclude referral and ghost spam so referrer and acquisition reports reflect real traffic rather than injected noise.
What WebmasterID can help detect
WebmasterID classifies traffic server-side at ingest, so injected and crawler-driven hits are separated from human analytics rather than appearing as referrers.
Common mistakes
- Treating an unknown high-bounce referrer as a real channel.
- Filtering spam after the fact instead of validating the hostname up front.
- Chasing a 'campaign' that is actually injected referrer text.
Privacy and accuracy notes
Spam filtering matches request and referrer patterns, not visitor identity. No personal data is required to exclude it.
Related pages
- Bot traffic in analytics: filtering it out
Bots — crawlers, scrapers, monitors, scanners — generate requests that, unfiltered, inflate pageviews and distort every metric. Client-side analytics often misses bots (many do not run JavaScript) or miscounts the ones that do. Server-side classification at ingest is the reliable way to keep bot traffic out of human reports.
- Self-referrals and lost attribution
A self-referral is when your own site shows up as a referring source in your reports. It usually means a session was broken and a new one started attributed to your domain, often when a visitor crosses subdomains or returns from a payment provider. Self-referrals fragment sessions and steal credit from the real source. This page explains the causes and the fix.
- Language spam and keyword spam
Language spam and keyword spam place messages — promotions, slogans, even instructions — into fields like browser language or a site-search term. The values are forged, sent by bots or crafted hits to be read by whoever opens the report. They are not real visitor attributes. This page explains how the injection works and how to filter and recognise it.
- Bot intelligence
Injected and crawler hits separated from human data.
Sources and verification notes
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.