Bot traffic in analytics: filtering it out
Bots (crawlers, scrapers, monitors, scanners) generate requests that, left unfiltered, inflate pageviews and skew rate-based metrics such as engagement and conversion. Client-side analytics misses many bots, because they never run JavaScript, and can miscount the ones that do as human visitors. Classifying requests server-side at ingest is the more reliable way to keep bot traffic out of human reports.
What this means
A large share of web requests are automated. Left in the data, they inflate pageviews, depress engagement and conversion rates, and create phantom referrers and spikes. Clean analytics depends on separating bots from people.
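To make the distortion concrete, here is a small worked example. The figures are invented purely to show the arithmetic and are not measurements from any real site; the assumption that only humans convert is also a simplification.

```python
# Hypothetical figures, chosen only to illustrate the arithmetic.
human_sessions = 8_000
bot_sessions = 2_000   # bot visits that were counted as sessions
conversions = 240      # simplifying assumption: only humans convert

# Rate over human sessions only vs. rate over everything that was counted.
true_rate = conversions / human_sessions
reported_rate = conversions / (human_sessions + bot_sessions)

print(f"conversion rate, humans only:   {true_rate:.1%}")      # 3.0%
print(f"conversion rate, bots included: {reported_rate:.1%}")  # 2.4%
```

The number of conversions is the same in both cases. Only the denominator changes, which is why bot traffic lowers rates while raising raw counts.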
Why where you filter matters
Client-side analytics relies on a script running in the browser. Many bots never run that script, so they are invisible to it; the ones that do run it can be miscounted as humans. Server-side classification sees every request that reaches the server and can categorise it (search bot, AI crawler, automation, human) before it reaches a human report.
- Many bots never run client-side JS — invisible there
- JS-running bots can be miscounted as human
- Server-side classification sees and sorts every request
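The sorting step described above can be sketched as a small function that runs at ingest. This is a minimal illustration, not WebmasterID's implementation: the function name `classify`, the category labels, the short token lists, and the choice to treat an empty user agent as automation are all assumptions made for the example. A user-agent string can be spoofed, so a match here is a first-pass label rather than proof of identity.

```python
import re
from typing import Optional

# Illustrative token lists only. A real deployment needs maintained lists
# and, for well-known crawlers, verification beyond the user-agent string.
SEARCH_BOT = re.compile(r"googlebot|bingbot|duckduckbot", re.IGNORECASE)
AI_CRAWLER = re.compile(r"gptbot|claudebot|ccbot", re.IGNORECASE)
AUTOMATION = re.compile(r"curl|wget|python-requests|headlesschrome", re.IGNORECASE)


def classify(user_agent: Optional[str]) -> str:
    """Return a coarse category for one request, based on its user agent."""
    ua = (user_agent or "").strip()
    if not ua:
        # Assumption for this sketch: browsers always send a user agent,
        # so a missing one is treated as automation.
        return "automation"
    if SEARCH_BOT.search(ua):
        return "search_bot"
    if AI_CRAWLER.search(ua):
        return "ai_crawler"
    if AUTOMATION.search(ua):
        return "automation"
    # Anything unmatched is provisionally treated as human.
    return "human"


print(classify("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(classify("curl/8.4.0"))
```

Because the function runs on the server, it sees requests from clients that never execute JavaScript, which a browser-side script cannot.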
How it appears in analytics and logs
An unexplained traffic spike with no engagement is often bots. Whether your reports show it depends on where and how bot filtering happens.
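One way to look for this pattern is to flag time buckets where pageviews jump well above a baseline while engagement stays near zero. The sketch below assumes hourly buckets of `(label, pageviews, engaged_events)`; the function name `suspicious_hours`, the thresholds, and the sample data are hypothetical, and the right values depend on the site.

```python
from statistics import median


def suspicious_hours(hours, spike_factor=3.0, max_engagement=0.05):
    """Flag hours with a pageview spike and almost no engagement.

    hours: list of (label, pageviews, engaged_events) tuples.
    """
    # The median is used as the baseline so that the spike itself
    # does not drag the baseline upward.
    baseline = median(pageviews for _, pageviews, _ in hours)
    flagged = []
    for label, pageviews, engaged in hours:
        if pageviews == 0:
            continue  # nothing to evaluate, and avoids division by zero
        engagement_rate = engaged / pageviews
        if pageviews > spike_factor * baseline and engagement_rate < max_engagement:
            flagged.append(label)
    return flagged


# Invented sample data for illustration.
sample = [
    ("09:00", 120, 40),
    ("10:00", 130, 45),
    ("11:00", 900, 6),
    ("12:00", 125, 41),
]
print(suspicious_hours(sample))  # ['11:00']
```

A flagged hour is a prompt to inspect the underlying requests, not a verdict: a real campaign can also produce a spike, though usually with engagement to match.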
Diagnostic use case
Separate bot traffic from human analytics so metrics reflect people, and investigate bot activity on its own surface rather than as noise.
What WebmasterID can help detect
WebmasterID classifies traffic server-side at ingest, so bot pageviews are kept out of human analytics by default and remain visible separately on the bot-intelligence surface.
Common mistakes
- Assuming client-side analytics already excludes all bots.
- Reading a no-engagement spike as a real audience.
- Deleting bot data instead of separating it for analysis.
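The last point, separating rather than deleting, can be sketched as a routing step. This example assumes each event already carries a `category` field set by some upstream classifier; the field name, the category labels, and the function `route_events` are hypothetical.

```python
def route_events(events):
    """Split events into human and bot stores without discarding any."""
    stores = {"human": [], "bot": []}
    for event in events:
        # Anything not labelled "human" goes to the bot store,
        # where it stays available for investigation.
        key = "human" if event.get("category") == "human" else "bot"
        stores[key].append(event)
    return stores


# Invented sample events for illustration.
events = [
    {"path": "/pricing", "category": "human"},
    {"path": "/pricing", "category": "search_bot"},
    {"path": "/wp-login.php", "category": "automation"},
    {"path": "/", "category": "human"},
]
stores = route_events(events)
print(len(stores["human"]), len(stores["bot"]))  # 2 2
```

Human reports read only from the first store, while the second keeps the evidence needed to answer questions such as which crawlers visited and how often.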
Privacy and accuracy notes
Bot classification reads the request's user agent and other server-side signals, not visitor identity. WebmasterID classifies at ingest and never fingerprints humans to tell them from bots.
Related pages
- Analytics sampling: when reports estimate
Sampling is when an analytics tool computes a report from a fraction of the data and extrapolates. It keeps big queries fast, but it adds estimation error — worst for small segments and rare events, where a few sampled sessions get scaled into a confident-looking number. Knowing when a report is sampled is the first defence.
- Pageviews: what the metric counts
A pageview is recorded when a page is loaded (or a virtual page is rendered in a single-page app). It is the oldest web-analytics metric and the easiest to misread: pageviews count loads, not people, and modern apps and prefetching can inflate or hide them. This page defines the metric and its caveats.
- Diagnosing a bot traffic spike
A sudden spike in traffic is often bots, not audience. The diagnostic question is which bots: a verified crawler doing a fresh crawl wave, or spoofers and scrapers impersonating known crawlers. Separating verified crawlers from impostors by user-agent token and verification keeps your human analytics honest.
- Bot intelligence
Bots categorised and kept out of human analytics.
Sources and verification notes
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.