How to separate bot traffic from human visitors in analytics: what each signal means, why the line is fuzzy, how spoofing works, and how to read the difference in a privacy-safe way.
Part of the Web Crawler & Traffic Intelligence Encyclopedia.
Every request to your site arrives with a declared identity, not a proven one. Browsers, crawlers, monitoring tools, and scrapers all speak the same protocol. Separating bot from human is therefore a question of evidence and confidence, not a binary you can read off a single header.
The useful framing is a spectrum: known-and-verified bots (a crawler that names itself and passes verification), declared bots (a client that names itself but cannot be verified), likely automation (a client that behaves like a machine), and humans. WebmasterID categorises deterministically against a maintained signature list and leaves anything uncertain in an honest “other” bucket.
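The spectrum above can be sketched as a small decision function. This is an illustration only: the names `Confidence` and `place_on_spectrum` and the three boolean inputs are assumptions made for the example, not WebmasterID's actual data model.

```python
from enum import Enum


class Confidence(Enum):
    """The four points on the spectrum described in the text."""
    VERIFIED_BOT = "known-and-verified bot"
    DECLARED_BOT = "declared bot"
    LIKELY_AUTOMATION = "likely automation"
    HUMAN = "human"


def place_on_spectrum(declares_bot: bool, verified: bool, machine_like: bool) -> Confidence:
    """Map three pieces of evidence (hypothetical inputs) onto the spectrum."""
    if declares_bot and verified:
        return Confidence.VERIFIED_BOT
    if declares_bot:
        # Names itself, but the claim could not be checked.
        return Confidence.DECLARED_BOT
    if machine_like:
        return Confidence.LIKELY_AUTOMATION
    # No evidence of automation; this is absence of evidence, not proof.
    return Confidence.HUMAN
```

The order of the checks matters: a declared identity is weighed before behaviour, so a verified crawler is never downgraded just because it also behaves like a machine.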
The user-agent string is the most common signal and the weakest on its own. Declared bots name themselves; malicious clients spoof a real browser's string. It is useful for categorisation, never sufficient for trust.
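A minimal sketch of user-agent categorisation against a signature list, assuming simple case-insensitive substring matching. The `SIGNATURES` table, the category names, and `categorise_user_agent` are hypothetical; a maintained list would be far longer.

```python
# Hypothetical signature list: lowercase token -> category.
SIGNATURES = {
    "googlebot": "search_crawler",
    "bingbot": "search_crawler",
    "gptbot": "ai_crawler",
    "claudebot": "ai_crawler",
    "uptimerobot": "monitoring",
}


def categorise_user_agent(user_agent: str) -> str:
    """Return a category for a declared bot, or "other" when nothing matches."""
    ua = user_agent.lower()
    for token, category in SIGNATURES.items():
        if token in ua:
            return category
    # No declared bot signature. A browser-like string lands here too,
    # because it proves nothing about who sent it.
    return "other"
```

The result is a label for reporting, not a trust decision: a request whose string contains `Googlebot` is categorised as a search crawler here whether or not the claim is true.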
Major crawlers publish a way to verify them (reverse DNS, published IP ranges). A user agent that claims to be Googlebot but fails verification is the interesting case: the declared identity is contradicted by the evidence.
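Reverse DNS verification is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check that the hostname belongs to the crawler's domain, then resolve the hostname back and confirm it returns the same IP. The sketch below assumes the Googlebot hostname suffixes `.googlebot.com` and `.google.com`; check the crawler's current documentation before relying on them. The function names are hypothetical, and the resolvers are passed in as parameters so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List, Tuple

# Assumed suffixes for Googlebot hostnames; confirm against current docs.
GOOGLEBOT_SUFFIXES: Tuple[str, ...] = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """A-record lookup: hostname -> IPv4 addresses (this call is IPv4 only)."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    suffixes: Tuple[str, ...] = GOOGLEBOT_SUFFIXES,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(suffixes):
            return False
        # The hostname must resolve back to the IP we started from.
        return ip in forward(host)
    except OSError:
        # Covers socket.herror and socket.gaierror: treat as unverified.
        return False
```

A lookup failure returns `False` rather than raising, which keeps the request in the "declared, unverified" part of the spectrum instead of rejecting it outright.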
Humans load assets, scroll, and dwell; many bots fetch only the HTML, at machine cadence. These patterns are directional, not proof: never label a fast visitor a bot on cadence alone.
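As a rough illustration of treating behaviour as a hint rather than a verdict, the sketch below summarises one session's request timing and asset loading. The thresholds, the asset suffix list, and the names `SessionStats`, `summarise_session`, and `looks_automated` are all assumptions chosen for the example; paths with query strings are not handled.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List

ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".svg", ".woff2")


@dataclass
class SessionStats:
    html_only: bool           # no asset requests were seen
    mean_gap_seconds: float   # average time between requests
    gap_stdev_seconds: float  # how regular the timing is


def summarise_session(timestamps: List[float], paths: List[str]) -> SessionStats:
    """Reduce one session's requests to a few behavioural hints."""
    html_only = not any(p.lower().endswith(ASSET_SUFFIXES) for p in paths)
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return SessionStats(html_only, 0.0, 0.0)
    return SessionStats(html_only, mean(gaps), pstdev(gaps))


def looks_automated(stats: SessionStats, max_gap: float = 1.0, max_stdev: float = 0.1) -> bool:
    """Directional hint only: needs HTML-only fetching AND fast, regular timing."""
    fast_and_regular = (
        0 < stats.mean_gap_seconds <= max_gap
        and stats.gap_stdev_seconds <= max_stdev
    )
    # Cadence alone never flags a session; a fast human still loads assets.
    return stats.html_only and fast_and_regular
```

Requiring both hints together is the design choice that matters: a fast visitor who loads stylesheets and images is not flagged, and a single-request session has no cadence to judge.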
Supporting signals include whether JavaScript executed, whether referrers were set consistently across requests, and which endpoints were hit. Use privacy-safe signals only, with no fingerprinting.
The ingest layer classifies each request server-side into a category and keeps bot traffic out of your human analytics by default. AI crawlers and search crawlers each get their own observable surface. Nothing is fingerprinted; raw IPs are anonymised at the edge; user-agent strings from real visitors are not exposed. See Bot intelligence and AI visibility analytics.
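The text does not say how IPs are anonymised. One common approach, shown here purely as an assumption and not as WebmasterID's implementation, is to truncate the address to a network prefix before it is stored. The function name `anonymise_ip` and the default prefix lengths (/24 for IPv4, /48 for IPv6) are hypothetical choices.

```python
import ipaddress


def anonymise_ip(raw_ip: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero the host bits of an address so it no longer identifies one client."""
    addr = ipaddress.ip_address(raw_ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us build the network from an address with host bits set.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)
```

For example, `anonymise_ip("203.0.113.77")` returns `"203.0.113.0"`. Any IP-based crawler verification has to run before this step, since the truncated address can no longer be matched against a published range or a DNS record.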