Bot vs human traffic: how to tell them apart

How to separate bot traffic from human visitors in analytics: what each signal means, why the line is fuzzy, how spoofing works, and how to read the difference in a privacy-safe way.

Part of the Web Crawler & Traffic Intelligence Encyclopedia.

Why the line is fuzzy

Every request to your site arrives with a declared identity, not a proven one. Browsers, crawlers, monitoring tools, and scrapers all speak the same protocol. Separating bot from human is therefore a question of evidence and confidence, not a binary you can read off a single header.

The useful framing is a spectrum: known-and-verified bots (crawlers that name themselves and pass verification), declared bots (clients that name themselves but cannot be verified), likely automation (traffic that behaves like a machine), and humans. WebmasterID categorises deterministically against a maintained signature list and leaves anything it cannot place in an honest “other” bucket.
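As a rough illustration of deterministic categorisation against a signature list, the sketch below matches a user-agent string against a few patterns and falls back to “other” when nothing matches. The signature entries, the function name `classify_user_agent`, and the `verified` flag are assumptions made for this example; they are not WebmasterID's actual list or API.

```python
import re

# Hypothetical signature list: (compiled pattern, label). A real list would be
# far longer and maintained over time.
SIGNATURES = [
    (re.compile(r"Googlebot", re.IGNORECASE), "search crawler"),
    (re.compile(r"bingbot", re.IGNORECASE), "search crawler"),
    (re.compile(r"GPTBot", re.IGNORECASE), "AI crawler"),
]


def classify_user_agent(user_agent: str, verified: bool = False) -> tuple[str, str]:
    """Return (category, label) for a user-agent string.

    `verified` is supplied by the caller after a separate verification step;
    this function only performs the signature match.
    """
    for pattern, label in SIGNATURES:
        if pattern.search(user_agent):
            category = "known-and-verified bot" if verified else "declared bot"
            return category, label
    # No signature matched: do not guess, keep the request in "other".
    return "other", "unclassified"
```

The same input always yields the same category, and an unmatched string is never promoted to “human” on the user agent alone; that decision needs the other signals.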

Signals you can read

User-agent string

The most common signal — and the weakest on its own. Declared bots name themselves; malicious clients spoof a real browser. Useful for categorisation, never sufficient for trust.

Declared vs verified identity

Major crawlers publish a way to verify them (reverse DNS, published IP ranges). A request whose user agent claims to be Googlebot but fails verification is the case worth flagging: the declared identity is not backed by the network evidence.
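Reverse-DNS verification is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check that the hostname belongs to the crawler's domain, then resolve that hostname back and confirm it returns the same IP. The sketch below assumes the `.googlebot.com` and `.google.com` suffixes for Googlebot; check the crawler operator's current documentation before relying on them. The function name and the injectable `reverse` and `forward` parameters are hypothetical, added so the logic can be exercised without network access.

```python
import ipaddress
import socket

# Assumption: hostname suffixes used by Google's crawlers. Confirm against the
# operator's published guidance before using in production.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> set[str]:
    """A/AAAA lookup: hostname to the set of addresses it resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def verify_crawler_ip(ip, suffixes=GOOGLE_SUFFIXES, reverse=_reverse, forward=_forward) -> bool:
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(suffixes):
            return False  # PTR record is outside the crawler's domain
        target = ipaddress.ip_address(ip)
        # The PTR record alone can be forged; the forward lookup must agree.
        return any(ipaddress.ip_address(a) == target for a in forward(host))
    except (OSError, ValueError):
        return False  # lookup failed or address was malformed: not verified
```

A failed lookup is treated as "not verified" rather than "malicious"; it only means the declared identity could not be confirmed.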

Behavioural shape

Humans load assets, scroll, and dwell; many bots fetch HTML only, at machine cadence. This is directional, not proof — never label a fast visitor a bot on cadence alone.

Request context

Whether JavaScript executed, whether referrers were set consistently across requests, and which endpoints were hit. Privacy-safe signals only, with no fingerprinting.
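One way to combine these signals is sketched below. It treats the behavioural and context signals as directional evidence and requires at least two machine-like signals before using the “likely automation” category, so cadence alone never decides the outcome. The `RequestEvidence` fields, the `categorise` function, and the 60 requests-per-minute threshold are all assumptions chosen for illustration, not values taken from WebmasterID.

```python
from dataclasses import dataclass


@dataclass
class RequestEvidence:
    declared_bot: bool          # user agent matched a bot signature
    verified: bool              # passed the crawler's published verification
    executed_js: bool           # JavaScript ran on the page
    loaded_assets: bool         # images, CSS, or scripts were fetched
    requests_per_minute: float  # observed cadence for this visitor


# Hypothetical threshold: what counts as "machine cadence" depends on the site.
FAST_CADENCE = 60.0


def categorise(e: RequestEvidence) -> str:
    """Map observable evidence to one category on the bot/human spectrum."""
    if e.declared_bot:
        return "known-and-verified bot" if e.verified else "declared bot"

    machine_signals = [
        not e.executed_js,
        not e.loaded_assets,
        e.requests_per_minute > FAST_CADENCE,
    ]
    # Require two independent signals so a fast human is never labelled a bot
    # on cadence alone.
    if sum(machine_signals) >= 2:
        return "likely automation"
    if e.executed_js and e.loaded_assets:
        return "human"
    return "other"  # evidence is mixed: leave it uncategorised
```

For example, a visitor who executes JavaScript and loads assets at 120 requests per minute is still categorised as “human”, because only one of the three machine signals is present.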

What WebmasterID does

The ingest layer classifies each request server-side into a category and keeps bot traffic out of your human analytics by default. AI crawlers and search crawlers each get their own observable surface. Nothing is fingerprinted; raw IPs are anonymised at the edge; user-agent strings from real visitors are not exposed. See Bot intelligence and AI visibility analytics.
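The source does not say how addresses are anonymised. One common approach, shown here purely as an assumption, is to truncate the address to its network prefix before it is stored. The function name `anonymise_ip` and the /24 (IPv4) and /48 (IPv6) prefix lengths are illustrative choices, not WebmasterID's documented behaviour.

```python
import ipaddress


def anonymise_ip(ip: str) -> str:
    """Zero the host portion of an address so it no longer identifies one client."""
    addr = ipaddress.ip_address(ip)
    # Assumed prefix lengths: keep /24 for IPv4 and /48 for IPv6.
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)
```

With these prefixes, `anonymise_ip("203.0.113.77")` returns `"203.0.113.0"` and `anonymise_ip("2001:db8:abcd:1234::1")` returns `"2001:db8:abcd::"`.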

Frequently asked questions

Can you ever be 100% sure a visit is a bot or a human?

No. Identity on the web is declared, not guaranteed. The honest posture is a category with a confidence level: known-and-verified bot, declared bot, likely automation, or human. WebmasterID records what is observable server-side and never presents a probabilistic guess as a fact.

Why do bots pretend to be browsers?

Scrapers and automated clients copy a real browser's user-agent string to blend in. That is why the user agent alone cannot be trusted, and why verification (for crawlers that support it) and behavioural context matter.

Is bot traffic bad?

No. Much of it is essential. Search crawlers and AI crawlers need access to index and answer. The goal is not to block bots; it is to see them clearly, separate them from human analytics, and decide crawler policy deliberately.
