
Bot vs browser user agents: how to tell them apart

A user-agent string is a self-reported label, not an identity. This page explains how declared bots name themselves, why almost every UA still starts with the legacy Mozilla token, and how to read the difference between an automated client and a real browser without over-trusting the string.

Verified against primary sources

What a user-agent string is

User-Agent is an HTTP request header that the client sends to describe itself. It is defined by the HTTP specification (RFC 9110), but its contents are entirely under the client's control. Any client (a browser, a crawler, a script, a monitoring tool) can send any string it likes.

That is the key fact: the user agent is a claim, not a credential.
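
As a minimal sketch of that point, the snippet below uses Python's standard urllib.request to build (but not send) a request that declares an arbitrary user agent. The URL, the ExampleBot name, and the build_request helper are placeholders invented for illustration.

```python
import urllib.request

# Placeholder URL and UA string, chosen only for illustration.
URL = "https://example.com/"
CLAIMED_UA = "Mozilla/5.0 (compatible; ExampleBot/1.0; +https://example.com/bot)"


def build_request(url, user_agent):
    """Build an HTTP request that declares whatever User-Agent the caller chooses."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request(URL, CLAIMED_UA)
# urllib stores header names in capitalized form, so the key is "User-agent".
print(req.get_header("User-agent"))
```

Nothing checks the string against who the client really is, which is why a server has to treat it as a claim.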

How declared bots identify themselves

Well-behaved crawlers name themselves clearly, usually with a product token and a URL pointing at their documentation (for example, a token plus a +https://… link). This is how you recognise the major search and AI crawlers in logs. The token is the stable part of the string; version numbers change between releases.

Look for a clear product token (Googlebot, bingbot, GPTBot, …).
Treat a self-identifying URL in the string as the sign of a declared crawler.
Match on the token, not the full version string.
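
The three checks above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a complete detector: KNOWN_TOKENS is a tiny hand-picked list, and declared_bot is a hypothetical helper name. The sample string is the classic Googlebot user agent; other crawlers format theirs differently.

```python
import re

# Illustrative token list only; a real signature set is larger and maintained.
KNOWN_TOKENS = ("Googlebot", "bingbot", "GPTBot")

# Match a known token as a whole word, ignoring case and any version suffix.
_TOKEN_RE = re.compile(
    r"(?<![A-Za-z0-9])("
    + "|".join(re.escape(t) for t in KNOWN_TOKENS)
    + r")(?![A-Za-z0-9])",
    re.IGNORECASE,
)
# A declared crawler usually advertises a documentation URL prefixed with "+".
_DOC_URL_RE = re.compile(r"\+(https?://[^\s;)]+)")


def declared_bot(ua):
    """Return (token, doc_url_or_None) if the UA names a known crawler, else None."""
    match = _TOKEN_RE.search(ua)
    if match is None:
        return None
    found = match.group(1).lower()
    token = next(t for t in KNOWN_TOKENS if t.lower() == found)
    url = _DOC_URL_RE.search(ua)
    return token, (url.group(1) if url else None)


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(declared_bot(GOOGLEBOT_UA))
```

Because the match ignores the version that follows the token, a future version bump does not break it. A positive result still only means the client claims to be that crawler.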

Why browsers all start with 'Mozilla/5.0'

Almost every browser user agent begins with the legacy Mozilla/5.0 token for historical compatibility reasons. On its own the token tells you nothing useful, and it does not mean the client is Firefox. Read the rest of the string (engine, platform, browser tokens) and remember that a scraper can reproduce all of it.
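
To see why the first token carries no information, the sketch below splits a user agent into its name/version products and its parenthesised comments. The parse_ua helper is hypothetical and deliberately simple (it does not handle nested parentheses); the two sample strings follow the common Chrome and Firefox desktop formats, with version numbers chosen for illustration.

```python
import re

# Either a parenthesised comment, or a product written as name/version.
_PART_RE = re.compile(r"\(([^)]*)\)|([^\s/()]+)/([^\s()]+)")


def parse_ua(ua):
    """Split a UA string into a list of (name, version) products and a list of comments."""
    products, comments = [], []
    for comment, name, version in _PART_RE.findall(ua):
        if name:
            products.append((name, version))
        else:
            comments.append(comment)
    return products, comments


CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
FIREFOX_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) "
    "Gecko/20100101 Firefox/121.0"
)

for ua in (CHROME_UA, FIREFOX_UA):
    products, _comments = parse_ua(ua)
    # The first product is ('Mozilla', '5.0') for both; later products differ.
    print(products[0], [name for name, _ in products[1:]])
```

The distinguishing information sits in the later products and the comments, and all of it is still text that any client can copy.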

How it appears in analytics and logs

A user agent that names a crawler and links to its documentation is a declared bot. A string that mimics a full browser may come from a real browser or from a scraper copying one; the string alone cannot tell you which.

Diagnostic use case

When triaging traffic, decide whether a request comes from an automated client or a real browser, and know when the user agent is enough and when you need further verification.

What WebmasterID can help detect

WebmasterID parses the user agent server-side into a category (search bot, AI crawler, automation, browser) against a maintained signature list, and leaves unknown clients in an honest 'other' bucket rather than guessing.
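
The general shape of this kind of classification can be sketched as follows. This is not WebmasterID's implementation: the SIGNATURES table, the classify function, and the rule that the first matching category wins are all assumptions made for illustration.

```python
import re

# Hypothetical, tiny signature table; a real list is far larger and maintained.
# Order matters: the first category with a matching token wins.
SIGNATURES = [
    ("search bot", ("Googlebot", "bingbot")),
    ("AI crawler", ("GPTBot",)),
    ("automation", ("curl", "python-requests", "HeadlessChrome")),
    ("browser", ("Firefox", "Chrome", "Safari")),
]


def _has_token(ua, token):
    """True if token appears in ua as a whole word, ignoring case."""
    pattern = r"(?<![A-Za-z0-9])" + re.escape(token) + r"(?![A-Za-z0-9])"
    return re.search(pattern, ua, re.IGNORECASE) is not None


def classify(ua):
    """Map a UA string to a category; anything unrecognised stays 'other'."""
    for category, tokens in SIGNATURES:
        if any(_has_token(ua, t) for t in tokens):
            return category
    return "other"


print(classify("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(classify("curl/8.4.0"))
print(classify("SomeUnknownClient/1.0"))
```

In this sketch "browser" means only that the string looks like a browser's, and the explicit "other" fallback avoids forcing unknown clients into a category.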

Common mistakes

Treating the user-agent string as proof of identity.
Blocking on a substring match and accidentally catching legitimate clients.
Storing raw user-agent strings of real visitors when a category would do.
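
The substring mistake is easy to reproduce. In the sketch below, naive_is_bot and token_is_bot are hypothetical helpers, and PHONE_UA is a made-up string for an imaginary phone model whose name happens to contain the letters "bot"; it is not a captured log line.

```python
import re


def naive_is_bot(ua):
    """Fragile rule: flags any UA that contains the letters 'bot' anywhere."""
    return "bot" in ua.lower()


def token_is_bot(ua, tokens=("Googlebot", "bingbot", "GPTBot")):
    """Flags a UA only when one of the listed tokens appears as a whole word."""
    return any(
        re.search(
            r"(?<![A-Za-z0-9])" + re.escape(t) + r"(?![A-Za-z0-9])",
            ua,
            re.IGNORECASE,
        )
        for t in tokens
    )


# Hypothetical device UA: "Talbot T1" is an invented model name.
PHONE_UA = (
    "Mozilla/5.0 (Linux; Android 13; Talbot T1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"
)

print(naive_is_bot(PHONE_UA))   # the substring rule flags this visitor
print(token_is_bot(PHONE_UA))   # the whole-token rule does not
```

Matching whole tokens from an explicit list narrows the rule to clients that actually name themselves as crawlers.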

Privacy and accuracy notes

User-agent parsing for classification is privacy-safe when you store a category rather than the raw string of real visitors. WebmasterID classifies at ingest and does not expose raw user-agent strings from human traffic.


Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.