User agents and bot scores
Bot-detection and WAF products often compute a bot score that estimates how likely a request is automated. The user agent is one input, but a weak one on its own: it is trivially spoofable and frequently blank or generic on legitimate clients. Over-weighting the UA leads to both missed bots and blocked humans.
What this means
A bot score is a probabilistic estimate that a request came from automation rather than a human, used by WAFs and anti-bot products to decide whether to allow, challenge, or block. These systems blend many inputs, and the user agent is one of them.
On its own the user agent is a weak signal. It is self-reported and trivially editable, so a scraper can present a perfect browser string while a legitimate API client honestly names its library. Treating the UA as decisive distorts the score.
Using the UA proportionately
The user agent is most useful in combination. A library token, a datacenter origin, and non-browser request patterns together make a confident automation signal, whereas none of them does so alone. Conversely, a perfect browser UA paired with impossible navigation timing suggests spoofing.
Because exact scoring formulas are proprietary and vary by vendor, we do not assert specific weightings here. The durable principle is to corroborate the user agent with network and behavioural signals rather than letting the string dominate, and to keep an explainable basis for any block.
- UA is one input to a bot score, and a spoofable one
- Corroborate with network origin and request behaviour
- Over-weighting the UA causes both false allows and false blocks
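The corroboration principle above can be sketched in code. This is a minimal illustration, not any vendor's formula: the token lists, the RequestSignals fields, and the classify function are all hypothetical names, and the three boolean signals are assumed to be computed elsewhere (for example by an IP lookup and a session analysis). It uses no numeric weights, in keeping with the point that specific weightings are not asserted here.

```python
from dataclasses import dataclass

# Hypothetical token lists for illustration only; not taken from any vendor.
LIBRARY_TOKENS = ("python-requests", "curl", "go-http-client")
BROWSER_TOKENS = ("mozilla/5.0",)


@dataclass
class RequestSignals:
    user_agent: str
    from_datacenter: bool      # assumed result of an IP/ASN lookup done elsewhere
    non_browser_pattern: bool  # assumed upstream check, e.g. no asset fetches
    impossible_timing: bool    # assumed upstream check on navigation speed


def classify(sig: RequestSignals) -> tuple[str, list[str]]:
    """Return (label, reasons). The user agent never decides the label alone."""
    ua = sig.user_agent.lower()
    looks_like_browser = any(tok in ua for tok in BROWSER_TOKENS)

    # A browser-looking UA contradicted by behaviour suggests spoofing.
    if looks_like_browser and sig.impossible_timing:
        return "suspected-spoofing", [
            "browser user agent",
            "impossible navigation timing",
        ]

    # Collect every automation indicator so the basis stays explainable.
    reasons = []
    if any(tok in ua for tok in LIBRARY_TOKENS):
        reasons.append("library token in user agent")
    if sig.from_datacenter:
        reasons.append("datacenter origin")
    if sig.non_browser_pattern:
        reasons.append("non-browser request pattern")

    # Only the full combination is treated as confident automation.
    if len(reasons) == 3:
        return "automation", reasons
    return "inconclusive", reasons
```

Returning the list of reasons alongside the label keeps an explainable basis for any later block decision. A library token with no corroboration comes back as "inconclusive" rather than "automation".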
How it appears in analytics and logs
A bot score derived heavily from the user agent will rate a spoofed browser string as human and an honest library token as bot. The UA contributes signal but cannot carry a scoring decision alone.
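The two failure modes can be reproduced with a deliberately naive rule. The sketch below is an anti-pattern shown for illustration: ua_only_verdict, audit, and the sample records are hypothetical, and the "actual" field is an assumed ground truth invented for the example rather than real log data.

```python
def ua_only_verdict(user_agent: str) -> str:
    """Naive rule that lets the user agent decide alone (the anti-pattern)."""
    if "mozilla/5.0" in user_agent.lower():
        return "human"
    return "bot"


# Invented sample records; "actual" is an assumed ground truth for illustration.
SAMPLE_LOG = [
    {"ua": "Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0", "actual": "scraper"},
    {"ua": "python-requests/2.31.0", "actual": "legitimate API client"},
]


def audit(records):
    """Pair each UA-only verdict with the assumed ground truth."""
    return [(r["ua"], ua_only_verdict(r["ua"]), r["actual"]) for r in records]


for ua, verdict, actual in audit(SAMPLE_LOG):
    print(f"{verdict:<6} actual={actual:<22} ua={ua}")
```

The scraper with a browser string is rated "human" and the honest library client is rated "bot", which is the inversion described above. A blank user agent would also be rated "bot" by this rule, even though legitimate clients sometimes send none.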
Diagnostic use case
Understand the user agent's limited role in bot scoring, avoid over-weighting it, and combine it with behavioural and network signals for better accuracy.
What WebmasterID can help detect
WebmasterID classifies traffic deterministically from the user agent and corroborating signals rather than emitting an opaque probability, so you can see exactly why a request was labelled a crawler, automation, or human.
Common mistakes
- Letting the user agent dominate a bot score.
- Trusting a perfect browser UA as proof of a human.
- Penalising honest library tokens while missing spoofed browsers.
Privacy and accuracy notes
Bot scoring should rest on request and behavioural signals, not on identifying the person behind a request. The user agent is coarse client metadata, and WebmasterID keeps bot classification deterministic and non-identifying.
Related pages
- Pitfalls of UA-based bot blocking
Blocking traffic by matching user-agent substrings is a tempting but flawed bot defence. Hostile clients simply spoof a browser user agent to slip past, while legitimate browsers, accessibility tools, and beneficial bots get caught by over-broad rules. UA blocklists are a weak, high-collateral control compared with behaviour and verification.
- Detecting automation from user agents
You can use the user agent as a first signal for spotting automation — tool tokens, headless markers, missing strings — but it is never conclusive, because any client can change it. Reliable detection pairs the UA with verification and behaviour, and records honest unknowns. This page explains a sound approach.
- Spoofed and fake user agents: what to watch for
Spoofing a user agent is trivial — any client can claim to be Googlebot or a normal browser. This page explains why spoofing happens, the common fake-crawler patterns, and the verification methods that turn a claimed identity into a confirmed one.
- Bot vs human
Deterministic bot-vs-human classification with an explainable basis.
Sources and verification notes
- MDN: Browser detection using the user agent. Documents UA spoofability and unreliability; basis for not over-weighting it. Vendor scoring formulas not asserted.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.