Crawl diagnostics

Diagnosing an unknown bot

An unknown bot is an automated client whose user-agent does not match any crawler you have documented. The right response is to verify what you can and resist guessing: attributing an unfamiliar user-agent to a named operator without evidence is how bad data spreads. An honest "other" bucket is more useful than a confident wrong label.

Verified against primary sources

What an unknown bot is

An unknown bot is a client whose requests look automated but whose user-agent does not match any crawler you have documented. Some self-identify with a token and URL you simply have not catalogued yet; others are deliberately vague, generic, or spoofed.
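
As a rough illustration of what "self-identifying" means in practice, the sketch below pulls a bot-like token and a contact URL out of a user-agent string. The function name `extract_identity`, the `Identity` type, and both regular expressions are assumptions made for this example; the token pattern is a heuristic and will miss crawlers whose names do not contain "bot", "crawler", or "spider".

```python
import re
from typing import NamedTuple, Optional

# Heuristic: a contact URL, often written as "+https://..." inside the comment part.
URL_RE = re.compile(r"\+?(https?://[^\s;)]+)")
# Heuristic: a product token containing "bot", "crawler", or "spider".
TOKEN_RE = re.compile(r"[\w-]*(?:bot|crawler|spider)[\w-]*", re.IGNORECASE)


class Identity(NamedTuple):
    token: Optional[str]  # e.g. "ExampleBot", or None if nothing bot-like was found
    url: Optional[str]    # e.g. "https://example.com/bot", or None


def extract_identity(user_agent: str) -> Identity:
    """Return whatever the user-agent says about itself, without judging it."""
    token_match = TOKEN_RE.search(user_agent)
    url_match = URL_RE.search(user_agent)
    return Identity(
        token=token_match.group(0) if token_match else None,
        url=url_match.group(1) if url_match else None,
    )


ua = "Mozilla/5.0 (compatible; ExampleBot/1.0; +https://example.com/bot)"
print(extract_identity(ua))
print(extract_identity("python-requests/2.31.0"))
```

The result is only a lead to look up. A token or URL extracted this way is self-declared, so it says nothing yet about who actually sent the request.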

The absence of a match is information in itself — it means you do not yet know who this is.

Verify, do not guess

The temptation is to assign an unfamiliar user-agent to a plausible operator. That is how inaccurate data spreads: a guess becomes a label, and the label gets trusted. Instead, verify what you can — does the user-agent contain a self-identifying URL or token you can look up? For clients claiming a known crawler, does the source pass that operator's published verification?
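
One published verification method is the reverse-then-forward DNS check that Google documents for Googlebot: resolve the connecting address to a hostname, confirm the hostname sits under an operator-owned domain, then resolve that hostname back and confirm it returns the same address. The sketch below is a minimal version of that idea, not any operator's reference implementation. The function name, the injectable `reverse` and `forward` resolvers, and the suffix tuple are assumptions for illustration; check the operator's current documentation for the domains it actually lists. The address is passed to the resolver only for the lookup, and nothing is stored.

```python
import socket
from typing import Callable, Iterable, List

# Assumption: hostname suffixes taken from Google's crawler verification page;
# confirm against the current documentation before relying on them.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def _reverse(ip: str) -> str:
    # gethostbyaddr returns (hostname, aliases, addresses)
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    # gethostbyname_ex returns (hostname, aliases, IPv4 addresses); IPv4 only
    return socket.gethostbyname_ex(host)[2]


def passes_reverse_forward_dns(
    ip: str,
    allowed_suffixes: Iterable[str],
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """True only if reverse DNS lands in an allowed domain and forward DNS agrees."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record or lookup failure: treat as unverified
        return False
    if not host.endswith(tuple(allowed_suffixes)):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

A failed check does not identify the sender; it only means the claim to be that crawler is unproven. The resolvers are parameters so the logic can be exercised with stand-ins instead of live DNS.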

If verification does not resolve it, the correct outcome is to leave it unclassified. An honest "other" bucket preserves the integrity of every category you are confident about.

An unmatched user-agent is not evidence of a specific operator
Verify via self-identifying URLs and published methods
Keep unresolved traffic in an explicit "other" bucket

Operator checklist

Capture the user-agent token and any self-identifying URL.
Look the token up against your documented crawlers.
For a user-agent claiming a known crawler, verify the source.
If it does not resolve, keep it unclassified rather than guessing.
Revisit the bucket periodically as you catalogue more crawlers.
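
The checklist can be sketched as a small classification step. Everything named here is hypothetical: the record layout (a dict with `token` and `dns_ok` keys), the catalogue shape (lowercased token mapped to an operator label and a verifier function), and the function names `classify` and `tally`. A label is assigned only when the token is catalogued and its verifier passes; every other outcome lands in the explicit "other" bucket, and re-running the tally after the catalogue grows is the periodic revisit.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple

OTHER = "other"  # explicit bucket for anything that did not resolve

Verifier = Callable[[dict], bool]
Catalogue = Dict[str, Tuple[str, Verifier]]  # lowercased token -> (operator, verifier)


def classify(record: dict, catalogue: Catalogue) -> str:
    """Return an operator label only when the token is catalogued AND verified."""
    token = record.get("token")
    entry = catalogue.get(token.lower()) if token else None
    if entry is None:
        return OTHER  # unmatched user-agent: not evidence of any operator
    operator, verify = entry
    return operator if verify(record) else OTHER  # unverified claims are not trusted


def tally(records: Iterable[dict], catalogue: Catalogue) -> Counter:
    """Count records per label; re-run after the catalogue changes."""
    return Counter(classify(r, catalogue) for r in records)


records = [
    {"token": "ExampleBot", "dns_ok": True},   # catalogued and verified
    {"token": "ExampleBot", "dns_ok": False},  # claims a known crawler, fails the check
    {"token": "MysteryBot", "dns_ok": True},   # not catalogued yet
    {"token": None},                           # no self-identification at all
]
catalogue: Catalogue = {
    "examplebot": ("Example Operator", lambda r: r.get("dns_ok", False)),
}
print(tally(records, catalogue))
```

With this sample data the first tally puts one record under "Example Operator" and three under "other". Adding a verified entry for `mysterybot` to the catalogue and re-running moves that record out of the bucket without anything having been guessed in the meantime.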

How it appears in analytics and logs

An unknown bot is automated traffic with a user-agent that does not map to a documented crawler. It is not evidence of any specific operator; treat it as unclassified until verification proves otherwise.

Diagnostic use case

Handle an uncategorized user-agent responsibly — verify identity where possible and keep it in an honest unclassified bucket rather than guessing an operator.

What WebmasterID can help detect

WebmasterID classifies what it can verify and keeps the rest in an explicit unclassified bucket, so unknown bots are visible and honestly labelled rather than guessed into a named category.

Common mistakes

Guessing a named operator for an unfamiliar user-agent without evidence.
Collapsing all unknown bots into a known category, polluting its data.
Trusting a self-declared crawler token without verification.

Privacy and accuracy notes

Diagnosis uses only the user-agent string and published verification methods, never visitor identity, fingerprinting, or raw IP addresses. WebmasterID keeps unclassified bots in a transparent "other" bucket and does not invent attribution.

Sources and verification notes

Google Search Central: Verifying Googlebot and other crawlers. Documents verifying a crawler rather than trusting the user agent.
MDN: User-Agent header

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.