ClaudeBot — Anthropic's web crawler
ClaudeBot is the web crawler operated by Anthropic to fetch publicly available content. It is a declared crawler with a documented robots.txt token, and Anthropic publishes guidance for operators who want to identify or restrict it. It is separate from Claude-User, the agent that fetches pages when a person asks Claude to browse.
What this means
ClaudeBot is Anthropic's crawler for publicly accessible web content. Anthropic documents the crawler and the robots.txt token operators can use to control it. Allowing ClaudeBot lets Anthropic fetch your public pages; disallowing it asks Anthropic's crawler to stay out.
ClaudeBot is distinct from Claude-User, which represents a real-time fetch made when a person asks Claude to read a specific URL. Control them separately.
How ClaudeBot identifies itself
ClaudeBot uses the robots.txt user-agent token ClaudeBot. Its user-agent string contains the ClaudeBot token together with a self-identifying URL. Match on the stable token rather than a full version string.
As with every crawler, the user agent can be spoofed. Where Anthropic publishes verification guidance, use it for requests that need to be trusted.
- robots.txt token: ClaudeBot
- User agent contains the ClaudeBot token plus an Anthropic URL
- Separate token from Claude-User (real-time user-triggered fetch)
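A minimal sketch of token-based matching, assuming you have the raw User-Agent header as a string. The function name and the sample user-agent strings are illustrative only and are not Anthropic's published values; the point is to match on the stable token rather than on a full version string.

```python
import re

# Match the stable tokens case-insensitively; do not pin a version number.
CLAUDEBOT_TOKEN = re.compile(r"claudebot", re.IGNORECASE)
CLAUDE_USER_TOKEN = re.compile(r"claude-user", re.IGNORECASE)


def classify_anthropic_agent(user_agent: str) -> str:
    """Return which Anthropic token (if any) a User-Agent header claims.

    This only reads the header, which can be spoofed, so the result is a
    claim and not a verified identity.
    """
    if CLAUDEBOT_TOKEN.search(user_agent):
        return "ClaudeBot"
    if CLAUDE_USER_TOKEN.search(user_agent):
        return "Claude-User"
    return "other"


# Hypothetical header values, for illustration only.
crawler_ua = "Mozilla/5.0 (compatible; ClaudeBot/1.0; +https://example.invalid/bot)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/128.0"

print(classify_anthropic_agent(crawler_ua))
print(classify_anthropic_agent(browser_ua))
```

Because the two tokens are checked separately, a ClaudeBot match never implies a Claude-User match, which mirrors how the two are controlled in robots.txt.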
robots.txt considerations
ClaudeBot honours robots.txt. To disallow it site-wide:
User-agent: ClaudeBot
Disallow: /
This affects only ClaudeBot. If you also want to restrict the real-time Claude-User fetch, target that token as well. robots.txt is honoured by compliant crawlers and is not an access-control mechanism.
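One way to check a rule set before deploying it is Python's standard-library robots.txt parser. The sketch below assumes the site-wide ClaudeBot disallow rule described above and a placeholder URL on example.com; it shows that the rule blocks ClaudeBot while leaving Claude-User unaffected. The helper name is hypothetical, and the standard-library parser's matching may differ in detail from how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The site-wide ClaudeBot rule, one directive per line.
RULES = """\
User-agent: ClaudeBot
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


page = "https://example.com/docs/page"  # placeholder URL
print("ClaudeBot allowed:", is_allowed(RULES, "ClaudeBot", page))
print("Claude-User allowed:", is_allowed(RULES, "Claude-User", page))
```

To restrict Claude-User as well, add a second group for that token and rerun the check.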
How it appears in analytics and logs
A request carrying the ClaudeBot token is Anthropic's crawler fetching a URL — a bot event, not a human visit. As with any crawler, the user agent is a claim; treat sustained ClaudeBot activity as crawl coverage, not audience.
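If you work from raw access logs, crawl coverage can be tallied per path. The following is a rough sketch that assumes the common "combined" log format; the regular expression, function name, and sample lines (documentation IP ranges, made-up user agents) are illustrative and would need adapting to your server's actual log layout.

```python
import re
from collections import Counter

# Combined log format tail: "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def claudebot_hits_by_path(lines):
    """Count requests per path whose user agent claims the ClaudeBot token."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "claudebot" in match.group("ua").lower():
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines, for illustration only.
BOT_UA = "Mozilla/5.0 (compatible; ClaudeBot/1.0; +https://example.invalid/bot)"
SAMPLE = [
    f'203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "{BOT_UA}"',
    f'203.0.113.5 - - [01/Jan/2026:00:00:09 +0000] "GET /docs HTTP/1.1" 200 8042 "-" "{BOT_UA}"',
    f'203.0.113.5 - - [02/Jan/2026:00:00:03 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "{BOT_UA}"',
    '198.51.100.7 - - [02/Jan/2026:10:15:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Macintosh) Safari/605.1.15"',
]

for path, count in claudebot_hits_by_path(SAMPLE).most_common():
    print(path, count)
```

The counts describe requests that claim the ClaudeBot token, so they should be read as crawl coverage and kept out of human-visit totals.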
Diagnostic use case
Confirm whether ClaudeBot has crawled a page, and set robots.txt policy for Anthropic's crawler independently of other AI crawlers.
What WebmasterID can help detect
WebmasterID classifies ClaudeBot server-side as an AI crawler and shows its crawl activity on the bot-intelligence and AI-visibility surfaces, so you can see Anthropic crawl coverage page by page without log parsing.
Common mistakes
- Assuming one rule covers both ClaudeBot and Claude-User — they are separate tokens.
- Treating crawler hits as human sessions in analytics.
- Expecting robots.txt to enforce access rather than request compliance.
Privacy and accuracy notes
ClaudeBot detection uses only the request user-agent. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics, and never attaches it to a visitor profile.
Related pages
- GPTBot — OpenAI's web crawler
GPTBot is the crawler OpenAI uses to fetch publicly available web content that may be used to help train its foundation models. It is a declared, well-documented crawler with a stable robots.txt token, and OpenAI publishes both documentation and an IP range list so operators can identify and control it.
- PerplexityBot — Perplexity's web crawler
PerplexityBot is the crawler operated by Perplexity to index publicly available web pages for its AI answer engine. Perplexity documents the crawler and its robots.txt token. It is separate from Perplexity-User, which fetches a page in real time in response to a user's question.
- robots.txt basics: what it does and what it cannot do
robots.txt is a plain-text file at your site root that tells compliant crawlers which paths they may request. This page covers the directives, how user-agent groups are matched, and the limits that trip people up: robots.txt is advisory, it does not hide pages from search, and it is not a security boundary.
- Bot intelligence
Deterministic categorisation of crawlers, search bots, and automation.
Sources and verification notes
- Anthropic: crawler and robots.txt guidance. Documents ClaudeBot and how site owners can block it.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.