PerplexityBot — Perplexity's web crawler
PerplexityBot is the crawler operated by Perplexity to index publicly available web pages for its AI answer engine. Perplexity documents the crawler and its robots.txt token. It is separate from Perplexity-User, which fetches a page in real time in response to a user's question.
What this means
In practice, allowing PerplexityBot lets Perplexity index your public pages so they can be used to answer questions in its product. Disallowing it in robots.txt asks the crawler not to fetch them.
It is distinct from Perplexity-User, the real-time fetch made when a user asks Perplexity about a specific page. Control the two tokens independently.
How PerplexityBot identifies itself
PerplexityBot uses the robots.txt user-agent token PerplexityBot, and its user-agent string contains that token plus a self-identifying URL. Match on the token rather than the full string, since the token is the stable part. The user agent is only a claim and can be spoofed; where authenticity matters, follow Perplexity's published verification guidance.
- robots.txt token: PerplexityBot
- Separate token from Perplexity-User (real-time fetch)
- User agent contains the token plus a Perplexity URL
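Token matching can be sketched in Python as follows. This is a minimal sketch, not Perplexity's or WebmasterID's implementation: the helper name classify_perplexity_agent and the sample user-agent strings are illustrative assumptions, and a match only shows what the request claims to be.

```python
from typing import Optional

# Stable tokens named on this page; matching is a case-insensitive substring test.
INDEXING_TOKEN = "PerplexityBot"
REALTIME_TOKEN = "Perplexity-User"


def classify_perplexity_agent(user_agent: str) -> Optional[str]:
    """Return the Perplexity token a user-agent string claims, or None.

    Hypothetical helper: it inspects the claim only and does not verify
    that the request really came from Perplexity.
    """
    ua = user_agent.lower()
    if INDEXING_TOKEN.lower() in ua:
        return INDEXING_TOKEN
    if REALTIME_TOKEN.lower() in ua:
        return REALTIME_TOKEN
    return None


# Illustrative strings, not copied from Perplexity's documentation.
sample_bot = "Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://example.com/bot)"
sample_user = "Mozilla/5.0 (compatible; Perplexity-User/1.0; +https://example.com/user)"
sample_browser = "Mozilla/5.0 (X11; Linux x86_64)"

print(classify_perplexity_agent(sample_bot))
print(classify_perplexity_agent(sample_user))
print(classify_perplexity_agent(sample_browser))
```

Keeping the two tokens as separate return values makes it straightforward to apply different policies to indexing crawls and real-time fetches.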
robots.txt considerations
PerplexityBot honours robots.txt. To disallow it site-wide:
User-agent: PerplexityBot
Disallow: /
This targets only the indexing crawler. robots.txt is a request honoured by compliant crawlers, not an enforcement boundary.
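One way to sanity-check a rule like this before deploying it is to feed it to Python's standard-library robots.txt parser. The sketch below assumes a site-wide disallow for the PerplexityBot token and a hypothetical page URL; it shows how Python's parser reads the rule, which is not a guarantee of how any particular crawler interprets it.

```python
from urllib.robotparser import RobotFileParser

# The site-wide disallow for the indexing crawler, one directive per line.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "https://example.com/pricing"  # hypothetical URL for illustration

# The group names only PerplexityBot, so other tokens are unaffected.
bot_allowed = parser.can_fetch("PerplexityBot", page)
user_allowed = parser.can_fetch("Perplexity-User", page)

print("PerplexityBot allowed:", bot_allowed)
print("Perplexity-User allowed:", user_allowed)
```

Under this parser the rule blocks PerplexityBot and leaves Perplexity-User allowed, so a separate group is needed if the real-time fetch should be restricted as well.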
How it appears in analytics and logs
A request bearing the PerplexityBot token is Perplexity's indexing crawler fetching a URL — a bot event. Distinguish it from Perplexity-User real-time fetches, which are triggered by a person's query.
Diagnostic use case
Confirm PerplexityBot crawl coverage of a page and set robots.txt policy for Perplexity's indexing crawler.
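Crawl coverage per page can be approximated from server access logs. The sketch below assumes logs in the common "combined" format and counts requests whose user agent claims the PerplexityBot token; the function name perplexitybot_hits and the sample log lines are made up for illustration, and the counts reflect claimed rather than verified crawler traffic.

```python
import re
from collections import Counter

# Combined Log Format: the request line and the user agent are quoted fields.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def perplexitybot_hits(lines):
    """Count requests per path whose user agent claims the PerplexityBot token."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and "perplexitybot" in match.group("ua").lower():
            hits[match.group("path")] += 1
    return hits


# Made-up log lines for illustration only.
sample_log = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '203.0.113.6 - - [01/Jan/2026:00:00:02 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Perplexity-User/1.0)"',
    '203.0.113.5 - - [01/Jan/2026:00:00:03 +0000] "GET /docs HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2026:00:00:04 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

coverage = perplexitybot_hits(sample_log)
for path, count in sorted(coverage.items()):
    print(path, count)
```

A path that never appears in the result has no logged PerplexityBot request in the sampled period, which is a prompt to check the robots.txt policy and the log retention window before drawing conclusions.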
What WebmasterID can help detect
WebmasterID classifies PerplexityBot server-side as an AI crawler and reports its crawl activity on the bot-intelligence and AI-visibility surfaces, so Perplexity coverage is observable per page.
Common mistakes
- Conflating PerplexityBot (indexing) with Perplexity-User (real-time fetch).
- Reading crawl spikes as human-traffic growth.
Privacy and accuracy notes
Detection uses only the request user-agent. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics.
Related pages
- GPTBot — OpenAI's web crawler
GPTBot is the crawler OpenAI uses to fetch publicly available web content that may be used to help train its foundation models. It is a declared, well-documented crawler with a stable robots.txt token, and OpenAI publishes both documentation and an IP range list so operators can identify and control it.
- ClaudeBot — Anthropic's web crawler
ClaudeBot is the web crawler operated by Anthropic to fetch publicly available content. It is a declared crawler with a documented robots.txt token, and Anthropic publishes guidance for operators who want to identify or restrict it. It is separate from Claude-User, the agent that fetches pages when a person asks Claude to browse.
- robots.txt basics: what it does and what it cannot do
robots.txt is a plain-text file at your site root that tells compliant crawlers which paths they may request. This page covers the directives, how user-agent groups are matched, and the limits that trip people up: robots.txt is advisory, it does not hide pages from search, and it is not a security boundary.
- AI referrals
Track human visits that arrive from AI assistants and answer engines.
Sources and verification notes
- Perplexity — crawler documentation
Documents PerplexityBot and Perplexity-User tokens.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.