Applebot-Extended — Apple AI training control
Applebot-Extended is a robots.txt token Apple provides so site owners can opt out of having their content used to train Apple's generative AI models. It is a control, not a separate crawler: Applebot remains the user agent that powers Apple search features and Siri, and it keeps crawling regardless of the Applebot-Extended setting.
What this means
In practice, Applebot-Extended is an opt-out switch layered on top of the existing Applebot crawler. It exists only as a user-agent token in robots.txt, and the rule you set for it tells Apple whether your content may be used to train its generative AI models.
Applebot itself is the user agent Apple uses to power features such as Search and Siri. Setting an Applebot-Extended rule does not stop Applebot from crawling — it only governs the AI-training use of content Applebot has fetched.
How to use Applebot-Extended
Applebot-Extended is not a fetcher, so it does not appear as a user agent in server logs. You use it only in robots.txt. To opt out of generative AI training site-wide:
User-agent: Applebot-Extended
Disallow: /
Apple honours this rule for generative AI training only. It does not change Applebot's crawling for search and Siri, which you control via the separate Applebot token.
- robots.txt token: Applebot-Extended
- Not a separate crawler; Applebot is the actual user agent
- Applebot keeps crawling for Search and Siri regardless
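One way to sanity-check the rule before deploying it is to run the robots.txt text through a parser and ask about each token in turn. The sketch below uses Python's standard urllib.robotparser as a stand-in; its matching rules are its own and are not Apple's implementation, and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The site-wide opt-out rule from this section.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /
"""

def allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return True if `token` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)

PAGE = "https://example.com/articles/1"  # placeholder URL for illustration

training_allowed = allowed(ROBOTS_TXT, "Applebot-Extended", PAGE)
crawl_allowed = allowed(ROBOTS_TXT, "Applebot", PAGE)

print("Applebot-Extended allowed:", training_allowed)
print("Applebot allowed:", crawl_allowed)
```

With only the Applebot-Extended group present, the training token is disallowed while Applebot has no matching group and stays allowed, which mirrors the behaviour described above.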
How it appears in analytics and logs
Applebot-Extended does not appear as a user agent in your logs — it is a robots.txt control token. Applebot is the fetcher you will actually see; Applebot-Extended only signals a training-use policy choice.
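To illustrate, the following sketch counts user-agent matches in a few access-log lines. The log lines, IP addresses, and user-agent strings are made up for illustration (the Applebot string is only an example in the style of Apple's documented format), and the combined log format is an assumption about your server.

```python
import re

# Illustrative user-agent strings; treat both as assumptions, not captured traffic.
APPLEBOT_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 "
    "(Applebot/0.1; +http://www.apple.com/go/applebot)"
)
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/128.0"

# Hypothetical log lines in combined log format.
SAMPLE_LOG = [
    f'192.0.2.10 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "{APPLEBOT_UA}"',
    f'192.0.2.11 - - [01/Jan/2025:00:00:02 +0000] "GET /about HTTP/1.1" 200 256 "-" "{APPLEBOT_UA}"',
    f'198.51.100.7 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 512 "-" "{BROWSER_UA}"',
]

# In combined log format the user agent is the last quoted field on the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')

def user_agent(line: str) -> str:
    """Extract the user-agent field, or an empty string if the line has none."""
    match = LAST_QUOTED_FIELD.search(line)
    return match.group(1) if match else ""

def count_token(lines: list[str], token: str) -> int:
    """Count lines whose user agent contains `token` (case-insensitive)."""
    return sum(1 for line in lines if token.lower() in user_agent(line).lower())

applebot_hits = count_token(SAMPLE_LOG, "Applebot")
extended_hits = count_token(SAMPLE_LOG, "Applebot-Extended")

print("Applebot requests:", applebot_hits)
print("Applebot-Extended requests:", extended_hits)
```

On real logs you would expect the second count to stay at zero, since the token is never sent as a user agent.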
Diagnostic use case
Opt your content out of Apple's generative AI training via robots.txt while still allowing Applebot to crawl for Apple search and Siri.
What WebmasterID can help detect
Because Applebot-Extended is a control token rather than a fetcher, it does not generate bot events. WebmasterID can help you observe Applebot's actual crawl activity, which the Applebot-Extended setting does not change.
Common mistakes
- Expecting Applebot-Extended to show up in logs — it is a control token, not a fetcher.
- Assuming an Applebot-Extended Disallow stops Applebot from crawling — it governs AI training use only.
- Confusing the Applebot and Applebot-Extended tokens — they are controlled separately.
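The separation between the two tokens can be made visible by listing the rules each one has in a robots.txt file. The sketch below is a minimal hand-written grouping routine, not a full robots.txt parser; the function name rules_by_token and the sample file are hypothetical, and exact (case-insensitive) token matching is a simplifying choice.

```python
# Hypothetical robots.txt with one group per token.
ROBOTS = """\
User-agent: Applebot
Disallow: /private/

User-agent: Applebot-Extended
Disallow: /
"""

def rules_by_token(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Group Allow/Disallow rules under each exact user-agent token (lowercased)."""
    rules: dict[str, list[tuple[str, str]]] = {}
    current: list[str] = []  # tokens of the group currently being read
    in_rules = False         # True once the current group has at least one rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            rules.setdefault(value.lower(), [])
        elif field in ("allow", "disallow") and current:
            in_rules = True
            for token in current:
                rules[token].append((field, value))
    return rules

rules = rules_by_token(ROBOTS)
print("Applebot:", rules.get("applebot"))
print("Applebot-Extended:", rules.get("applebot-extended"))
```

Here the Applebot group limits crawling of one path, while the Applebot-Extended group opts the whole site out of training use; changing one group leaves the other untouched.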
Privacy and accuracy notes
Applebot-Extended is a robots.txt directive, not an HTTP request, so it involves no visitor data. It governs how Apple may use already-crawled content for AI training, which is a policy matter, not an identity one.
Related pages
- Google-Extended — Google AI training control
Google-Extended is not a crawler or a user-agent string. It is a robots.txt token that lets site owners control whether their content is used to improve Google's generative AI models such as Gemini and Vertex AI. Googlebot continues to crawl for Search normally regardless of the Google-Extended setting.
- ClaudeBot — Anthropic's web crawler
ClaudeBot is the web crawler operated by Anthropic to fetch publicly available content. It is a declared crawler with a documented robots.txt token, and Anthropic publishes guidance for operators who want to identify or restrict it. It is separate from Claude-User, the agent that fetches pages when a person asks Claude to browse.
- Web crawlers reference
Reference for crawlers, control tokens, and how they appear in traffic.
Sources and verification notes
- Apple — Applebot and Applebot-Extended documentation
Documents Applebot and the Applebot-Extended training control.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.