Google-Extended — Google AI training control

Google-Extended is not a crawler and has no HTTP user-agent string of its own. It is a robots.txt token that lets site owners control whether their content is used to improve Google's generative AI products, such as Gemini and the Vertex AI generative APIs. Googlebot continues to crawl for Search normally regardless of the Google-Extended setting.

Verified against primary sources

What this means

Google-Extended is a robots.txt user-agent token Google provides so site owners can control whether their content helps improve Google's generative AI products, including Gemini and Vertex AI generative APIs. It is a policy switch, not a crawler.

Crucially, setting a Google-Extended rule does not change Googlebot's crawling for Search. Your pages can still be crawled and ranked in Search while being excluded from generative AI training. Google documents the two controls as independent.

How to use Google-Extended

Because Google-Extended is not a fetcher, it never appears as a user agent in server logs. You use it only in robots.txt. To opt out of generative AI training site-wide:

User-agent: Google-Extended
Disallow: /

Google honours this rule when deciding whether to use your content for generative AI training. It does not affect Googlebot, Google-InspectionTool, or other Google tokens, which you control separately.
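
One way to sanity-check the rule before deploying it is to parse the robots.txt text locally and confirm that it blocks the Google-Extended token while leaving Googlebot unaffected. The sketch below uses Python's standard-library urllib.robotparser. The helper name is_allowed and the example.com URL are illustrative assumptions, and the standard-library parser only approximates how Google matches tokens, so treat the result as a rough check rather than a statement of Google's behaviour.

```python
from urllib.robotparser import RobotFileParser

# The site-wide opt-out rule, one directive per line.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return True if `token` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text locally and
    makes no network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


# Placeholder URL (assumption), used only to exercise the rules.
PAGE = "https://example.com/article"

extended_allowed = is_allowed(ROBOTS_TXT, "Google-Extended", PAGE)
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", PAGE)

print("Google-Extended allowed:", extended_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

Because the file contains no group for Googlebot and no wildcard group, the parser falls back to allowing Googlebot, which mirrors the independence of the two controls described above.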

How it appears in analytics and logs

You will not see Google-Extended as a user agent in your logs — it is a robots.txt control token, not a fetcher. Its presence in your robots.txt signals a policy choice about AI-training use, not a crawl event.
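
To illustrate the point, the following sketch counts how often a token appears in the user-agent field of access-log lines. The log lines are invented samples in the common combined log format (the IP addresses are reserved documentation addresses), and count_tokens is a hypothetical helper, not part of any WebmasterID or Google tooling. Note that matching on the user-agent string alone does not prove a request really came from Google.

```python
import re

# Invented sample lines in combined log format (assumption: your server
# logs the user agent as the last quoted field).
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.4 - - [01/Jan/2025:00:00:05 +0000] "GET /about HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '192.0.2.10 - - [01/Jan/2025:00:00:09 +0000] "GET /blog HTTP/1.1" 200 2048 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

# Capture the last double-quoted field on the line (the user agent).
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_tokens(lines, tokens):
    """Count log lines whose user-agent field contains each token."""
    counts = {token: 0 for token in tokens}
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if match is None:
            continue  # skip lines that do not end with a quoted field
        user_agent = match.group(1).lower()
        for token in tokens:
            if token.lower() in user_agent:
                counts[token] += 1
    return counts


counts = count_tokens(SAMPLE_LOG, ["Googlebot", "Google-Extended"])
print(counts)
```

On real logs the Google-Extended count is expected to stay at zero for the reason given above, while Googlebot hits continue to appear.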

Diagnostic use case

Opt your content in or out of Google's generative AI training via robots.txt without affecting how Googlebot crawls your site for Search.

What WebmasterID can help detect

WebmasterID focuses on observed crawl traffic. Because Google-Extended is a control token rather than a fetcher, it generates no bot events. WebmasterID can still help you see Googlebot's actual crawl activity, which Google-Extended does not change.

Common mistakes

Privacy and accuracy notes

Google-Extended is a robots.txt token, not an HTTP request, so it involves no visitor data at all. It governs how Google may use already-crawled content, and concerns policy rather than identity.

Frequently asked questions

Does blocking Google-Extended hurt my Google Search ranking?
No. Google documents Google-Extended as controlling use of content for generative AI training only. Googlebot continues to crawl and index your site for Search independently of the Google-Extended setting.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.