robots.txt and AMP pages
AMP pages depend on the AMP runtime, cached resources, and a crawlable canonical relationship. Disallowing AMP paths or required resources in robots.txt can break validation, caching, or discovery. This page explains which AMP-related resources must stay crawlable and how robots.txt interacts with AMP.
What AMP needs crawlable
An AMP page is HTML built on the AMP runtime and component scripts, which load from the AMP CDN, plus the assets your own site serves for it, such as images and fonts. (AMP requires custom CSS to be inlined in the page, so styles normally travel with the HTML itself.) For an AMP page to validate and be eligible for caching, Google must be able to fetch the AMP page itself and the resources it relies on.
The canonical/AMP link relationship also needs to be crawlable: Google discovers an AMP page from the rel="amphtml" link on the canonical page, and the AMP page points back via rel="canonical". Blocking either page breaks that pairing.
- Keep the AMP page and its canonical both crawlable
- Do not block render-critical CSS/images the AMP page uses
- AMP runtime scripts load from the AMP CDN (cdn.ampproject.org), so your site's robots.txt does not apply to them
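The checklist above can be spot-checked offline. The following is a minimal Python sketch, assuming a hypothetical robots.txt and example.com URLs (none of them come from a real site). It uses the standard library's urllib.robotparser, which applies simple prefix rules in file order and does not implement Google's wildcard or longest-match behavior, so treat its verdicts as an approximation of what Googlebot would do.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and URLs, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /amp/
Disallow: /static/img/
"""

URLS = [
    "https://example.com/articles/widgets",      # canonical page
    "https://example.com/amp/articles/widgets",  # AMP page
    "https://example.com/static/img/hero.jpg",   # image the AMP page uses
]

if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, URLS):
        print("blocked:", url)
```

Any URL the function returns is one a compliant crawler would not fetch under that robots.txt. An empty result for the AMP page, its canonical, and its assets is the state you want.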
Common robots.txt mistakes with AMP
A typical error is disallowing a path like /amp/ to keep AMP URLs out of search. A Disallow rule does not do that; it stops Google from fetching the pages, so they can no longer be validated or served as AMP. If you want to retire AMP, remove the rel="amphtml" links and use redirects or noindex on the AMP URLs. A robots.txt Disallow hides the very signals Google needs, because Google cannot see a redirect or a noindex on a URL it is not allowed to fetch.
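When retiring AMP, one quick check is whether a canonical page still advertises an AMP version. Below is a minimal Python sketch, assuming you already have the page HTML as a string. The helper name link_rels and the example.com URLs are hypothetical, and the sketch compares href values as written without resolving relative URLs.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collect href values of <link> tags, keyed by each rel token."""

    def __init__(self):
        super().__init__()
        self.rels = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        rel = attr_map.get("rel") or ""
        if href:
            for token in rel.lower().split():
                # Keep the first href seen for each rel value.
                self.rels.setdefault(token, href)


def link_rels(html):
    """Return a dict such as {"canonical": "...", "amphtml": "..."}."""
    parser = LinkRelParser()
    parser.feed(html)
    return parser.rels


# Hypothetical canonical page that still points at an AMP version.
LIVE_PAGE = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/articles/widgets">'
    '<link rel="amphtml" href="https://example.com/amp/articles/widgets">'
    '</head><body></body></html>'
)

if __name__ == "__main__":
    amp_href = link_rels(LIVE_PAGE).get("amphtml")
    print("still advertises AMP:" if amp_href else "no amphtml link", amp_href or "")
```

The same helper can be pointed at an AMP page to read its rel="canonical" target, which lets you confirm the two pages reference each other while AMP is still live.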
Validate with AMP testing tools and Search Console's AMP and URL Inspection reports; a blocked-resource warning indicates a robots.txt rule is interfering. Keep AMP pages crawlable as long as they are part of your live AMP setup.
How it appears in analytics and logs
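If Google cannot fetch an AMP page or its resources, the AMP version may fail validation or not be served. In server logs and crawl analytics this typically shows up as missing Googlebot requests for the affected AMP paths, because a compliant crawler does not request URLs that robots.txt disallows. A blocked-resource warning in AMP testing tools points to the same cause.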
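One way to look for this in raw server logs is to count Googlebot requests per AMP path. The following is a rough Python sketch, assuming the common "combined" access log format and an /amp/ URL prefix. Both are assumptions to adjust for your setup, and the sample log lines are invented. It matches on the user-agent string only, which can be spoofed, so it is a heuristic and not proof that a request came from Google.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_amp_hits(lines, amp_prefix="/amp/"):
    """Count requests per AMP path whose user-agent string mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if "Googlebot" in match["agent"] and match["path"].startswith(amp_prefix):
            hits[match["path"]] += 1
    return hits


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Jun/2026:12:00:00 +0000] "GET /amp/articles/widgets HTTP/1.1" '
    '200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Jun/2026:12:00:05 +0000] "GET /amp/articles/widgets HTTP/1.1" '
    '200 5123 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for path, count in googlebot_amp_hits(SAMPLE_LOG).items():
        print(path, count)
```

If live AMP paths never appear in the result over a reasonable period, a robots.txt rule is one of the first things worth checking.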
Diagnostic use case
Avoid robots.txt rules that block AMP pages or their required resources, which can stop AMP from validating, being cached, or being served.
What WebmasterID can help detect
WebmasterID records which paths crawlers fetch, so you can confirm Googlebot is reaching your AMP pages and their resources rather than being blocked.
Common mistakes
- Disallowing /amp/ and breaking AMP validation and serving.
- Blocking the canonical page, severing the rel="amphtml" / rel="canonical" pairing.
- Using robots.txt to retire AMP instead of removing amphtml links and redirecting.
Privacy and accuracy notes
AMP crawl topics concern page and resource files, not visitors. No personal data is involved in deciding which AMP paths are crawlable.
Related pages
- robots.txt and JavaScript/CSS files
Google renders pages with a headless browser before indexing, so it must fetch the JavaScript and CSS your page depends on. Disallowing those resources in robots.txt can prevent proper rendering and harm how the page is understood. This page explains why render-critical resources should stay crawlable.
- Canonical vs noindex: which to use
rel=canonical and noindex are often confused. Canonical tells search engines which of several similar URLs to treat as the primary, consolidating signals onto it. noindex removes a page from the index entirely. This page explains when each is right and why combining them on one URL sends conflicting signals.
- robots.txt and page rendering
Google indexes the rendered version of a page, fetched in a second pass by its Web Rendering Service. robots.txt rules that block render-critical resources cause the renderer to skip them, producing an incomplete rendered DOM. This page explains the rendering pipeline and how robots.txt interacts with it.
- Website observability
See which AMP paths crawlers reach on your site.
Sources and verification notes
- Google: AMP on Google Search guidelines. AMP discovery via amphtml/canonical links and crawlability requirements.
- Google: robots.txt introduction (do not block resources). Do not block resources needed to render/validate pages.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.