robots.txt and AMP pages

AMP pages depend on the AMP runtime, cached resources, and a crawlable canonical relationship. Disallowing AMP paths or required resources in robots.txt can break validation, caching, or discovery. This page explains which AMP-related resources must stay crawlable and how robots.txt interacts with AMP.

What AMP needs crawlable

AMP is HTML built on the AMP runtime and component scripts loaded from the AMP CDN, plus your own styles (which AMP requires to be inlined in the page) and any referenced images or other media. For an AMP page to validate and be eligible for caching, Google must be able to fetch the AMP page itself and the resources it relies on.

The canonical/AMP link relationship also needs to be crawlable: Google discovers an AMP page from the rel="amphtml" link on the canonical page, and the AMP page points back via rel="canonical". Blocking either page breaks that pairing.
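The pairing described above can be checked mechanically. The following is a minimal sketch using only the Python standard library; the URLs and HTML snippets are hypothetical placeholders, and the helper names (`link_rels`, `is_paired`) are illustrative rather than part of any AMP tooling.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collects href values of <link> tags, keyed by each rel token."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel, href = attr.get("rel"), attr.get("href")
        if rel and href:
            # rel may hold several space-separated tokens; keep the first href seen
            for token in rel.lower().split():
                self.links.setdefault(token, href)


def link_rels(html):
    """Return a dict such as {"amphtml": "..."} for the given HTML."""
    parser = LinkRelParser()
    parser.feed(html)
    return parser.links


def is_paired(canonical_url, canonical_html, amp_url, amp_html):
    """True when the canonical page links to the AMP page and vice versa."""
    return (
        link_rels(canonical_html).get("amphtml") == amp_url
        and link_rels(amp_html).get("canonical") == canonical_url
    )


# Hypothetical example pages (assumed URLs, not from any real site)
CANONICAL_URL = "https://example.com/article"
AMP_URL = "https://example.com/amp/article"
canonical_html = f'<html><head><link rel="amphtml" href="{AMP_URL}"></head></html>'
amp_html = f'<html amp><head><link rel="canonical" href="{CANONICAL_URL}"></head></html>'

print(is_paired(CANONICAL_URL, canonical_html, AMP_URL, amp_html))
```

The check compares URLs as exact strings, so relative hrefs or trailing-slash differences would need normalizing first. It also only inspects markup: a crawler has to be allowed to fetch both pages before either link can be seen.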

Keep both the AMP page and its canonical page crawlable
Do not block the images or other render-critical resources the AMP page references
AMP runtime scripts load from the AMP CDN (cdn.ampproject.org), which is outside the scope of your robots.txt
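One way to test the checklist above is to run the URLs an AMP page depends on through a robots.txt parser. This sketch uses Python's standard `urllib.robotparser`; the robots.txt content and URLs are hypothetical, and the standard-library parser does not replicate every detail of Google's matching (wildcards and longest-match precedence, for example), so treat the result as a first pass rather than a definitive answer.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the subset of urls that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical robots.txt that blocks a media directory
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /media/
"""

# Hypothetical URLs the AMP setup depends on
REQUIRED = [
    "https://example.com/article",         # canonical page
    "https://example.com/amp/article",     # AMP page
    "https://example.com/media/hero.jpg",  # image referenced by the AMP page
]

print(blocked_urls(ROBOTS_TXT, REQUIRED))
```

Here the two pages are crawlable but the image is not, which is the kind of rule that tends to surface as a blocked-resource warning.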

Common robots.txt mistakes with AMP

A typical error is disallowing a path like /amp/ to keep AMP URLs out of search. A Disallow rule stops crawling rather than indexing: it prevents Google from validating and serving the AMP pages, and it also hides any noindex or redirect you add to them later. If you want to retire AMP, remove the rel="amphtml" links and use proper redirects or noindex on the AMP URLs, and keep those URLs crawlable so Google can see the signals.
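For the redirect route, a small script can draft the mapping from retired AMP URLs to their canonical targets. This is a minimal sketch that assumes AMP URLs mirror the canonical path under a hypothetical /amp/ prefix; sites that use a different URL scheme (a query parameter or a subdomain, for instance) would need a different rule.

```python
from urllib.parse import urlsplit, urlunsplit

AMP_PREFIX = "/amp"  # assumption: AMP URLs mirror canonical paths under /amp/


def canonical_for(amp_url):
    """Return the canonical URL a retired AMP URL should redirect to, or None."""
    parts = urlsplit(amp_url)
    if not parts.path.startswith(AMP_PREFIX + "/"):
        return None  # not an AMP URL under the assumed scheme
    path = parts.path[len(AMP_PREFIX):]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def redirect_map(amp_urls):
    """Map each recognised AMP URL to its redirect target."""
    mapping = {}
    for url in amp_urls:
        target = canonical_for(url)
        if target:
            mapping[url] = target
    return mapping


print(redirect_map([
    "https://example.com/amp/article",
    "https://example.com/about",
]))
```

The resulting map can feed whatever redirect mechanism the server uses. The redirects only take effect for Google if the AMP URLs remain crawlable.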

Validate with AMP testing tools and with Search Console's AMP report and URL Inspection tool; a blocked-resource warning indicates that a robots.txt rule is interfering. Keep AMP pages crawlable for as long as they are part of your live AMP setup.

How it appears in analytics and logs

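If Google cannot fetch an AMP page or its resources, the AMP version may fail validation or not be served. A blocked-resource warning in AMP testing tools points to a robots.txt rule. In server logs, a blocked AMP path typically shows up as an absence of Googlebot requests for it, because a compliant crawler does not request disallowed URLs.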
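A quick log check can confirm whether requests identifying as Googlebot are reaching AMP paths at all. The sketch below assumes access logs in the common "combined" format and a hypothetical /amp/ path prefix; the sample lines are made up for illustration. It matches on the user-agent string only, which can be spoofed, so it does not prove a request really came from Google.

```python
import re
from collections import Counter

# Matches the request, status, and user-agent fields of a combined-format log line
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_amp_hits(lines, amp_prefix="/amp/"):
    """Count (path, status) pairs for Googlebot-labelled requests to AMP paths."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        if "Googlebot" not in match.group("agent"):
            continue
        if match.group("path").startswith(amp_prefix):
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits


# Made-up sample log lines
SAMPLE = [
    '66.249.66.1 - - [01/Jan/2026:10:00:00 +0000] "GET /amp/article HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [01/Jan/2026:10:00:05 +0000] "GET /amp/article HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '66.249.66.1 - - [01/Jan/2026:10:01:00 +0000] "GET /article HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for (path, status), count in googlebot_amp_hits(SAMPLE).items():
    print(path, status, count)
```

An empty result for AMP paths that are linked via rel="amphtml" is a prompt to look at robots.txt, not proof of a block by itself.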

Diagnostic use case

Avoid robots.txt rules that block AMP pages or their required resources, which can stop AMP from validating, being cached, or being served.

What WebmasterID can help detect

WebmasterID records which paths crawlers fetch, so you can confirm Googlebot is reaching your AMP pages and their resources rather than being blocked.

Common mistakes

Disallowing /amp/ and breaking AMP validation and serving.
Blocking the canonical page, severing the rel=amphtml / rel=canonical pairing.
Using robots.txt to retire AMP instead of removing amphtml links and redirecting.

Privacy and accuracy notes

AMP crawl topics concern page and resource files, not visitors. No personal data is involved in deciding which AMP paths are crawlable.

Sources and verification notes

Google, AMP on Google Search guidelines: AMP discovery via amphtml/canonical links and crawlability requirements.
Google, robots.txt introduction (do not block resources): Do not block resources needed to render/validate pages.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.