How crawlers handle a redirected robots.txt

When /robots.txt returns a 3xx redirect, crawlers must decide whether to follow it. This page explains how Google follows robots.txt redirects, the hop limit, and why redirecting the file (especially cross-host) can lead to unexpected crawl behavior.

How Google follows robots.txt redirects

Google documents that it follows at least five redirect hops when fetching robots.txt. If the chain ends in a 200 response, Google uses the robots.txt served there. If the chain does not resolve to a file within the hop limit, Google stops and treats the result as a 404 for robots.txt, which it handles as allow-all (no crawl restrictions).

So a robots.txt that redirects to a valid file generally works, but a long or looping chain can cause Google to give up and assume open crawling.

Google follows at least five redirect hops
A resolved 200 robots.txt is used
Unresolved redirects are treated like a 404 (allow-all)
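
The behavior summarized above can be sketched in a few lines of Python. This is an illustrative model only, not Google's implementation: the function name resolve_robots, the injected fetch callable, and the fixed limit of five hops are assumptions made for the example (Google states "at least five", so its real limit may be higher).

```python
from urllib.parse import urljoin

# Assumption: model the documented "at least five" hops as a fixed limit.
MAX_HOPS = 5


def resolve_robots(url, fetch, max_hops=MAX_HOPS):
    """Follow redirects for a robots.txt URL and classify the outcome.

    fetch is a hypothetical callable: fetch(url) -> (status, location, body).
    Returns (outcome, last_url, body), where outcome is one of:
      "use_file"  - the chain ended in a 200 response
      "allow_all" - the hop limit was exhausted (treated like a 404)
      "other"     - a non-200, non-redirect status, outside this sketch
    """
    current = url
    # One initial request plus up to max_hops redirect hops.
    for _ in range(max_hops + 1):
        status, location, body = fetch(current)
        if status == 200:
            return "use_file", current, body
        if 300 <= status < 400 and location:
            # Location may be relative, so resolve it against the current URL.
            current = urljoin(current, location)
            continue
        return "other", current, None
    return "allow_all", current, None
```

Passing fetch in as a parameter keeps the sketch testable without network access. In practice it could wrap any HTTP client configured not to follow redirects automatically.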

Cross-host and migration risks

robots.txt applies per host, so redirecting https://old.example.com/robots.txt to https://new.example.com/robots.txt means the crawler may apply rules written for new.example.com to URLs on old.example.com. During migrations this can silently change which rules govern which hostname.

The safe pattern is to serve a real robots.txt directly at each host's root rather than relying on redirects. If a redirect is unavoidable, keep the chain short and confirm the resolved file contains the rules you intend for that host.
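
One way to confirm that a resolved robots.txt still belongs to the host it is meant to govern is to compare the requested and final URLs. The sketch below is a minimal check, assuming that scheme, host, and port together define the scope of a robots.txt file; the helper names robots_scope and is_cross_host_redirect are hypothetical.

```python
from urllib.parse import urlsplit


def robots_scope(url):
    """Return the (scheme, host, port) that a robots.txt URL would govern."""
    parts = urlsplit(url)
    # Fill in the default port so that ":443" and no port compare as equal.
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def is_cross_host_redirect(requested_url, resolved_url):
    """True when the resolved file lives under a different scope."""
    return robots_scope(requested_url) != robots_scope(resolved_url)
```

For the example in this section, is_cross_host_redirect("https://old.example.com/robots.txt", "https://new.example.com/robots.txt") returns True, which is the signal to review the resolved rules before relying on them.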

How it appears in analytics and logs

If rules you expect are not being applied, check whether /robots.txt is redirecting. The crawler may have followed the redirect to a different file, or stopped following after too many hops.
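
As a rough illustration, server access logs can be scanned for /robots.txt requests that were answered with a 3xx status. The sketch below assumes logs in the common or combined log format; the regular expression and the function name redirected_robots_hits are assumptions for the example and may need adjusting for other log layouts.

```python
import re

# Assumption: request line and status appear as in the common/combined format,
# e.g. "GET /robots.txt HTTP/1.1" 301
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')


def redirected_robots_hits(lines):
    """Return (status, line) pairs for /robots.txt requests answered with 3xx."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        # Ignore any query string when comparing the path.
        path = match.group("path").split("?", 1)[0]
        status = int(match.group("status"))
        if path == "/robots.txt" and 300 <= status < 400:
            hits.append((status, line.rstrip("\n")))
    return hits
```

Any hit is a prompt to check where the redirect leads and whether the resolved file carries the rules intended for that host.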

Diagnostic use case

Avoid breaking crawl control when migrating sites or consolidating hosts, where /robots.txt may end up redirecting instead of serving rules directly.

What WebmasterID can help detect

WebmasterID records crawler requests to /robots.txt and the pages they then fetch, helping you spot when a redirect causes the wrong (or no) rules to apply.

Common mistakes

Redirecting robots.txt across hosts and applying the wrong host's rules.
Creating a long or looping redirect chain that resolves to allow-all.
Assuming a redirected robots.txt always serves the rules you expect.

Privacy and accuracy notes

Redirect handling concerns the robots.txt request itself, not visitors. No personal data is involved in how a crawler resolves the file's location.

Sources and verification notes

Google: How Google interprets robots.txt (redirect handling). Documents following at least five hops and the 404-equivalent fallback.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.