Robots & crawl control

robots.txt vs the X-Robots-Tag header

X-Robots-Tag carries the same indexing directives as the meta robots tag, but in the HTTP response header instead of the HTML body. That makes it the way to apply noindex or nofollow to non-HTML resources like PDFs and images, where a meta tag has nowhere to live.

Verified against primary sources

What X-Robots-Tag is

X-Robots-Tag is an HTTP response header that carries indexing directives such as noindex and nofollow, for example:

X-Robots-Tag: noindex

Google documents it as equivalent in effect to the meta robots tag, but delivered in the header. Because it lives in the response rather than the HTML, you can apply it to resources that have no HTML head — PDFs, images, plain-text files — and you can set it server-wide or per-path in your server configuration.
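
As a rough illustration of a per-path rule, the sketch below wraps a Python WSGI application so that responses for certain file types carry the header. It is not taken from the source and is not a substitute for your web server's own configuration; the names add_x_robots_tag, NOINDEX_SUFFIXES, demo_app and response_headers are hypothetical, and the list of suffixes is an assumption.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical rule: file extensions that should carry the header.
NOINDEX_SUFFIXES = (".pdf", ".png", ".jpg", ".txt")


def add_x_robots_tag(app, suffixes=NOINDEX_SUFFIXES, value="noindex"):
    """Wrap a WSGI app so responses for matching paths get an X-Robots-Tag header."""
    suffixes = tuple(suffixes)

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "").lower()

        def patched_start_response(status, headers, exc_info=None):
            if path.endswith(suffixes):
                # Drop any existing value so the header is not duplicated.
                headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
                headers.append(("X-Robots-Tag", value))
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that answers every path with a fixed body."""
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [b"file contents"]


def response_headers(app, path):
    """Call a WSGI app for one path and return its response headers as a dict."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    list(app(environ, start_response))
    return captured


app = add_x_robots_tag(demo_app)
print(response_headers(app, "/files/report.pdf").get("X-Robots-Tag"))  # noindex
print(response_headers(app, "/index.html").get("X-Robots-Tag"))        # None
```

Matching on the path suffix keeps the rule in one place, which is the same idea as a server-wide or per-path rule in server configuration.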

Same indexing layer as meta robots

Like the meta robots tag, X-Robots-Tag controls indexing, not crawling. The crawler must be allowed to fetch the resource to see the header, so do not pair it with a robots.txt Disallow on the same URL — that would hide the very header you are relying on.

Use robots.txt for crawl control, and choose between the meta tag (HTML pages) and X-Robots-Tag (non-HTML or bulk server rules) for index control.

Header-level indexing control (noindex, nofollow)
Works on non-HTML files where meta robots cannot
Crawler must fetch the resource to read it — do not also Disallow it
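
The conflict described above can be checked mechanically. The sketch below uses Python's standard urllib.robotparser to flag URLs that rely on X-Robots-Tag but are disallowed in robots.txt. The robots.txt content, the example.com URLs and the helper name header_is_readable are made up for illustration, and the standard-library parser's path matching may differ from how a particular crawler interprets the same file.

```python
from urllib.robotparser import RobotFileParser


def header_is_readable(robots_txt, url, user_agent="*"):
    """Return True if robots.txt lets the crawler fetch the URL and so see its headers."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks one directory.
robots_txt = """\
User-agent: *
Disallow: /private-docs/
"""

# Hypothetical URLs that are served with "X-Robots-Tag: noindex".
noindex_urls = [
    "https://example.com/private-docs/report.pdf",
    "https://example.com/downloads/guide.pdf",
]

for url in noindex_urls:
    if not header_is_readable(robots_txt, url):
        # The crawler cannot fetch this URL, so it never sees the noindex header.
        print("conflict:", url)
```

With these inputs only the URL under /private-docs/ is reported, because that is the one the crawler is not allowed to fetch.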

How it appears in analytics and logs

An X-Robots-Tag header controls indexing of the resource whose response carries it. As with meta robots, the crawler must be able to fetch the resource to read the header; if robots.txt disallows the URL, the header is never seen.
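
Once you have the header values from a fetched response, a small parser can tell you whether they apply noindex to a given crawler. The sketch below is an assumption-laden illustration, not a full implementation of any crawler's rules: it handles the plain form (noindex), the user-agent-prefixed form (googlebot: noindex), and treats none as implying noindex. The function names and the VALUED_DIRECTIVES list are hypothetical.

```python
# Directives that take a value after a colon, so their name is not a user agent.
# This list is an assumption based on commonly documented directives.
VALUED_DIRECTIVES = {
    "unavailable_after",
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
}


def parse_x_robots_tag(value):
    """Split one X-Robots-Tag value into (user agent or None, set of directives)."""
    agent = None
    head, sep, rest = value.partition(":")
    # Treat the text before the first colon as a user agent unless it looks
    # like a directive list or a directive that carries its own value.
    if sep and "," not in head and head.strip().lower() not in VALUED_DIRECTIVES:
        agent = head.strip().lower()
        value = rest
    directives = {part.strip().lower() for part in value.split(",") if part.strip()}
    return agent, directives


def blocks_indexing(header_values, crawler="googlebot"):
    """Return True if any header value applies noindex (or none) to the crawler."""
    for value in header_values:
        agent, directives = parse_x_robots_tag(value)
        applies = agent is None or agent == crawler.lower()
        if applies and directives & {"noindex", "none"}:
            return True
    return False


print(blocks_indexing(["noindex"]))                                    # True
print(blocks_indexing(["otherbot: noindex, nofollow"]))                # False
print(blocks_indexing(["googlebot: nofollow", "googlebot: noindex"]))  # True
```

The list argument reflects that a response may carry more than one X-Robots-Tag header; each value is checked separately.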

Diagnostic use case

Apply noindex/nofollow to non-HTML files (PDFs, images, feeds) and to many URLs at once via server config, where a meta robots tag is not possible.

What WebmasterID can help detect

WebmasterID shows whether crawlers are still fetching a resource, helping you confirm the crawler can actually reach it to read your X-Robots-Tag header.

Common mistakes

Disallowing a URL in robots.txt so the X-Robots-Tag header is never read.
Expecting X-Robots-Tag to block crawling — it controls indexing only.
Forgetting that it is the only practical noindex route for PDFs and images.

Privacy and accuracy notes

X-Robots-Tag is an indexing signal, not access control: anyone who has the URL can still fetch the resource. A resource that must be truly private still needs authentication.

Sources and verification notes

Google: Robots meta tag and X-Robots-Tag. Documents X-Robots-Tag header directives and use cases.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.