robots.txt vs the X-Robots-Tag header
X-Robots-Tag carries the same indexing directives as the meta robots tag, but in the HTTP response header instead of the HTML body. That makes it the way to apply noindex or nofollow to non-HTML resources like PDFs and images, where a meta tag has nowhere to live.
What X-Robots-Tag is
X-Robots-Tag is an HTTP response header that carries indexing directives such as noindex and nofollow, for example:
X-Robots-Tag: noindex
Google documents it as equivalent in effect to the meta robots tag, but delivered in the header. Because it lives in the response rather than the HTML, you can apply it to resources that have no HTML head — PDFs, images, plain-text files — and you can set it server-wide or per-path in your server configuration.
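The per-path idea can be sketched in Python. This is a minimal illustration, not a server configuration: the function name `robots_headers_for` and the extension list `NOINDEX_EXTENSIONS` are hypothetical, and in practice the equivalent rule would usually live in your web server or application framework.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical policy: file extensions that should carry a noindex header.
NOINDEX_EXTENSIONS = {".pdf", ".png", ".jpg", ".txt"}


def robots_headers_for(url_path: str) -> list[tuple[str, str]]:
    """Return the extra response headers to send for a request path.

    Resources matched by extension get X-Robots-Tag: noindex;
    everything else gets no extra header.
    """
    # Drop any query string before looking at the extension.
    path = urlsplit(url_path).path
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in NOINDEX_EXTENSIONS:
        return [("X-Robots-Tag", "noindex")]
    return []
```

Under these assumptions, a request for /docs/report.pdf would be answered with the header shown above, while an HTML page would receive no extra header and could use a meta robots tag instead.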
Same indexing layer as meta robots
Like the meta robots tag, X-Robots-Tag controls indexing, not crawling. The crawler must be allowed to fetch the resource to see the header, so do not pair it with a robots.txt Disallow on the same URL — that would hide the very header you are relying on.
Use robots.txt for crawl control, and choose between the meta tag (HTML pages) and X-Robots-Tag (non-HTML or bulk server rules) for index control.
- Header-level indexing control (noindex, nofollow)
- Works on non-HTML files where meta robots cannot
- Crawler must fetch the resource to read it — do not also Disallow it
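One way to check for the Disallow conflict is with Python's standard-library `urllib.robotparser`. This is a rough sketch: the helper name `header_is_readable`, the sample robots.txt and the example.com URLs are assumptions for illustration, and the standard-library parser does not reproduce every matching rule a particular search engine applies (wildcards, for instance), so treat the result as a first check only.

```python
from urllib.robotparser import RobotFileParser


def header_is_readable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt lets user_agent fetch url.

    If this is False, the crawler never requests the URL and so
    never sees any X-Robots-Tag header sent with it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative robots.txt content, not taken from a real site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

blocked = header_is_readable(
    EXAMPLE_ROBOTS_TXT, "Googlebot", "https://example.com/private/report.pdf"
)
reachable = header_is_readable(
    EXAMPLE_ROBOTS_TXT, "Googlebot", "https://example.com/docs/report.pdf"
)
```

Here `blocked` is False and `reachable` is True: a noindex header on the PDF under /private/ would go unread, while the one under /docs/ can be fetched and honored.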
How it appears in analytics and logs
X-Robots-Tag appears in the HTTP response headers of a resource, not in its HTML, and it controls indexing of that resource. As with meta robots, the crawler must be able to fetch the resource to read the header; the header on a disallowed URL is never seen.
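If you have raw access logs, a short script can show whether a crawler is requesting the resource at all. The sketch below assumes the Combined Log Format and matches crawlers by a substring of the user-agent string; the function name `crawler_fetches` and the sample log line are made up for illustration. User-agent strings can be spoofed, so this is an indication rather than proof.

```python
import re

# Assumes Combined Log Format; adjust the pattern for your server's format.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_fetches(log_lines, path, agent_token="Googlebot"):
    """Count requests for `path` whose user agent contains `agent_token`."""
    count = 0
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match["path"] == path and agent_token in match["agent"]:
            count += 1
    return count


# Fabricated sample line, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2025:10:00:00 +0000] '
    '"GET /docs/report.pdf HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hits = crawler_fetches(SAMPLE_LOG, "/docs/report.pdf")
```

A count of zero over a long period suggests the crawler is not reaching the resource, in which case it cannot be reading the header either.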
Diagnostic use case
Apply noindex/nofollow to non-HTML files (PDFs, images, feeds) and to many URLs at once via server config, where a meta robots tag is not possible.
What WebmasterID can help detect
WebmasterID shows whether crawlers are still fetching a resource, helping you confirm the crawler can actually reach it to read your X-Robots-Tag header.
Common mistakes
- Disallowing a URL in robots.txt so the X-Robots-Tag header is never read.
- Expecting X-Robots-Tag to block crawling — it controls indexing only.
- Forgetting that it is the only practical noindex route for non-HTML files such as PDFs and images.
Privacy and accuracy notes
X-Robots-Tag is an indexing signal, not access control. Anyone with the URL can still fetch the resource, so it needs authentication to be truly private.
Related pages
- robots.txt vs the meta robots tag
robots.txt and the meta robots tag solve different problems. robots.txt asks crawlers not to fetch a path; the meta robots tag, embedded in a page's HTML, tells search engines whether to index it. The classic mistake is using Disallow to remove a page from search — which can backfire.
- The noindex meta tag
The noindex value of the meta robots tag tells search engines to keep a page out of their index. The catch trips people up constantly: for noindex to work, the crawler must be able to fetch the page — so you must not block the same URL in robots.txt.
- Website observability
Confirm crawlers can reach the resources you annotate.
Sources and verification notes
- Google: Robots meta tag and X-Robots-Tag. Documents X-Robots-Tag header directives and use cases.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.