Robots & crawl control

noindex in robots.txt is unsupported

Some operators once added a noindex line to robots.txt to keep pages out of search. The directive was never part of the standard, and Google stopped honouring it on 1 September 2019. This page explains why the directive does nothing in robots.txt and which mechanisms actually remove a page from the index.

Verified against primary sources

Why it never worked as standard

A robots.txt noindex was an unofficial directive that some crawlers experimentally honoured. It was never part of the Robots Exclusion Protocol, whose robots.txt rules govern crawling (Allow / Disallow), not indexing. In July 2019 Google announced it would retire the code that handled unsupported and unpublished rules, including noindex in robots.txt, effective 1 September 2019.

Since then, a noindex line in robots.txt has no effect on Google. Other engines never reliably supported it either, so it should be treated as dead.
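To audit an existing file for the dead directive, a short script can flag it. What follows is a minimal Python sketch, not an official tool: the function name find_unsupported_noindex and the sample file are illustrative assumptions, and it simply treats any line whose field name is noindex (case-insensitive) as a hit.

```python
def find_unsupported_noindex(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for noindex rules found in robots.txt text.

    Hypothetical helper for illustration; it does not validate the rest of the file.
    """
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        field, sep, _value = line.partition(":")
        if sep and field.strip().lower() == "noindex":
            hits.append((number, line))
    return hits


# Assumed sample file: line 3 carries the unsupported directive.
SAMPLE = """\
User-agent: *
Disallow: /private/
Noindex: /old-page/
"""

print(find_unsupported_noindex(SAMPLE))  # [(3, 'Noindex: /old-page/')]
```

Any line this reports can be deleted from robots.txt; the pages it names need one of the supported methods instead.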

Supported ways to deindex

To remove a page from the index, use a directive the page itself carries, which means the page must remain crawlable so the directive can be read:

- A robots meta tag with noindex in the page HTML.
- An X-Robots-Tag: noindex HTTP header, useful for non-HTML files like PDFs.
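The two supported signals can be checked with a small script. This is a hedged Python sketch using only the standard library: the names RobotsMetaParser, html_has_noindex, and header_has_noindex, the sample page, and the sample headers are all assumptions made for illustration. It looks only at the generic robots meta tag and at plain comma-separated X-Robots-Tag values, so crawler-specific tags and prefixed header values (for example a value that names a particular bot) are not handled.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def html_has_noindex(html: str) -> bool:
    """True if the HTML carries a generic robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def header_has_noindex(headers: dict[str, str]) -> bool:
    """True if an X-Robots-Tag response header lists noindex (simplified check)."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [part.strip().lower() for part in value.split(",")]:
                return True
    return False


# Assumed examples of the two supported forms.
PAGE = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
PDF_HEADERS = {"Content-Type": "application/pdf", "X-Robots-Tag": "noindex"}

print(html_has_noindex(PAGE), header_has_noindex(PDF_HEADERS))  # True True
```

In practice the HTML and headers would come from fetching the live URL; they are hard-coded here so the sketch runs without network access.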

Do not combine these with a robots.txt Disallow on the same URL: if the page is blocked from crawling, the crawler cannot see the noindex and the URL can linger in results as a bare link. For urgent removals from Google, also use the Removals tool in Search Console.
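The Disallow conflict described here can be tested before it causes trouble. The sketch below uses Python's standard urllib.robotparser to ask whether a URL is crawlable under a given robots.txt; the helper names is_crawlable and noindex_is_hidden, the example.com URLs, and the sample rules are hypothetical. Python's parser does not reproduce every detail of how a particular search engine matches rules, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Approximate check: may user_agent fetch url under these robots.txt rules?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def noindex_is_hidden(robots_txt: str, url: str, page_has_noindex: bool) -> bool:
    """True when a page carries noindex but crawlers are blocked from reading it."""
    return page_has_noindex and not is_crawlable(robots_txt, url)


# Assumed rules: everything under /private/ is blocked from crawling.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

# A noindex on this page would never be seen, because the URL is disallowed.
print(noindex_is_hidden(ROBOTS, "https://example.com/private/report.html", True))  # True
# This page is crawlable, so its noindex can be read and applied.
print(noindex_is_hidden(ROBOTS, "https://example.com/public.html", True))  # False
```

A True result means the Disallow rule should be lifted for that URL until the page has dropped out of the index.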

Use a noindex robots meta tag in the page HTML
Use an X-Robots-Tag: noindex header for non-HTML files
Do not Disallow a URL you want noindexed — the directive must be crawlable

How it appears in analytics and logs

A noindex directive inside robots.txt is ignored by Google and is not part of the protocol. If pages you tried to deindex this way are still in results, the unsupported directive is the reason.

Diagnostic use case

Stop relying on a noindex line in robots.txt and switch to a supported deindexing method so pages you want hidden actually leave the index.

What WebmasterID can help detect

WebmasterID surfaces which crawlers reach a page, helping you confirm that a page you want deindexed is still crawlable enough for a noindex meta tag or header to be seen and applied.

Common mistakes

Adding noindex to robots.txt and expecting pages to drop from search.
Blocking a URL with Disallow while also relying on noindex — the crawler never sees the noindex.
Assuming the unofficial directive still works on any major engine.

Privacy and accuracy notes

This concerns indexing directives only and never involves visitor identity. WebmasterID records crawler fetches of the affected pages as bot events, separate from human analytics.


Sources and verification notes

Google: A note on unsupported rules in robots.txt (2019). Google stopped supporting noindex in robots.txt from 1 September 2019.
Google: Block search indexing with noindex. Supported noindex via meta tag and X-Robots-Tag header.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.