Crawl diagnostics

HTTP 401 Unauthorized and crawling

401 Unauthorized means the request lacks valid authentication credentials for the resource. Crawlers do not log in, so a page behind a 401 cannot be fetched or indexed. Seeing 401s for content you intended to be public usually means an auth layer is misconfigured or applied too broadly.

Verified against primary sources

What 401 means

401 Unauthorized indicates the request has not been applied because it lacks valid authentication credentials for the target resource. The HTTP specification requires the response to carry a WWW-Authenticate header with at least one challenge describing how to authenticate. Despite the name, 401 is about authentication (who you are), whereas 403 is about authorization (whether you are allowed).
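To make the 401/403 distinction concrete, here is a minimal sketch that summarizes an auth-related response from its status code and headers. The function name describe_challenge and the wording of the summaries are illustrative assumptions, not part of any WebmasterID API.

```python
def describe_challenge(status: int, headers: dict[str, str]) -> str:
    """Summarize an auth-related response for a crawl report (hypothetical helper)."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 401:
        challenge = lowered.get("www-authenticate", "")
        # The scheme is the first token of the challenge, e.g. "Basic" or "Bearer".
        scheme = challenge.split(None, 1)[0] if challenge.strip() else "unknown"
        return f"401: authentication required (scheme: {scheme})"
    if status == 403:
        return "403: request understood but access refused"
    return f"{status}: not an auth challenge"
```

A response with status 401 and the header WWW-Authenticate: Basic realm="staging" would be summarized with the scheme Basic, which is often a hint that a basic-auth rule is in front of the page.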

Crawlers do not authenticate, so a 401 stops them at the door.

Why crawlers cannot index 401 pages

A search or AI crawler issues anonymous requests. When it receives a 401, it has no credentials to supply and cannot retrieve the content, so that URL will not be indexed. This is correct for genuinely private pages.

The problem case is public content that unintentionally sits behind a 401, for example because a staging auth rule was left on production or an access layer is scoped too broadly. The fix is to remove or narrow the authentication so that public URLs are reachable without credentials.

401 = authentication required (who you are)
403 = request understood but refused; supplying credentials does not help (authorization)
Crawlers send no credentials, so 401 content is not indexed

Operator checklist

Confirm public pages return 200 and are not behind an auth challenge.
Check for staging/basic-auth rules accidentally active in production.
Keep genuinely private content behind 401 deliberately.
Do not rely on robots.txt alone to protect private content.
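The first checklist item can be sketched as a small script that requests each public URL the way a crawler would, with no credentials or cookies, and lists those that answer 401. The function names, the User-Agent string, and the example URLs are assumptions for illustration. Network failures (urllib.error.URLError) are deliberately left unhandled here and would need handling in real use.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of an anonymous GET (no credentials, no cookies)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "public-page-check/0.1"}  # hypothetical agent name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return error.code


def find_auth_walled(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> list[str]:
    """Return the URLs that respond 401 to an anonymous request."""
    return [url for url in urls if fetch(url) == 401]


if __name__ == "__main__":
    # Replace with the URLs you intend to be public.
    for url in find_auth_walled(["https://example.com/", "https://example.com/docs"]):
        print("behind authentication:", url)
```

Passing the fetcher as a parameter keeps the check easy to test offline: a dictionary lookup can stand in for real requests.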

How it appears in analytics and logs

In access logs, a 401 marks a request to a resource that requires authentication the request did not provide. Crawlers that receive it cannot proceed, so the page is not indexed. Unexpected 401s on public URLs point to a misapplied auth or access layer.
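As a rough illustration, the sketch below scans access-log lines and counts the paths where a crawler received a 401. It assumes the common "combined" log format used by many web servers, and the crawler token list is a short illustrative sample rather than a complete one. User-agent strings can be spoofed, so treat the result as a starting point rather than proof of crawler identity.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer, and user agent of a combined-format line.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative, non-exhaustive user-agent tokens (assumption: adjust to your traffic).
CRAWLER_TOKENS = ("googlebot", "bingbot", "gptbot")


def crawler_401_paths(lines):
    """Count 401 responses per path for requests whose user agent looks like a crawler."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if not match or match["status"] != "401":
            continue
        agent = match["agent"].lower()
        if any(token in agent for token in CRAWLER_TOKENS):
            counts[match["path"]] += 1
    return counts
```

Calling crawler_401_paths(open("access.log")) (the file name is a placeholder) returns a Counter keyed by path. Paths that appear there but are meant to be public are the ones to investigate.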

Diagnostic use case

Confirm that public pages are not accidentally behind a 401, and understand why authenticated content stays out of crawler indexes.

What WebmasterID can help detect

WebmasterID can surface URLs where crawlers receive 401s, helping you catch public pages accidentally placed behind authentication.

Common mistakes

Leaving staging basic-auth enabled on production, blocking crawlers.
Confusing 401 (authentication) with 403 (authorization).
Expecting crawlers to index pages they cannot authenticate into.

Privacy and accuracy notes

Status codes carry no personal data, and authentication challenges expose no visitor identity to WebmasterID. WebmasterID reports 401 patterns for crawler traffic without exposing individual visitors.


Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.