HTTP response codes and AI crawlers

AI crawlers act on the HTTP status you return. A 200 invites ingestion; 301/308 moves them to a new URL; 401 or 403 signals refusal; 404/410 says the page is gone; 429 asks them to slow down; 5xx says try again later. Returning the right code is how you steer a compliant AI crawler without blunt blocking, while the wrong code can keep misleading it long after you meant to send a different signal.

Verified against primary sources

Codes that invite or move crawling

A 200 OK is an invitation: the content is here, ingest it. A 301 or 308 permanent redirect tells the crawler the URL has moved for good, so it should follow the redirect and update its reference. A 302 or 307 signals a temporary move, so the crawler should follow it for now but keep the original URL as its reference and request it again later.

Use redirects deliberately. A permanent move deserves 301/308 so the crawler consolidates on the new URL; using a temporary redirect for a permanent move keeps the crawler returning to the old address.
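The choice between permanent and temporary redirects can be sketched as a small helper. This is a minimal illustration, not a prescribed implementation: the function names and the preserve_method parameter are assumptions made for the example. It relies on the standard HTTP distinction that 307 and 308 require the client to repeat the request with the same method, while 301 and 302 allow a POST to be changed to a GET.

```python
def redirect_status(permanent: bool, preserve_method: bool = False) -> int:
    """Pick a redirect status code for a moved URL (illustrative helper)."""
    if permanent:
        # Permanent move: the crawler should consolidate on the new URL.
        return 308 if preserve_method else 301
    # Temporary move: the crawler should keep the original URL as its reference.
    return 307 if preserve_method else 302


def redirect_response(new_url: str, permanent: bool, preserve_method: bool = False):
    """Return a (status, headers) pair for the redirect."""
    return redirect_status(permanent, preserve_method), {"Location": new_url}
```

For a page that has moved for good, redirect_response("/new-path", permanent=True) yields a 301 with a Location header, which is the signal a crawler needs to stop returning to the old address.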

Codes that refuse, retire, or throttle

403 Forbidden and 401 Unauthorized signal refusal: 403 says the crawler is not allowed here, and 401 says the request lacks valid credentials. 404 Not Found says the page is missing; 410 Gone says it is intentionally and permanently removed, which is a stronger, faster signal to drop the URL. 429 Too Many Requests with a Retry-After header asks the crawler to slow down rather than go away.

Choosing precisely matters: a compliant crawler can treat 410 as final and drop the URL sooner than it would after a 404, and 429 throttles where 403 would read as a permanent block. Each code communicates a different intent to a compliant crawler.
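One way to picture the decision is a single function that maps your intent to a status code. The sketch below is hypothetical: the parameter names, the order of the checks, and the default 120-second delay are assumptions chosen for illustration, not values taken from any crawler's documentation.

```python
def crawler_response(*, crawler_allowed: bool, over_rate_limit: bool,
                     url_retired: bool, url_exists: bool,
                     retry_after_seconds: int = 120):
    """Map server intent to a (status, headers) pair for a crawler request."""
    if not crawler_allowed:
        # Refuse: this crawler is not permitted to fetch the resource.
        return 403, {}
    if over_rate_limit:
        # Throttle: ask the crawler to come back later instead of blocking it.
        return 429, {"Retry-After": str(retry_after_seconds)}
    if url_retired:
        # Retire: the URL was removed on purpose and will not return.
        return 410, {}
    if not url_exists:
        # Missing: nothing is known at this URL.
        return 404, {}
    # Invite: the content is here.
    return 200, {}
```

Checking the refusal first and the throttle second keeps a blocked crawler from ever seeing a 429 that suggests it may retry; a different ordering may suit a different policy.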

Server errors and crawl behaviour

5xx responses tell a crawler your origin is having trouble. Compliant crawlers typically back off and retry later when they see sustained 5xx, which can slow your crawl coverage if errors persist. A flood of 500s during a crawl wave is both a reliability problem and a crawl-coverage problem.

Avoid serving 200 for error or empty pages. Such a soft 404 hides the problem and wastes crawl budget, because the crawler ingests the error page as if it were content. Return honest status codes so crawlers behave the way the code intends.
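A rough audit for soft 404s can be sketched as a heuristic over the status and body your origin returned. Everything here is an assumption made for illustration: the phrase list, the 50-character minimum, and the function name are hypothetical and would need tuning against real pages.

```python
# Hypothetical phrases that suggest an error page; adjust for your own templates.
ERROR_PHRASES = ("page not found", "no longer available", "nothing found")


def looks_like_soft_404(status: int, body: str, min_length: int = 50) -> bool:
    """Flag 200 responses whose body is near-empty or reads like an error page."""
    if status != 200:
        # An honest non-200 status is not a soft 404.
        return False
    text = body.strip().lower()
    if len(text) < min_length:
        # A near-empty body served with 200 is suspicious.
        return True
    return any(phrase in text for phrase in ERROR_PHRASES)
```

A flagged URL is a candidate for a real 404 or 410, not proof of a problem: a legitimate article that quotes the phrase "page not found" would also match.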

How it appears in analytics and logs

The status your origin returns to an AI token tells you how it will likely behave next: repeated 200s mean ongoing ingestion, sustained 429s mean throttling is engaging, and 5xx spikes may slow crawling because the crawler treats your origin as unstable.
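To see these patterns in your own logs, you can tally status codes per crawler token. The sketch below assumes access logs in the common "combined" format and matches on the user-agent string; the token list is an illustrative sample, and the regular expression and function name are assumptions for this example rather than a description of how WebmasterID works.

```python
import re
from collections import Counter, defaultdict

# Illustrative sample of AI crawler user-agent tokens; extend as needed.
AI_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Matches: "METHOD PATH PROTOCOL" STATUS BYTES "REFERER" "USER-AGENT"
LOG_LINE = re.compile(
    r'"\S+ \S+ \S+" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def status_counts_by_token(lines):
    """Count HTTP statuses returned to each AI crawler token."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # Skip lines that are not in the assumed format.
        agent = match.group("agent").lower()
        for token in AI_TOKENS:
            if token.lower() in agent:
                counts[token][int(match.group("status"))] += 1
                break
    return counts
```

A token whose counter is dominated by 429 or 5xx entries is one whose crawling is likely being slowed. User-agent strings can be spoofed, so treat these counts as a first signal, not as verification of the crawler's identity.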

Diagnostic use case

Use precise HTTP status codes to guide AI crawlers — throttle with 429, retire URLs with 410, refuse with 403 — instead of relying on ambiguous responses that confuse their behaviour.

What WebmasterID can help detect

WebmasterID records the status returned to each AI token per URL, so on the observability and bot-intelligence surfaces you can confirm whether crawlers receive the codes you intend.

Common mistakes

Privacy and accuracy notes

Response-code handling concerns crawler behaviour and server policy, not visitor identity. Analysis keys on the crawler token and status returned; no human data is involved.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.