
robots.txt and page rendering

Google indexes the rendered version of a page, which its Web Rendering Service produces in a second pass after the initial HTML fetch. When robots.txt rules block render-critical resources, the renderer skips them and the rendered DOM comes out incomplete. This page explains the rendering pipeline and how robots.txt interacts with it.

Verified against primary sources

How rendering interacts with robots.txt

Google processes a page in two phases: it first fetches the HTML, then queues the page for rendering, where headless Chromium executes the page's scripts and fetches the resources it references. Every resource the renderer requests (JavaScript, CSS, fonts, images, and data fetched by client-side calls) is itself subject to the robots.txt of the host that serves it.

If a resource's URL is disallowed, the renderer does not fetch it. The page then renders without that resource, so content injected by a blocked script, or layout from blocked CSS, is missing from the rendered DOM that Google indexes.

Rendering is a separate phase that fetches page resources
Each resource URL is checked against robots.txt
Blocked resources are simply omitted from the rendered page
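
The per-resource check can be approximated offline with Python's standard urllib.robotparser module. This is a minimal sketch, not Google's implementation: the robots.txt rules and resource URLs are hypothetical, and the module matches plain path prefixes in file order, whereas Google applies the most specific matching rule and supports the * and $ wildcards, so results can differ for more complex rule sets.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for https://example.com (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Disallow: /static/js/
"""

# Hypothetical resources a page on that host references during render.
RESOURCES = [
    "https://example.com/static/css/site.css",
    "https://example.com/static/js/app.js",
    "https://example.com/api/products.json",
]


def blocked_resources(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


for url in blocked_resources(ROBOTS_TXT, RESOURCES):
    print("blocked:", url)
```

With these sample rules, the script and the JSON endpoint are reported as blocked while the stylesheet is allowed.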

What to keep crawlable for rendering

Keep render-critical resources allowed: the JS that builds main content, the CSS that defines layout, fonts that affect rendering, and any same-site or third-party data endpoints the page calls during render. Blocking a JSON endpoint that supplies a page's content can leave the rendered page empty even though the HTML looks fine.
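
Because a robots.txt file only governs URLs on the host that serves it, a useful first step is to work out which robots.txt files control a page's resources. The sketch below is one possible approach using only the standard library; the page URL and markup are hypothetical, and it sees only resources referenced in the static HTML, so endpoints called from JavaScript at render time still have to be listed by hand.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect script, stylesheet, and image URLs referenced in static HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        if tag in ("script", "img") and attrs.get("src"):
            self.urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.urls.append(urljoin(self.base_url, attrs["href"]))


def group_by_robots_file(page_url, html):
    """Map each governing robots.txt URL to the resource URLs it controls."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    groups = {}
    for url in collector.urls:
        parts = urlsplit(url)
        robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
        groups.setdefault(robots_url, []).append(url)
    return groups


# Hypothetical page and markup (assumptions for illustration).
PAGE_URL = "https://example.com/products/widget"
HTML = """
<link rel="stylesheet" href="/static/site.css">
<script src="https://cdn.example.net/app.js"></script>
<img src="hero.png">
"""

for robots_url, urls in group_by_robots_file(PAGE_URL, HTML).items():
    print(robots_url, urls)
```

Resources on a third-party host fall under that host's robots.txt, which the site owner usually cannot change, so a blocked third-party dependency may need to be self-hosted or replaced.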

Verify with URL Inspection's "Test live URL": review the rendered screenshot, the rendered HTML, and the list of resources Google could not load. Anything flagged as blocked by robots.txt that affects content should be allowed.

How it appears in analytics and logs

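A rendered page that is missing content often means the renderer could not fetch a resource, although script errors and timeouts can produce the same symptom. Resources blocked by robots.txt are omitted from URL Inspection's rendered output and flagged in its resource list. In server logs, a blocked resource shows up as an absence: Googlebot never requests the URL.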
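
One way to spot an unfetched resource in server access logs is to list the render-critical paths that Googlebot never requested. The sketch below assumes the common "combined" access-log format; the log lines and the list of required paths are hypothetical. It matches on the user-agent string only, which can be spoofed, and a path missing from a short log window may also reflect caching rather than a robots.txt block.

```python
import re

# Assumes the "combined" access-log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def paths_requested_by(log_lines, agent_token="Googlebot"):
    """Return the set of paths (query string stripped) requested by matching agents."""
    seen = set()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and agent_token.lower() in match.group("agent").lower():
            seen.add(match.group("path").split("?", 1)[0])
    return seen


def never_requested(required_paths, log_lines, agent_token="Googlebot"):
    """Return the required paths that no matching agent requested."""
    seen = paths_requested_by(log_lines, agent_token)
    return [path for path in required_paths if path not in seen]


# Hypothetical log sample and render-critical paths (assumptions for illustration).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
LOG_LINES = [
    f'66.249.66.1 - - [24/Jun/2026:10:00:00 +0000] "GET /products/widget HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [24/Jun/2026:10:00:05 +0000] "GET /static/css/site.css HTTP/1.1" 200 830 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [24/Jun/2026:10:01:00 +0000] "GET /api/products.json HTTP/1.1" 200 912 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
REQUIRED = ["/static/css/site.css", "/static/js/app.js", "/api/products.json"]

print(never_requested(REQUIRED, LOG_LINES))
```

In this sample the JSON endpoint was requested only by a regular browser, so it still counts as never requested by Googlebot.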

Diagnostic use case

Understand how robots.txt rules shape what Google's renderer can fetch, so blocked JS, CSS, fonts, or API responses do not silently degrade the indexed page.

What WebmasterID can help detect

WebmasterID records which resource paths crawlers request, so you can see whether Googlebot's renderer reached the JS, CSS, and data endpoints a page needs.

Common mistakes

Blocking a data endpoint the page calls during render, leaving content missing.
Assuming the raw HTML is what Google indexes, when it actually indexes the rendered DOM.
Ignoring the blocked-resources list in URL Inspection's rendered output.

Privacy and accuracy notes

Rendering topics concern resource files and crawl behavior, not visitors. No personal data is involved in which resources the renderer may fetch.


Sources and verification notes

Google: Understand the JavaScript SEO basics (rendering). Two-phase crawl/render and resource fetching during rendering.
Google: How Google interprets robots.txt. Resource URLs are subject to robots.txt during rendering.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.