How to test your robots.txt
A robots.txt rule is only useful if it does what you think it does. This page covers how to test it: checking the live file, using Google Search Console's robots.txt report and URL Inspection, and confirming in your own logs whether the intended crawlers are fetching the affected URLs.
Check the file and use Google's tools
First confirm the live file: fetch https://yourdomain/robots.txt and check it returns 200 with the rules you expect. Then use Google Search Console: its robots.txt report shows the fetched file and parsing issues, and the URL Inspection tool reports whether a specific URL is allowed or blocked for Googlebot and why.
These tools show how Google reads your file and whether it treats a given URL as allowed or blocked, which is exactly what you need when a pattern or group precedence is ambiguous.
- Fetch the live robots.txt and confirm a 200 response
- Use Search Console's robots.txt report for parsing issues
- Use URL Inspection to test a specific URL against Googlebot
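The first check in the list above can be scripted. The following is a minimal sketch using only the Python standard library; the function names `fetch_robots` and `find_problems`, the `robots-check-sketch` user-agent string, and the example rule are illustrative assumptions, not part of any Google tool. Note that `urlopen` follows redirects, so the status reported is that of the final response.

```python
import urllib.error
import urllib.request


def fetch_robots(url, timeout=10.0):
    """Fetch a robots.txt URL and return (status_code, body_text)."""
    # The User-Agent value is a made-up label for this sketch.
    request = urllib.request.Request(url, headers={"User-Agent": "robots-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; report the status instead of crashing.
        return err.code, ""


def find_problems(status, body, expected_lines):
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Compare whole lines, ignoring surrounding whitespace.
    present = {line.strip() for line in body.splitlines()}
    for expected in expected_lines:
        if expected.strip() not in present:
            problems.append(f"missing rule: {expected}")
    return problems
```

To use it, pass the result of `fetch_robots("https://yourdomain/robots.txt")` to `find_problems` along with the rule lines you expect to see, for example `["Disallow: /private/"]`. This only confirms that the file is served and contains the expected text; it does not predict how any crawler interprets those rules.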
Confirm the real-world effect
A tester predicts behaviour; logs confirm it. After deploying a change, watch whether the affected crawlers actually stop or start fetching the URLs in question. A compliant crawler should follow the new rule once it has refetched robots.txt; Google, for example, generally caches the file for up to 24 hours, so allow at least that long before judging. If behaviour still differs, recheck your user-agent token spelling, path casing, and group precedence.
For non-Google crawlers, rely on the crawler's own documentation plus observed behaviour, since each may parse edge cases slightly differently.
- Testers predict; observed crawl behaviour confirms
- Allow a crawl cycle before judging the effect
- Recheck token, casing, and precedence if behaviour differs
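One way to confirm the effect is to count a crawler's requests to the affected paths before and after the deploy. The sketch below assumes access logs in the common "combined" format and an English-locale month abbreviation in the timestamp; the names `count_fetches` and `parse_log_time` are hypothetical, and the regular expression is an assumption about your log layout that may need adjusting. It matches crawlers by user-agent substring, which can be spoofed, so treat the counts as indicative.

```python
import re
from datetime import datetime

TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Captures the timestamp, request path, and user agent of a combined-format line.
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def parse_log_time(text):
    """Parse a timestamp such as '11/Jun/2026:00:00:00 +0000' (timezone-aware)."""
    return datetime.strptime(text, TIME_FORMAT)


def count_fetches(lines, token, path_prefix, deployed_at):
    """Count fetches by `token` under `path_prefix`, split at `deployed_at`."""
    counts = {"before": 0, "after": 0}
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # Skip lines that do not look like combined-format entries.
        if token.lower() not in match["agent"].lower():
            continue  # Different crawler or an ordinary visitor.
        if not match["path"].startswith(path_prefix):
            continue  # Path comparison is case-sensitive, like robots.txt rules.
        bucket = "before" if parse_log_time(match["time"]) < deployed_at else "after"
        counts[bucket] += 1
    return counts
```

For a new Disallow rule, the "after" count for a compliant crawler should fall towards zero once the crawler has picked up the updated file; for a newly allowed path, it should start rising from zero.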
How it appears in analytics and logs
Testing tells you which rule a crawler applies to a given URL. If the tool's verdict differs from your expectation, your pattern, casing, or group precedence is likely off.
Diagnostic use case
Confirm a new or changed robots.txt rule blocks or allows exactly the URLs you intended, before relying on it.
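Before deploying, this kind of check can be approximated locally by listing the verdicts you expect and comparing them with a parser's output. The sketch below uses Python's standard `urllib.robotparser`; the function name `check_expectations` and the example hosts and paths are hypothetical. This parser is not Google's and may handle wildcards and rule precedence differently, so treat its verdicts as a rough pre-check and still confirm with Search Console and your logs.

```python
from urllib.robotparser import RobotFileParser


def check_expectations(robots_text, expectations):
    """Return (agent, url, expected, actual) tuples where the verdict differs.

    `expectations` is an iterable of (user_agent_token, url, expected_allowed).
    """
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    mismatches = []
    for agent, url, expected in expectations:
        actual = parser.can_fetch(agent, url)
        if actual != expected:
            mismatches.append((agent, url, expected, actual))
    return mismatches


ROBOTS_TEXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

EXPECTATIONS = [
    ("Googlebot", "https://example.com/private/report", False),
    # Googlebot has its own group, so the generic /tmp/ rule does not apply to it.
    ("Googlebot", "https://example.com/tmp/x", True),
    ("OtherBot", "https://example.com/tmp/x", False),
    # Paths are case-sensitive: /Tmp/ is not covered by /tmp/.
    ("OtherBot", "https://example.com/Tmp/x", True),
]

for mismatch in check_expectations(ROBOTS_TEXT, EXPECTATIONS):
    print("unexpected verdict:", mismatch)
```

Keeping the expectations in a list makes it easy to rerun the same check whenever the file changes, and to include more than one crawler token rather than testing Googlebot alone.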
What WebmasterID can help detect
WebmasterID shows which crawlers actually fetch which paths after a change, complementing pre-deploy testers with real observed behaviour.
Common mistakes
- Editing robots.txt but never confirming that the live URL serves the updated file with a 200.
- Trusting a tester's verdict without confirming the real crawl effect later.
- Testing only Googlebot and assuming other crawlers parse identically.
Privacy and accuracy notes
Testing tools read your public robots.txt and report rule matches. They involve no visitor data.
Related pages
- robots.txt basics: what it does and what it cannot do
robots.txt is a plain-text file at your site root that tells compliant crawlers which paths they may request. This page covers the directives, how user-agent groups are matched, and the limits that trip people up: robots.txt is advisory, it does not hide pages from search, and it is not a security boundary.
- robots.txt common mistakes
Most robots.txt problems come from a handful of recurring mistakes. This page collects the big ones — blocking the CSS and JS crawlers need to render, trying to deindex with Disallow, advertising secret paths, and treating an advisory file as enforcement — with the correct approach for each.
- robots.txt path matching and case sensitivity
robots.txt path rules are compared against the URL path, and that comparison is case-sensitive: /Page and /page are different. This page covers how Google matches paths, why case and encoding matter, and how trailing characters and wildcards change the rule that applies.
- Website observability
Confirm the real crawl effect of a robots.txt change.
Sources and verification notes
- Google: robots.txt report in Search Console. Documents how to review and test robots.txt for Googlebot.
- Google: How Google interprets robots.txt
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.