robots.txt path matching and case sensitivity
robots.txt path rules are compared against the URL path, and that comparison is case-sensitive: /Page and /page are different. This page covers how Google matches paths, why case and encoding matter, and how trailing characters and wildcards change the rule that applies.
Case sensitivity and the path
Google documents that robots.txt rules apply to the URL path and that path matching is case-sensitive. So Disallow: /Folder/ does not block /folder/; the casing must match. If your site serves the same content under multiple casings, you may need a rule for each casing, or better, canonical URLs that avoid the ambiguity.
Matching is against the path portion of the URL, so include the leading slash and the exact segments you mean.
- Path matching is case-sensitive: /Page ≠ /page
- Rules apply to the URL path, starting with /
- Multiple casings may need multiple rules
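The points above can be illustrated with a minimal Python sketch. It assumes plain prefix rules with no wildcards, and the function names (rule_matches, casing_variants) are hypothetical helpers written for this example, not part of any crawler or library.

```python
def rule_matches(rule_path: str, url_path: str) -> bool:
    """Case-sensitive prefix comparison; wildcards are not handled here."""
    return url_path.startswith(rule_path)


def casing_variants(paths):
    """Group paths that differ only by letter case.

    Returns {lowercased path: sorted list of the casings seen} for every
    path that appears under more than one casing.
    """
    groups = {}
    for path in paths:
        groups.setdefault(path.lower(), set()).add(path)
    return {key: sorted(seen) for key, seen in groups.items() if len(seen) > 1}


# Disallow: /Folder/ covers /Folder/... but not /folder/...
print(rule_matches("/Folder/", "/Folder/report"))
print(rule_matches("/Folder/", "/folder/report"))

# Paths served under several casings may each need their own rule.
print(casing_variants(["/Page", "/page", "/about"]))
```

Running casing_variants over the paths your site actually serves is one way to find URLs that might need a rule per casing.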
Prefix behaviour and trailing characters
Without a wildcard, a path rule matches as a prefix: Disallow: /news matches /news, /news/, and /newsletter. Add a trailing slash (/news/) to scope the rule to a folder; that rule no longer matches /newsletter or the bare /news. Where $ is supported, anchor the end of the rule (Disallow: /report.pdf$) to match an exact ending.
Percent-encoding also matters: the path is compared as it appears, so be consistent about encoded characters. When both an Allow and a Disallow rule match the same URL, the most specific rule (the one with the longest path) wins.
- No wildcard = prefix match (/news catches /newsletter)
- Use a trailing slash or $ to scope precisely
- Be consistent with percent-encoding in paths
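The prefix, $ and longest-rule behaviour described above can be sketched as follows. This is an illustrative approximation, not any crawler's actual implementation: the helper names are hypothetical, rules are given as (directive, pattern) pairs, and on a tie in length the sketch prefers Allow, an assumption modelled on Google's documented least-restrictive behaviour.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into an anchored-at-start regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is literal, so matching stays case-sensitive.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, which gives prefix semantics.
    return pattern_to_regex(pattern).match(path) is not None


def is_allowed(rules, path: str) -> bool:
    """Pick the longest matching rule; no matching rule means allowed."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern and matches(pattern, path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


rules = [("disallow", "/news"), ("allow", "/news/public/")]
print(matches("/news", "/newsletter"))         # bare prefix also catches this
print(matches("/news/", "/newsletter"))        # trailing slash scopes it
print(matches("/report.pdf$", "/report.pdf"))  # exact ending
print(is_allowed(rules, "/news/public/item"))  # longer Allow wins
```

The tuple comparison does the precedence work: a longer pattern always beats a shorter one, and only when lengths are equal does the Allow flag break the tie.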
How it appears in analytics and logs
If a rule does not match a URL you expected it to, a case or encoding mismatch is a common cause: robots.txt path matching is case-sensitive and operates on the path as written.
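When a rule and a logged URL path disagree, a small check like the one below can suggest which kind of mismatch is involved. It is a rough sketch that assumes wildcard-free prefix rules; diagnose is a hypothetical helper, and the labels it returns are made up for this example.

```python
from urllib.parse import unquote


def diagnose(rule_path: str, url_path: str) -> str:
    """Classify why a prefix rule does or does not match a URL path."""
    if url_path.startswith(rule_path):
        return "match"
    # Same characters once letter case is ignored.
    if url_path.lower().startswith(rule_path.lower()):
        return "case mismatch"
    # Same characters once percent-encoding is decoded on both sides.
    decoded_rule, decoded_path = unquote(rule_path), unquote(url_path)
    if decoded_path.startswith(decoded_rule):
        return "encoding mismatch"
    if decoded_path.lower().startswith(decoded_rule.lower()):
        return "case and encoding mismatch"
    return "no match"


print(diagnose("/Folder/", "/folder/x"))
print(diagnose("/café/", "/caf%C3%A9/menu"))
```

A result of "case mismatch" or "encoding mismatch" points to rewriting the rule so it uses the same casing and encoding as the URLs crawlers actually request.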
Diagnostic use case
Write path rules that match the exact URLs you intend, accounting for case sensitivity and prefix behaviour, so you neither over- nor under-block.
What WebmasterID can help detect
WebmasterID shows which paths crawlers fetch, so you can confirm a case-sensitive rule matches the exact URLs you meant and no others.
Common mistakes
- Writing /Folder/ and expecting it to block lowercase /folder/.
- Using a bare /news prefix that also blocks /newsletter.
- Ignoring percent-encoding differences in the path.
Privacy and accuracy notes
robots.txt is a public file, so its path rules are public configuration. It involves no visitor data; do not list sensitive paths in it expecting them to stay hidden.
Related pages
- Wildcards and path matching in robots.txt
Although the original protocol used simple prefix matching, major crawlers support two wildcards in path rules: * matches any sequence of characters, and $ anchors the end of the URL. This page covers how they behave, useful patterns, and the mistakes that make a rule too broad.
- robots.txt size limits and parsing
robots.txt files are not unlimited. Google documents a maximum parsed size of 500 KiB and ignores anything beyond it, which can silently drop rules at the bottom of a bloated file. This page covers the size limit and how parsing precedence (most specific rule wins) interacts with it.
- robots.txt basics: what it does and what it cannot do
robots.txt is a plain-text file at your site root that tells compliant crawlers which paths they may request. This page covers the directives, how user-agent groups are matched, and the limits that trip people up: robots.txt is advisory, it does not hide pages from search, and it is not a security boundary.
- Website observability
Confirm a path rule matches exactly the URLs you intended.
Sources and verification notes
- Google, How Google interprets robots.txt: documents case-sensitive path matching and most-specific-rule precedence.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.