User-agent groups and matching in robots.txt

robots.txt rules are organised into user-agent groups. A crawler does not combine every group — it selects the single most specific group whose token matches its name, falling back to the * group only when no named group matches. Understanding this prevents rules that never apply.

Verified against primary sources

How matching works

Each group begins with one or more User-agent lines, followed by Allow and Disallow rules. When a crawler reads robots.txt, it looks for the group whose user-agent token most specifically matches its own name; the comparison is case-insensitive. Per RFC 9309 and Google's documentation, the crawler then applies only the rules for that best-matching token and does not merge in rules from groups written for other tokens. The one exception is that groups declaring the same token are combined and treated as a single group.

The * token is the default group: it applies only to crawlers that have no more specific group of their own. So if you have both a Googlebot group and a * group, Googlebot follows the Googlebot group and ignores the * group entirely.
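The selection logic can be sketched in a few lines of Python. This is a minimal illustration, not any crawler's actual parser: the names parse_groups, rules_for and ROBOTS are hypothetical, and it assumes exact, case-insensitive token matching, which is simpler than the "most specific match" logic a real crawler may use.

```python
def parse_groups(robots_txt):
    """Split robots.txt text into a list of (tokens, rules) groups."""
    groups = []
    tokens, rules = [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:                            # a rule was seen, so a new group starts
                groups.append((tokens, rules))
                tokens, rules = [], []
            tokens.append(value.lower())
        elif field in ("allow", "disallow") and tokens:
            rules.append((field, value))
    if tokens:
        groups.append((tokens, rules))
    return groups


def rules_for(robots_txt, product_token):
    """Return the rules governing one crawler (assumption: exact token match)."""
    name = product_token.lower()
    groups = parse_groups(robots_txt)
    matched = [rules for tokens, rules in groups if name in tokens]
    if not matched:                              # fall back to * only when nothing named matches
        matched = [rules for tokens, rules in groups if "*" in tokens]
    # Groups that declare the same token are combined into one rule list.
    return [rule for rules in matched for rule in rules]


ROBOTS = """
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

print(rules_for(ROBOTS, "Googlebot"))  # [('disallow', '/drafts/')]
print(rules_for(ROBOTS, "Bingbot"))    # [('disallow', '/private/')]
```

In this sketch Googlebot receives only the /drafts/ rule, while a crawler with no group of its own falls back to the * rules.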

A common pitfall

Because a named group fully replaces the * group for that crawler, rules you put only in * will not apply to a crawler that has its own group. For example, if your * group has Disallow: /private/ but you also add a Googlebot group that omits that rule, Googlebot is no longer subject to that Disallow. Repeat the rules you need inside each named group rather than assuming they inherit.
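One way to catch this pitfall is a small check that lists the * rules each named group fails to repeat. The sketch below is an assumption-laden illustration: missing_from_named_groups is a hypothetical helper, it compares rules as literal text only, and a rule it reports may be a deliberate omission rather than a mistake.

```python
def missing_from_named_groups(robots_txt):
    """Map each named token to the * rules its own group does not repeat."""
    groups = {}                      # token -> set of (field, path) rules
    tokens, seen_rule = [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:                        # previous group is complete
                tokens, seen_rule = [], False
            tokens.append(value.lower())
            groups.setdefault(value.lower(), set())
        elif field in ("allow", "disallow") and tokens:
            seen_rule = True
            for token in tokens:
                groups[token].add((field, value))
    default = groups.get("*", set())
    return {
        token: sorted(default - rules)
        for token, rules in groups.items()
        if token != "*" and default - rules
    }


example = """
User-agent: *
Disallow: /private/
Disallow: /tmp/

User-agent: Googlebot
Disallow: /tmp/
"""

print(missing_from_named_groups(example))
# {'googlebot': [('disallow', '/private/')]}
```

Here the Googlebot group repeats /tmp/ but not /private/, so the check flags /private/ as a rule Googlebot will not follow.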

How it appears in analytics and logs

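If a crawler ignores a rule you expected it to follow, the usual cause is that a different group matched than the one you had in mind. Either a named group for that crawler took precedence over your * group, or the named group's token did not match the crawler (for example, because it was misspelled) and the crawler fell back to the * group.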
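When reading logs, it can help to estimate which group a logged crawler would have fallen under. The sketch below is a rough heuristic and not how any crawler is documented to work: governing_token is a hypothetical helper, and it assumes a group token applies when it appears, case-insensitively, inside the logged user-agent string, preferring the longest such token.

```python
def governing_token(group_tokens, logged_user_agent):
    """Guess which robots.txt token governs a crawler seen in the logs.

    Assumption: a token applies if it occurs inside the logged
    user-agent string; the longest matching token is treated as
    the most specific one.
    """
    ua = logged_user_agent.lower()
    named = [t for t in group_tokens if t != "*" and t.lower() in ua]
    if named:
        return max(named, key=len)
    # No named token matched: the * group applies if the file has one.
    return "*" if "*" in group_tokens else None


ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(governing_token(["*", "Googlebot", "Googlebot-News"], ua))  # Googlebot
print(governing_token(["*", "Googlbot"], ua))                     # *
```

The second call shows a misspelled token: because it never matches, the crawler is governed by the * group instead of the group written for it.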

Diagnostic use case

Structure robots.txt so each crawler gets the rules you intend, and avoid the trap of a named group silently replacing the * rules you expected that crawler to follow.

What WebmasterID can help detect

WebmasterID shows which crawlers reach which paths, helping you confirm that the group you intended for a crawler is the one actually governing its behaviour.

Common mistakes

Privacy and accuracy notes

User-agent grouping is a public configuration choice. It involves no visitor data.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.