Multiple user-agent groups and precedence
A robots.txt file usually has several user-agent groups. A crawler does not combine them all: it selects the single most specific group whose token matches its name. This page explains how that precedence works, how multiple User-agent lines share one group, and the merging rules that surprise people.
One group wins, not a merge
RFC 9309 specifies that a crawler finds the group whose user-agent token matches its product token, case-insensitively, and obeys that group's rules. Where more than one token could match, Google's documentation describes choosing the most specific one. A crawler does not merge rules from groups with different tokens. The * group is the fallback, used only when no named group matches the crawler.
So if you have a Googlebot group and a * group, Googlebot follows the Googlebot group exclusively and ignores the * group — even for rules that appear only in *.
- Most specific matching token wins
- A crawler follows one effective group, never a merge of groups for different tokens
- * applies only when no named group matches
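The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not a full RFC 9309 parser: the function names `parse_groups` and `select_rules` are hypothetical, and "most specific" is assumed here to mean the longest group token that is a prefix of the crawler's name, which is a simplification of what real crawlers do.

```python
def parse_groups(text):
    """Parse robots.txt text into a list of (tokens, rules) groups."""
    groups = []
    tokens, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a User-agent line after rules starts a new group
                groups.append((tokens, rules))
                tokens, rules = [], []
            tokens.append(value.lower())
        elif field in ("allow", "disallow") and tokens:
            rules.append((field, value))
    if tokens:
        groups.append((tokens, rules))
    return groups


def select_rules(groups, crawler_name):
    """Return (chosen_token, rules) for one crawler; fall back to '*'."""
    name = crawler_name.lower()
    # Assumption: a token matches when the crawler name starts with it.
    candidates = {
        token
        for tokens, _ in groups
        for token in tokens
        if token and token != "*" and name.startswith(token)
    }
    chosen = max(candidates, key=len) if candidates else "*"
    # Only groups listing the chosen token contribute rules; '*' is not mixed in.
    rules = [rule for tokens, group_rules in groups
             if chosen in tokens for rule in group_rules]
    return chosen, rules


ROBOTS = """
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

groups = parse_groups(ROBOTS)
print(select_rules(groups, "Googlebot"))  # ('googlebot', [('disallow', '/drafts/')])
print(select_rules(groups, "OtherBot"))   # ('*', [('disallow', '/private/')])
```

Note that the Googlebot result contains no /private/ rule: once the named group is chosen, the * group contributes nothing.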
Multiple User-agent lines in one group
Within a single group you can list several User-agent lines before the Allow/Disallow rules; the rules then apply to all of those tokens:
User-agent: bingbot
User-agent: AhrefsBot
Disallow: /private/
This differs from having separate groups: here both tokens share the same rule block. When groups for the same token appear more than once, compliant parsers combine their rules into one effective group (RFC 9309 requires this), but it is clearer to keep each token's rules in a single place.
- Several User-agent lines can head one shared rule block
- The block's rules apply to every listed token
- Keep a token's rules in one group for clarity
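To make the shared-block and repeated-token behaviour concrete, here is a small Python sketch that maps each token to its effective rules. The helper name `rules_by_token` and the extra /tmp/ group are hypothetical, added only for illustration, and the parsing is deliberately simplified.

```python
from collections import defaultdict


def rules_by_token(text):
    """Map each user-agent token to the rules of every group that lists it."""
    table = defaultdict(list)
    tokens, in_rules = [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # rules already seen, so this line opens a new group
                tokens, in_rules = [], False
            tokens.append(value.lower())
        elif field in ("allow", "disallow") and tokens:
            in_rules = True
            for token in tokens:  # every token heading the block gets the rule
                table[token].append((field, value))
    return dict(table)


ROBOTS = """
User-agent: bingbot
User-agent: AhrefsBot
Disallow: /private/

User-agent: bingbot
Disallow: /tmp/
"""

effective = rules_by_token(ROBOTS)
print(effective["bingbot"])    # [('disallow', '/private/'), ('disallow', '/tmp/')]
print(effective["ahrefsbot"])  # [('disallow', '/private/')]
```

Both tokens receive the shared /private/ rule, while the second bingbot group is folded into bingbot's effective rules only.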
How it appears in analytics and logs
If a crawler obeys rules you did not expect, check group precedence: it applies only its most specific matching group, ignoring rules placed in other groups including *.
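One quick way to check precedence locally is Python's standard-library `urllib.robotparser`. The sketch below is only an approximation: that parser uses its own matching logic, which may differ from a particular crawler's, and the example.com host, the paths, and the `allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())  # parse from memory, no network request


def allowed(agent, path):
    """Ask the parser whether `agent` may fetch `path` on a placeholder host."""
    return parser.can_fetch(agent, "https://example.com" + path)


results = {
    (agent, path): allowed(agent, path)
    for agent in ("Googlebot", "OtherBot")
    for path in ("/private/report", "/drafts/a")
}

for (agent, path), ok in results.items():
    print(f"{agent:10} {path:16} {'allowed' if ok else 'blocked'}")
```

With this file, Googlebot is allowed on /private/report because only its own group applies, while OtherBot falls back to the * group and is blocked there.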
Diagnostic use case
Structure several user-agent groups so each crawler gets exactly the rules you intend, and understand which group wins when more than one could match.
What WebmasterID can help detect
WebmasterID shows which crawlers reach which paths, so you can confirm the group you intended for a crawler is the one actually governing it.
Common mistakes
- Expecting a named crawler to also obey rules placed only in the * group.
- Splitting one token's rules across scattered groups and losing track.
- Assuming rules from several groups with different matching tokens are merged for a crawler.
Privacy and accuracy notes
Group structure is public configuration. It involves no visitor data.
Related pages
- User-agent groups and matching in robots.txt
robots.txt rules are organised into user-agent groups. A crawler does not combine every group — it selects the single most specific group whose token matches its name, falling back to the * group only when no named group matches. Understanding this prevents rules that never apply.
- Allow only specific bots, block the rest
Sometimes you want only a few named crawlers to access your site and everyone else kept out. Because each crawler obeys only its single most specific matching group, you build this by giving the allowed crawlers their own permissive groups and putting a blanket Disallow in the * group — with important caveats.
- robots.txt basics: what it does and what it cannot do
robots.txt is a plain-text file at your site root that tells compliant crawlers which paths they may request. This page covers the directives, how user-agent groups are matched, and the limits that trip people up: robots.txt is advisory, it does not hide pages from search, and it is not a security boundary.
- Bot intelligence
Confirm which group actually governs each crawler.
Sources and verification notes
- RFC 9309: Robots Exclusion Protocol
Defines how a crawler matches a group to its product token, and group structure.
- Google: How Google interprets robots.txt
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.