The Sitemap directive in robots.txt
The Sitemap directive points crawlers at your XML sitemap. It uses an absolute URL, can appear multiple times to list several sitemaps, and works independently of your allow/disallow rules — it is a discovery hint, not a crawl-permission rule.
How the Sitemap line works
A Sitemap directive gives the full, absolute location of a sitemap:
Sitemap: https://example.com/sitemap.xml
Unlike Allow and Disallow, a Sitemap line is not tied to a user-agent group — it can appear anywhere in the file and applies globally. Google documents that the URL must be absolute (including the scheme), not a relative path.
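To make the absolute-URL requirement concrete, here is a minimal Python sketch that pulls every Sitemap line out of a robots.txt body and flags values that are not absolute. The helper names (`extract_sitemaps`, `is_absolute`) and the sample file are illustrative assumptions, not part of any crawler's implementation, and the comment handling is deliberately simple.

```python
from urllib.parse import urlsplit


def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return the value of every Sitemap line, in file order.

    Sitemap lines are collected wherever they appear, because they are
    not tied to a user-agent group.
    """
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Simplification: treat everything after '#' as a comment.
        line = raw_line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


def is_absolute(url: str) -> bool:
    """True when the URL has an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical robots.txt body used only for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
Sitemap: /sitemap-relative.xml
"""

found = extract_sitemaps(ROBOTS_TXT)
report = {url: is_absolute(url) for url in found}
for url, ok in report.items():
    print(("ok       " if ok else "RELATIVE ") + url)
```

In this sample the second line would be flagged, since it has no scheme or host.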
Multiple sitemaps and independence
You can list several Sitemap lines, for example one per content type, or use a single line that points at a sitemap index:
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-pages.xml
The directive is independent of your allow/disallow rules: listing a URL in a sitemap does not override a Disallow, and disallowing a path does not remove it from a sitemap automatically. Keep the two consistent so you do not advertise URLs you also block.
- URL must be absolute, including https://
- Not tied to a user-agent group — applies globally
- Multiple Sitemap lines are allowed
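One way to keep the two consistent is to test each sitemap URL against your own rules. The sketch below uses Python's standard `urllib.robotparser` for this; the function name `blocked_sitemap_urls`, the `ExampleBot` user agent, and the sample data are assumptions for illustration. The standard-library matcher may not resolve overlapping Allow and Disallow rules the same way a particular crawler does, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="ExampleBot"):
    """Return the sitemap URLs that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical inputs used only for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""

SITEMAP_URLS = [
    "https://example.com/blog/post-1",
    "https://example.com/private/report.html",
]

conflicts = blocked_sitemap_urls(ROBOTS_TXT, SITEMAP_URLS)
for url in conflicts:
    print("Listed in sitemap but disallowed:", url)
```

Any URL this reports is one you are advertising and blocking at the same time, so either remove it from the sitemap or relax the rule.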
How it appears in analytics and logs
A Sitemap line tells crawlers where to find your URL list. It does not grant or deny access to any path; allow/disallow rules still govern what may be crawled.
Diagnostic use case
Advertise one or more sitemaps to crawlers from robots.txt so they can discover your URLs more reliably.
What WebmasterID can help detect
WebmasterID shows which URLs crawlers actually fetch, so you can see whether adding a sitemap improved discovery of pages you care about.
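As a rough illustration of that comparison, the sketch below reads the `<loc>` entries from a sitemap and lists the ones that do not appear in a set of fetched URLs. It assumes you have already exported the fetched URLs as a plain list; it does not use any WebmasterID API, and the function names and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locs(sitemap_xml: str) -> set[str]:
    """Return the set of <loc> values in a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def never_fetched(sitemap_xml: str, fetched_urls) -> list[str]:
    """Return sitemap URLs that are absent from the fetched-URL list."""
    return sorted(sitemap_locs(sitemap_xml) - set(fetched_urls))


# Hypothetical sitemap and fetch data used only for this example.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
  <url><loc>https://example.com/blog/post-2</loc></url>
</urlset>
"""

FETCHED = ["https://example.com/", "https://example.com/blog/post-1"]

missing = never_fetched(SITEMAP_XML, FETCHED)
for url in missing:
    print("In sitemap, not yet fetched:", url)
```

Running the same comparison before and after adding the Sitemap line gives a simple view of whether discovery changed.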
Common mistakes
- Using a relative path instead of an absolute URL.
- Listing a URL in the sitemap while also disallowing it in robots.txt.
- Assuming a Sitemap line grants crawl permission — it does not.
Privacy and accuracy notes
Your robots.txt and the sitemaps it lists are public. Do not advertise sitemaps that expose paths you intend to keep private.
Related pages
- robots.txt basics: what it does and what it cannot do
robots.txt is a plain-text file at your site root that tells compliant crawlers which paths they may request. This page covers the directives, how user-agent groups are matched, and the limits that trip people up: robots.txt is advisory, it does not hide pages from search, and it is not a security boundary.
- User-agent groups and matching in robots.txt
robots.txt rules are organised into user-agent groups. A crawler does not combine every group — it selects the single most specific group whose token matches its name, falling back to the * group only when no named group matches. Understanding this prevents rules that never apply.
- Website observability
See which sitemap URLs crawlers actually fetch.
Sources and verification notes
- Google — How Google interprets robots.txt
Documents the Sitemap directive and absolute-URL requirement.
- sitemaps.org — Submitting via robots.txt
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.