Sitemap index files
A sitemap index file is an XML file that lists other sitemaps, letting large sites stay within the 50,000-URL and 50MB per-file limits while exposing all URLs through one submitted entry point. This page explains the sitemapindex format, the per-file limits (which also apply to the index itself), and best practices for organizing and submitting multiple sitemaps.
What this means
A sitemap index uses the sitemapindex root element containing sitemap entries, each with a loc pointing to a child sitemap (and an optional lastmod). You submit the single index, and search engines discover all the child sitemaps from it.
This is the standard way to handle sites larger than one sitemap can hold, or to organize sitemaps logically (for example by section, content type, or update cadence).
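As an illustration of the format described above, here is a minimal Python sketch that builds a sitemap index with the standard library. The function name build_sitemap_index and the example.com URLs are assumptions made for this example, not part of any official tooling.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(children):
    """Return sitemap index XML as UTF-8 bytes.

    children: iterable of (loc, lastmod) pairs; lastmod may be None
    because the element is optional.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in children:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical child sitemaps on a placeholder host.
    index_xml = build_sitemap_index([
        ("https://www.example.com/sitemap-products.xml", "2024-01-15"),
        ("https://www.example.com/sitemap-blog.xml", None),
    ])
    print(index_xml.decode("utf-8"))
```

The output is a sitemapindex root with one sitemap entry per child, each holding a loc and, where supplied, a lastmod.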
When and how to split
Each sitemap file is limited to 50,000 URLs and 50MB uncompressed. When you exceed either limit, split URLs across multiple sitemaps and list them in an index. The index file itself is also limited to 50,000 child-sitemap entries and 50MB.
Split logically so each child sitemap is meaningful (by section, language, or content type), which also makes Search Console coverage easier to interpret per group. Use absolute loc URLs, and keep every child sitemap on the same host as the index.
- Root element: sitemapindex containing sitemap entries
- Each entry: loc (child sitemap URL) + optional lastmod
- Per-file limits (50,000 entries / 50MB) apply to the index too
- Split logically by section, language, or type
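The splitting approach above can be sketched roughly as follows. This is a simplified example: grouping by first path segment and the sitemap-{section}-{part}.xml naming scheme are assumptions chosen for illustration. It enforces only the 50,000-URL limit; checking the 50MB size limit would require measuring each serialized file.

```python
from collections import defaultdict
from urllib.parse import urlsplit

MAX_URLS_PER_SITEMAP = 50_000


def section_of(url):
    """Return the first path segment, or "root" for top-level URLs."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if len(segments) > 1 else "root"


def plan_child_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Map hypothetical child sitemap filenames to their URL lists.

    URLs are grouped by section first, then each group is cut into
    chunks of at most max_urls entries.
    """
    groups = defaultdict(list)
    for url in urls:
        groups[section_of(url)].append(url)

    plan = {}
    for section, members in sorted(groups.items()):
        for start in range(0, len(members), max_urls):
            part = start // max_urls + 1
            plan[f"sitemap-{section}-{part}.xml"] = members[start:start + max_urls]
    return plan
```

Grouping before chunking keeps each child sitemap tied to one section, so a coverage problem reported for one file points at one part of the site.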
Submission and best practices
Submit the sitemap index in Search Console; you generally do not need to submit each child sitemap separately. You can also reference it from robots.txt with a Sitemap line.
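To confirm that a robots.txt file actually advertises the index, one option is Python's standard urllib.robotparser module (its site_maps() method requires Python 3.8 or later). The sketch below parses robots.txt text that you supply; the helper name sitemaps_in_robots and the example.com URL are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the URLs declared on Sitemap lines in robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Hypothetical robots.txt content referencing a single index file.
    robots_txt = (
        "User-agent: *\n"
        "Allow: /\n"
        "\n"
        "Sitemap: https://www.example.com/sitemap_index.xml\n"
    )
    print(sitemaps_in_robots(robots_txt))
```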
Keep child sitemaps clean — only canonical, indexable, 200-status URLs — and keep the index's lastmod values meaningful so search engines can prioritize recently changed groups. Monitor each child sitemap's coverage to catch sections with crawl or indexing problems.
How it appears in analytics and logs
A sitemap index points search engines to multiple child sitemaps. It is a discovery convenience for large sites, not an indexing guarantee. The index file itself is bound by the same 50,000-entry and 50MB limits, and each child sitemap must follow the standard sitemap rules.
Diagnostic use case
Organize a large site's URLs across multiple sitemaps under one index, stay within format limits, and submit a single index in Search Console.
What WebmasterID can help detect
WebmasterID shows which URLs crawlers actually fetch, so you can compare real coverage against the full set advertised across all child sitemaps, and spot child sitemaps whose URLs crawlers never reach.
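The comparison described above amounts to a set difference per child sitemap. The following sketch assumes you already have two inputs: a mapping of child sitemap names to the URLs they list, and a collection of URLs that crawlers fetched. Both data shapes and the function name coverage_by_child are hypothetical; this is not a WebmasterID API.

```python
def coverage_by_child(sitemap_urls, crawled_urls):
    """Summarize crawl coverage for each child sitemap.

    sitemap_urls: dict mapping child sitemap name -> iterable of listed URLs
    crawled_urls: iterable of URLs that crawlers actually fetched
    """
    crawled = set(crawled_urls)
    report = {}
    for child, urls in sitemap_urls.items():
        listed = set(urls)
        report[child] = {
            "listed": len(listed),
            "fetched": len(listed & crawled),
            # URLs advertised in the sitemap but never requested.
            "never_fetched": sorted(listed - crawled),
        }
    return report
```

A child sitemap whose "fetched" count is zero is a candidate for the "crawlers never reach" case and worth inspecting first.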
Common mistakes
- Exceeding 50,000 entries or 50MB in the index file itself, not just the children.
- Submitting every child sitemap individually instead of the single index.
- Letting child sitemaps include redirects, 404s, or non-canonical URLs.
- Using relative or cross-host loc URLs in the index.
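Some of these mistakes can be caught with an offline check of the index file. The sketch below flags an oversized index, too many entries, and relative or cross-host loc values. The function name check_sitemap_index and the message wording are assumptions for this example. It does not fetch child sitemaps, so redirects, 404s, and non-canonical URLs inside them are out of scope.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_ENTRIES = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB uncompressed


def check_sitemap_index(xml_bytes, index_url):
    """Return a list of problems found in a sitemap index file.

    xml_bytes: the uncompressed index file content
    index_url: the absolute URL the index is served from
    """
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("index exceeds 50MB uncompressed")

    root = ET.fromstring(xml_bytes)
    locs = [
        (el.text or "").strip()
        for el in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
    ]
    if len(locs) > MAX_ENTRIES:
        problems.append(f"index lists {len(locs)} sitemaps (limit {MAX_ENTRIES})")

    index_host = urlsplit(index_url).netloc
    for loc in locs:
        parts = urlsplit(loc)
        if not parts.scheme or not parts.netloc:
            problems.append(f"relative loc: {loc}")
        elif parts.netloc != index_host:
            problems.append(f"cross-host loc: {loc}")
    return problems
```

An empty list means none of the checked problems were found; it does not mean the child sitemaps themselves are clean.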
Privacy and accuracy notes
Sitemap index files reference other sitemaps and public URLs only. They contain no visitor data. WebmasterID treats the resulting crawl activity as bot events.
Frequently asked questions
- How many sitemaps can an index list?
- Up to 50,000 child sitemaps, and the index file must not exceed 50MB uncompressed — the same limits that apply to a regular sitemap file.
- Do I submit the index or each child sitemap?
- Submit the single sitemap index in Search Console. Search engines discover the child sitemaps from it, so you generally do not need to submit each one separately.
Related pages
- XML sitemap best practices
An XML sitemap lists URLs you want crawled, helping search engines discover pages they might miss through links alone. The format has firm limits — 50,000 URLs and 50MB uncompressed per file — and works best when it contains only canonical, indexable, 200-status URLs with accurate lastmod values. This page covers the documented rules and the common quality problems that make a sitemap less useful.
- Sitemap lastmod accuracy
The lastmod element in a sitemap reports when a URL's content last changed. Google uses lastmod to prioritize recrawling only when the value is consistently accurate; if every URL shows the generation date or the homepage date, Google learns to distrust and ignore it. This page explains correct lastmod semantics, format, and the consequences of inaccuracy.
- Diagnosing XML sitemap errors
An XML sitemap helps search engines discover and prioritize your URLs, but a sitemap full of the wrong URLs sends mixed signals. Common errors include listing redirecting or non-200 URLs, including noindex or canonicalized-away pages, exceeding the 50,000-URL or 50MB limits, or referencing the wrong protocol/host. A clean sitemap lists only canonical, indexable, 200-returning URLs.
- Website observability
Compare crawled URLs against the full set across all child sitemaps.
Sources and verification notes
- Google Search Central — Manage your sitemaps with sitemap index files. Sitemap index format and the per-file limits.
- sitemaps.org — Sitemap index format
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.