A sitemap.xml is a structured file that lists the URLs on a site you want search engines to know about, along with metadata such as <lastmod> and, optionally, <changefreq> and <priority>. The format is defined by the sitemaps.org protocol and supported by Google, Bing, and other major crawlers.
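To make the format concrete, here is a minimal sketch that builds a sitemap with Python's standard library. The function name `build_sitemap` and the example URLs are illustrative assumptions, not part of the protocol. Only `<loc>` and the optional `<lastmod>` are emitted.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    entries: iterable of (loc, lastmod) tuples, where lastmod is a
    W3C date string such as "2024-01-15", or None to omit the tag.
    (Hypothetical helper for illustration.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    # Prepend the XML declaration explicitly so the output is the same
    # regardless of Python version or locale.
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(urlset, encoding="utf-8")


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about", None),
    ])
    print(xml_bytes.decode("utf-8"))
```

ElementTree escapes special characters such as `&` in URLs, which the protocol requires.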
Why it matters
Sitemaps speed up the discovery of new and updated pages, especially on large sites or those with deep navigation. They help direct crawl budget toward URLs that actually matter, reducing the risk that important content stays unindexed for weeks. For sites with many parameterized URLs, a sitemap is the cleanest way to point crawlers at the canonical versions you want ranked.
How to check
- List only canonical, indexable URLs that return 200. Exclude anything blocked by robots.txt, marked noindex, or behind a redirect.
- Keep each sitemap under 50,000 URLs and 50 MB uncompressed; split larger sites and reference shards from a sitemap index file.
- Update <lastmod> accurately so crawlers can prioritize recently changed pages.
- Reference the sitemap from robots.txt with Sitemap: https://example.com/sitemap.xml.
- Submit the sitemap in Google Search Console and Bing Webmaster Tools and monitor coverage reports.
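Some of these checks can be automated offline. The sketch below is one possible approach, not a complete audit: it checks the size and URL-count limits, flags URLs that have no `<lastmod>`, and extracts `Sitemap:` lines from a robots.txt body. The function names `check_sitemap` and `sitemap_urls_from_robots` are hypothetical, and the 50 MB limit is assumed to mean 50 × 1024 × 1024 bytes. Verifying status codes, noindex, and redirects needs live HTTP requests and is left out.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # assumed interpretation of "50 MB uncompressed"


def check_sitemap(xml_bytes):
    """Return a list of human-readable problems found in one sitemap file."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append(f"file is {len(xml_bytes)} bytes, over the 50 MB limit")

    root = ET.fromstring(xml_bytes)
    urls = root.findall("sm:url", NS)
    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs, over the 50,000 limit")

    for url in urls:
        loc = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        lastmod = (url.findtext("sm:lastmod", default="", namespaces=NS) or "").strip()
        if not lastmod:
            problems.append(f"missing <lastmod> for {loc}")
    return problems


def sitemap_urls_from_robots(robots_txt):
    """Extract the values of all 'Sitemap:' lines from robots.txt content."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://example.com/</loc><lastmod>2024-01-15</lastmod></url>"
        b"<url><loc>https://example.com/about</loc></url>"
        b"</urlset>"
    )
    for problem in check_sitemap(sample):
        print(problem)
    print(sitemap_urls_from_robots("User-agent: *\nSitemap: https://example.com/sitemap.xml\n"))
```

A missing `<lastmod>` is reported as a problem here only because this guide recommends maintaining it; the protocol itself treats the tag as optional.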