
sitemap.xml

An XML file that lists important URLs on a site so search engines can discover and prioritize them.

Sitecheck Team

A sitemap.xml is a structured file that lists the URLs on a site you want search engines to know about, along with metadata such as <lastmod> and, optionally, <changefreq> and <priority>. The format is defined by the sitemaps.org protocol and supported by Google, Bing, and other major crawlers. Google has stated that it ignores <changefreq> and <priority>, so <lastmod> is the metadata field most worth maintaining.
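To make the file structure concrete, here is a minimal sketch that generates a sitemap with Python's standard library. The function name build_sitemap and the example.com URLs are illustrative assumptions, not part of the protocol; only the namespace and the <urlset>, <url>, <loc>, and <lastmod> element names come from sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes for an iterable of (loc, lastmod) pairs.

    build_sitemap is a hypothetical helper for illustration.
    lastmod is a W3C date string such as "2024-01-15", or None to omit it.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about", None),
    ])
    print(xml_bytes.decode("utf-8"))
```

The optional <changefreq> and <priority> elements are left out here because <loc> is the only required child of <url>.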

Why it matters

Sitemaps speed up the discovery of new and updated pages, especially on large sites or those with deep navigation. They help direct crawl budget toward URLs that actually matter, reducing the risk that important content stays unindexed for weeks. For sites with many parameterized URLs, a sitemap is a straightforward way to tell crawlers which canonical versions you want indexed, alongside rel="canonical" tags on the pages themselves.

How to check

  • List only canonical, indexable URLs that return 200. Exclude anything blocked by robots.txt, marked noindex, or behind a redirect.
  • Keep each sitemap under 50,000 URLs and 50 MB uncompressed; split larger sites and reference shards from a sitemap index file.
  • Set <lastmod> to the date the page content last changed, not the date the sitemap was generated, so crawlers can prioritize recently changed pages.
  • Reference the sitemap from robots.txt with a line such as Sitemap: https://example.com/sitemap.xml.
  • Submit the sitemap in Google Search Console and Bing Webmaster Tools and monitor coverage reports.
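The first two checklist items can be sketched in code. The example below is a rough illustration, assuming you already have crawl data for each page: the Page record, its field names, and the helper functions are all hypothetical, and the shard URLs are placeholders. It filters to eligible URLs, splits them into shards of at most 50,000, and builds a sitemap index that references the shards.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the protocol


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions."""
    url: str
    status: int
    canonical: str
    noindex: bool = False
    blocked_by_robots: bool = False


def eligible(page):
    """True if the page returns 200, is its own canonical, and is indexable."""
    return (
        page.status == 200
        and page.canonical == page.url
        and not page.noindex
        and not page.blocked_by_robots
    )


def shard(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(shard_locs):
    """Return a sitemap index (bytes) pointing at each shard URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in shard_locs:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    pages = [
        Page("https://example.com/a", 200, "https://example.com/a"),
        Page("https://example.com/old", 301, "https://example.com/a"),
        Page("https://example.com/d?x=1", 200, "https://example.com/d"),
    ]
    urls = [p.url for p in pages if eligible(p)]
    shards = shard(urls)
    # Placeholder shard locations; use whatever naming your site prefers.
    locs = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(shards) + 1)]
    print(build_index(locs).decode("utf-8"))
```

This sketch enforces only the URL count. The 50 MB limit depends on the serialized size of each shard, so a real generator would also check the byte length of each file before publishing it.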
