noindex is a robots directive that tells search engines not to include a page in their index. You can apply it with a <meta name="robots" content="noindex"> tag in the page <head>, or as an X-Robots-Tag: noindex HTTP response header — useful for non-HTML resources like PDFs.
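Because the directive can arrive in two places, a check has to read both the markup and the response headers. The following is a minimal sketch using only the Python standard library. The names `RobotsMetaParser` and `has_noindex` are hypothetical helpers introduced for illustration, and the sketch makes simplifying assumptions: it only looks at the generic `robots` meta name, and it ignores crawler-specific forms such as a user-agent prefix in the header value.

```python
from html.parser import HTMLParser


def _split(value):
    """Split a comma-separated directive list into lowercase tokens."""
    return [d.strip().lower() for d in value.split(",") if d.strip()]


class RobotsMetaParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            self.directives += _split(attr.get("content", ""))


def has_noindex(html="", headers=None):
    """Return True if noindex appears in the meta tag or the X-Robots-Tag header."""
    parser = RobotsMetaParser()
    parser.feed(html)

    # Header names are case-insensitive, so compare in lowercase.
    header_value = ""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_value = value

    return "noindex" in parser.directives or "noindex" in _split(header_value)
```

For an HTML page you would pass the body, and for a PDF or other non-HTML resource you would pass only the response headers, for example `has_noindex(headers={"X-Robots-Tag": "noindex"})`.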
Why it matters
Applying noindex to thin, duplicate, or internal pages keeps your search presence focused on content that actually drives traffic. On large sites it also keeps low-value URLs from diluting perceived site quality, and it can ease crawl budget pressure over time, since crawlers tend to revisit noindexed URLs less often. Accidentally adding noindex to important pages, on the other hand, removes them from results entirely, which is a common cause of overnight traffic drops after a deploy or template change.
How to check
- Audit your site for unintentional noindex after every release, especially on staging-to-production promotions.
- Do not block the URL in robots.txt at the same time; if the crawler cannot fetch the page, it cannot read the directive.
- Use noindex, follow when you want the page out of the index but still want crawlers to traverse its links.
- Verify with the URL Inspection tool in Google Search Console; it reports the indexing decision and the directive seen.
- For canonical consolidation prefer a canonical tag instead of noindex, since canonicals pass signals to the chosen URL.
- Keep noindex pages out of your XML sitemap; submitting them sends mixed signals.
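Two of the checks above, the robots.txt conflict and the sitemap overlap, are easy to automate once you have a list of URLs known to carry noindex. The sketch below is one way to do that with the Python standard library, assuming you have already fetched the robots.txt and sitemap contents as strings. The function names `sitemap_urls` and `audit_noindex` are hypothetical, and the matching is deliberately simple (exact URL comparison, a single sitemap file, no sitemap index handling).

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the sitemaps.org protocol for <urlset>, <url>, and <loc>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def audit_noindex(noindex_urls, robots_txt, sitemap_xml, user_agent="*"):
    """Flag noindex URLs that conflict with robots.txt or the sitemap.

    Returns a list of (url, problem) tuples, ordered by URL.
    """
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    listed = sitemap_urls(sitemap_xml)

    findings = []
    for url in sorted(noindex_urls):
        # A blocked URL is never fetched, so its noindex is never seen.
        if not robots.can_fetch(user_agent, url):
            findings.append((url, "blocked by robots.txt, directive cannot be read"))
        # A noindex URL in the sitemap asks for indexing and forbids it at once.
        if url in listed:
            findings.append((url, "listed in sitemap, mixed signals"))
    return findings
```

Running this after each release against the set of noindex URLs from your crawl gives a short list of conflicts to fix. It does not replace the URL Inspection tool, which remains the source of truth for what the search engine actually saw.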