What Is the X-Robots-Tag in SEO

The X-Robots-Tag is an HTTP response header that tells search engine crawlers how a URL or file may be indexed and displayed. It carries the same core directives as a meta robots tag, such as noindex or nofollow, but is delivered by the web server rather than written into page HTML.

That distinction is what trips people up. Most SEOs reach for robots.txt or a meta tag first, then wonder why a PDF keeps showing up in search, or why a sitewide rule they edited in the CMS does nothing. The X-Robots-Tag often sits one layer above the CMS, set in server config, a reverse proxy, or a CDN, so it never appears in the page source. Knowing where it lives, and what it can do, is the first step to actually controlling it.

How the X-Robots-Tag Works at the HTTP Level

When a server returns a file, it sends a set of HTTP response headers alongside the content. The X-Robots-Tag is one of those headers, and in the raw response it looks like X-Robots-Tag: noindex. Search engine crawlers read it during the fetch and apply the instruction to the resource.
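To make the raw view concrete, here is a minimal Python sketch that pulls X-Robots-Tag values out of a response's header block. The sample response and the function name are made up for illustration, and a real crawler's handling is more involved than this.

```python
def x_robots_from_raw(raw_response: str) -> list[str]:
    """Return every X-Robots-Tag value found in a raw HTTP response."""
    # The header block ends at the first blank line; the body follows it.
    head = raw_response.replace("\r\n", "\n").split("\n\n", 1)[0]
    values = []
    # Skip the status line (e.g. "HTTP/1.1 200 OK"), then read "Name: value" pairs.
    for line in head.split("\n")[1:]:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive.
        if sep and name.strip().lower() == "x-robots-tag":
            values.append(value.strip())
    return values


# Hypothetical response for a PDF download.
SAMPLE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/pdf\r\n"
    "X-Robots-Tag: noindex\r\n"
    "\r\n"
    "%PDF-1.7 ..."
)

print(x_robots_from_raw(SAMPLE))  # ['noindex']
```

The function returns a list because a response can carry the header more than once.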

Because the directive travels with the response, it can be applied to any file the server serves, including PDFs, images, video files, and downloadable assets that do not have an HTML <head> section for meta tags. This is the practical reason the header exists at all: meta robots cannot be placed inside a PDF. The HTTP layer can.

Headers can be scoped to an entire directory, a specific file type, or a single URL. That makes server-level rules easier to maintain on large sites than editing hundreds or thousands of individual pages. According to Google’s documentation on robots meta tags, any directive supported in a meta tag can be used in the X-Robots-Tag header as well, which is why the two are often treated as interchangeable. Setting them up correctly across a site is part of what Clickside handles routinely in technical SEO audits.
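On most sites this scoping is done in web server, proxy, or CDN configuration. As an application-level illustration of the same idea, the sketch below wraps a Python WSGI app and adds the header by directory or file type. The paths, rules, and function names are hypothetical, not taken from any particular setup.

```python
from typing import Optional

# Hypothetical rules: (path prefix, file suffix, directive).
# An empty prefix or suffix matches any path. The first matching rule wins.
RULES = [
    ("/internal-reports/", "", "noindex, nofollow"),  # a whole directory
    ("", ".pdf", "noindex"),                          # a file type, anywhere
]


def directive_for(path: str) -> Optional[str]:
    """Return the X-Robots-Tag value for a request path, or None if no rule applies."""
    for prefix, suffix, directive in RULES:
        if path.startswith(prefix) and path.lower().endswith(suffix):
            return directive
    return None


def add_x_robots_tag(app):
    """Wrap a WSGI app so matching responses carry an X-Robots-Tag header."""
    def wrapped(environ, start_response):
        directive = directive_for(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            if directive is not None:
                headers = list(headers) + [("X-Robots-Tag", directive)]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)
    return wrapped
```

Keeping the rules in one list mirrors the maintenance benefit described above: one entry covers every PDF, with no per-page edits.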

X-Robots-Tag vs robots.txt vs Meta Robots: What Actually Differs

robots.txt

robots.txt controls crawl access. It tells crawlers which URLs they are allowed to visit, and it is checked before a page is fetched.

Meta robots

The meta robots tag is written inside the page HTML and gives indexing and display instructions once the page has been crawled. It only works on resources that have editable HTML. For an HTML page, the same directives can be delivered in two places:

  • Inside the <head> of the HTML page, as a <meta> tag.
  • On the HTTP response of the HTML page, as an X-Robots-Tag header.

X-Robots-Tag

The X-Robots-Tag does the same job as a meta robots tag but lives in the HTTP response header, so it covers PDFs, images, and other non-HTML files, and can be applied centrally at the server.

One production bug worth flagging: a URL blocked by robots.txt may never be fetched, which means an X-Robots-Tag noindex on that URL is never seen, and the URL can stay indexed because of links from elsewhere. The Robots Exclusion Protocol, formalized in RFC 9309, governs crawl access only. Indexing directives such as noindex are a separate signal, not a redundant one.
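The conflict can be sketched with Python's standard urllib.robotparser module. The robots.txt rules, URLs, header values, and the ExampleBot user agent below are all hypothetical. The point is only that the crawl check runs first, so a blocked URL's header is never read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical headers the server would send if each URL were fetched.
SERVER_HEADERS = {
    "https://example.com/private/report.pdf": {"X-Robots-Tag": "noindex"},
    "https://example.com/public/report.pdf": {"X-Robots-Tag": "noindex"},
}

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def noindex_is_visible(url: str, user_agent: str = "ExampleBot") -> bool:
    """True only if a compliant crawler may fetch the URL and would then see noindex."""
    if not parser.can_fetch(user_agent, url):
        return False  # never fetched, so the header is never read
    value = SERVER_HEADERS.get(url, {}).get("X-Robots-Tag", "")
    return "noindex" in value.lower()


for url in SERVER_HEADERS:
    print(url, noindex_is_visible(url))
```

In this sketch the noindex on the /private/ URL is invisible to the crawler, while the identical header on the /public/ URL is seen and can take effect.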

Want a clear map of your X-Robots-Tag, meta robots, and robots.txt signals? Clickside runs focused technical audits that flag the conflicts in one report.

The Main Directives You Can Set with X-Robots-Tag

The header is not just a switch to keep pages out of the index. It carries a range of instructions, organized by what they control.

Indexing controls include noindex, which keeps a page out of the search index, and nofollow, which tells crawlers not to follow links on the page. The shorthand none is equivalent to noindex, nofollow.

Presentation controls shape how a result looks once indexed. MDN documents the full set, which includes noarchive (no cached link in search results), nosnippet (no text preview), max-snippet (a character limit on the preview), max-image-preview (a limit on image preview size), and max-video-preview (a limit in seconds on video previews).

There is also a time-based control: unavailable_after tells search engines to stop showing the result after a specified date. That one is useful for campaign pages, event content, or anything with a built-in shelf life.
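As a rough illustration of how these directives combine in one header value, the Python sketch below splits a value into flags and name:value pairs and expands none. It is a simplification with stated limits: it does not handle a crawler name prefix (such as googlebot: ...) or dates that contain commas, and the sample values are invented.

```python
def parse_x_robots_tag(value: str) -> dict:
    """Parse a comma-separated X-Robots-Tag value into a dict of directives.

    Flags such as noindex map to True; name:value directives map to the value string.
    """
    directives = {}
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        # Split on the first colon only, so values may themselves contain colons.
        name, sep, arg = token.partition(":")
        name = name.strip().lower()
        if name == "none":
            # "none" is shorthand for noindex plus nofollow.
            directives["noindex"] = True
            directives["nofollow"] = True
        elif sep:
            directives[name] = arg.strip()
        else:
            directives[name] = True
    return directives


print(parse_x_robots_tag("noarchive, max-snippet:50, max-image-preview:large"))
print(parse_x_robots_tag("none, unavailable_after: 2025-12-31"))
```

A parsed dict like this makes it easy to compare what a URL is actually sending against what you intended for it.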

When X-Robots-Tag Is the Right Tool for the Job

Use it when the content you want to control is not editable HTML, for example PDFs, image files, or generated downloads, because meta tags cannot be placed inside those file types. Use it when you need a rule to apply across many URLs at once, such as an entire directory of thin or duplicate pages, where a single server-level rule is safer and easier to maintain than hundreds of page edits.

The general rule: if the directive is about how a result should appear, such as nosnippet or noarchive, and the underlying content should still be crawlable so search engines can fully render and evaluate it, the X-Robots-Tag is the right layer. If you want to keep crawlers out entirely, reach for robots.txt instead. If you want to remove something from the index and you control the HTML, the meta tag is simpler. Anything else, or anything that lives outside HTML, points to the header. Working out which directive belongs at which layer is the kind of decision the Clickside team handles routinely in technical SEO audits.

Start by Checking What Your Server Is Actually Sending

Before changing anything, inspect the real HTTP response headers of a few representative URLs using browser developer tools, a curl request, or an online header checker. X-Robots-Tag directives set at the server, CDN, or hosting layer never appear in the HTML source, which is exactly why Search Console sometimes reports a noindex the page does not seem to have.
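If you prefer a script to a browser or curl, a check along these lines is one option. It is a minimal sketch using Python's standard library, and the function names and example URL are placeholders. Note that urlopen follows redirects (so the headers come from the final response), raises an error on 4xx and 5xx statuses, and that some servers answer HEAD requests differently from GET.

```python
import urllib.request
from email.message import Message


def x_robots_from_headers(headers: Message) -> list[str]:
    """Return every X-Robots-Tag value in a parsed header set (case-insensitive lookup)."""
    # get_all returns None when the header is absent.
    return headers.get_all("X-Robots-Tag") or []


def fetch_x_robots_tag(url: str, timeout: float = 10.0) -> list[str]:
    """Send a HEAD request and return the X-Robots-Tag values on the response."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return x_robots_from_headers(response.headers)


# Example call (placeholder URL; swap in a representative page or file from your site):
# print(fetch_x_robots_tag("https://example.com/whitepaper.pdf"))
```

An empty list means the response carried no X-Robots-Tag at all, which is itself useful to confirm before looking for the directive elsewhere.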

The next step is to compare what your headers say against what you actually want indexed, and adjust server rules or meta tags one layer at a time so you can verify the effect after each change. That is the diagnostic that ties the whole topic back together: the directive is hidden in the response, and the only way to know it is doing what you think is to look at the response itself.

Need help reading what your server, CDN, and CMS are sending to crawlers? Reach out to Clickside for a practical indexing audit and a clear next step.