AI Agents, AI Crawlers, AXO, Cloudflare, llms.txt

CLOUDFLARE NOW SERVES YOUR WEBSITE AS MARKDOWN TO AI AGENTS

AUTHOR
Slobodan "Sani" Manic

No Hacks

CXL-certified conversion specialist and WordPress Core Contributor helping companies optimise websites for both humans and AI agents.

I wrote recently about how AI agents see your website. The short version: they don't see your beautiful design. They parse HTML, read the accessibility tree, and strip away everything visual to get to the content underneath.

The problem is that "everything visual" is most of what HTML contains.

A simple ## About Us heading in markdown costs roughly 3 tokens. Its HTML equivalent, <h2 class="section-title" id="about">About Us</h2>, burns 12-15. Add the <div> wrappers, nav bars, script tags, and boilerplate that pad every real web page, and the waste adds up fast.

Cloudflare just decided to fix this at the infrastructure level.

What Cloudflare announced

Markdown for Agents is a new feature that converts HTML to markdown on the fly, at the CDN layer. When an AI agent requests a page and includes Accept: text/markdown in its headers, Cloudflare intercepts the request, fetches the HTML from the origin server, converts it to clean markdown, and serves that instead.
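The agent's side of this negotiation can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and whether a given site actually answers with markdown depends on the feature being enabled for it.

```python
import urllib.request


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for markdown via the Accept header."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def fetch(url: str) -> tuple[str, str]:
    """Fetch the URL and return (content_type, body). Makes a network call."""
    with urllib.request.urlopen(build_markdown_request(url)) as resp:
        content_type = resp.headers.get("Content-Type", "")
        body = resp.read().decode("utf-8", errors="replace")
    return content_type, body
```

An agent would check the returned content type before assuming the body is markdown, since a server that does not support the feature will simply return HTML.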

No changes to your website. No new endpoints to build. The conversion happens at the edge, automatically.

Cloudflare's own blog post drops from 16,180 tokens in HTML to 3,150 in markdown, a reduction of roughly 80%. For AI systems processing millions of pages, that adds up to a significant saving in both cost and latency.
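The arithmetic behind that figure is easy to check. The helper name below is hypothetical; the two token counts are the ones quoted above.

```python
def token_reduction(html_tokens: int, markdown_tokens: int) -> float:
    """Fraction of tokens saved by serving markdown instead of HTML."""
    return 1 - markdown_tokens / html_tokens


reduction = token_reduction(16_180, 3_150)
print(f"{reduction:.1%}")  # prints 80.5%
```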

The response includes an x-markdown-tokens header with the estimated token count, so agents can calculate context window usage before processing the content. It also includes structured YAML frontmatter with the page title, description, and image URL.
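Here is a rough sketch of how an agent might use both pieces. The x-markdown-tokens header name comes from the announcement; the function names are hypothetical, the frontmatter keys in the usage example are assumed, and the frontmatter handling is a deliberately naive "key: value" reader rather than a real YAML parser.

```python
from typing import Optional


def estimated_tokens(headers: dict[str, str]) -> Optional[int]:
    """Read the x-markdown-tokens header; return None if absent or malformed."""
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("x-markdown-tokens", "").strip()
    return int(raw) if raw.isdigit() else None


def fits_budget(headers: dict[str, str], remaining_tokens: int) -> bool:
    """Assumption: an unknown token count is treated as not fitting."""
    count = estimated_tokens(headers)
    return count is not None and count <= remaining_tokens


def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a '---' delimited frontmatter block from the markdown body."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:]).lstrip("\n")
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return {}, markdown  # no closing delimiter: treat it all as body
```

For example, `fits_budget({"x-markdown-tokens": "3150"}, 8000)` lets the agent decide whether to read the page before spending any context on it.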

Already happening

This isn't theoretical. Cloudflare notes that popular coding agents, including Claude Code and OpenCode, already send Accept: text/markdown headers with their requests. The demand exists. Cloudflare is meeting it at the infrastructure layer.

And they're tracking it. Cloudflare Radar now shows the distribution of content types served to AI bots. The current breakdown: 75.2% HTML, 8.4% markdown, 7% JSON. That markdown number will grow.

Here's the part that matters more than the markdown conversion itself.

Every markdown response includes a Content-Signal header: ai-train=yes, search=yes, ai-input=yes. These are part of Cloudflare's Content Signals framework, which lets website owners express preferences for how their content gets used.

Three signals:

  • search: content can be used for traditional search indexing
  • ai-input: content can be fed into AI models for real-time answers (RAG, grounding, generative search)
  • ai-train: content can be used for training or fine-tuning models

This is machine-readable consent attached to the content itself. When you enable Markdown for Agents, Cloudflare includes permissive defaults, but custom policies are coming.
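A well-behaved agent could read those signals before deciding what to do with the content. The sketch below assumes the header value is a comma-separated list of key=value pairs, as in the example above; the function names are hypothetical, and treating a missing signal as "not granted" is a conservative choice of this sketch, not something the framework prescribes.

```python
def parse_content_signal(value: str) -> dict[str, bool]:
    """Parse a value such as 'ai-train=yes, search=yes, ai-input=yes'."""
    signals: dict[str, bool] = {}
    for part in value.split(","):
        key, sep, setting = part.strip().partition("=")
        if sep and key:
            signals[key.strip().lower()] = setting.strip().lower() == "yes"
    return signals


def is_allowed(signals: dict[str, bool], use: str) -> bool:
    """Assumption: a signal that is not present is treated as not granted."""
    return signals.get(use, False)


signals = parse_content_signal("ai-train=yes, search=yes, ai-input=yes")
print(is_allowed(signals, "ai-train"))  # prints True
```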

This matters because the current situation is messy. robots.txt was built for crawlers, not for distinguishing between "index my page" and "train your model on my content." Content Signals gives publishers more granular control. Whether AI companies will respect those signals is another question, but the framework is sound.

The discoverability gap

There's one thing Cloudflare's approach doesn't solve: how does an agent know markdown is available?

The feature relies entirely on content negotiation. The agent has to proactively send Accept: text/markdown and hope the server supports it. There's no way for a page to advertise "I have a markdown version" to agents that haven't asked.

Joost de Valk, the original creator of Yoast SEO, noticed this gap. His WordPress plugin takes a different approach: it adds <link rel="alternate" type="text/markdown"> tags to each page's HTML <head> and creates dedicated .md URLs for each post. Agents can discover markdown availability through standard HTML link relations, no guesswork required.
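On the agent side, discovery through link relations might look something like this. It is a minimal sketch built on Python's standard html.parser module; the class and function names are hypothetical, and the href value in the usage example is an assumed URL shape, not the plugin's actual output.

```python
from html.parser import HTMLParser


class MarkdownLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="text/markdown">."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_markdown = attr.get("type", "").lower() == "text/markdown"
        if "alternate" in rels and is_markdown and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_markdown_alternates(html: str) -> list[str]:
    """Return every advertised markdown alternate found in the HTML."""
    finder = MarkdownLinkFinder()
    finder.feed(html)
    return finder.hrefs
```

Given a page whose head contains `<link rel="alternate" type="text/markdown" href="/post.md">`, the function returns `["/post.md"]`, which the agent can then resolve against the page URL and fetch directly.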

The two approaches are complementary. Cloudflare handles the conversion at scale. Joost's <link rel="alternate"> approach handles discoverability. If you're running WordPress behind Cloudflare, you could use both.

What this means

Cloudflare handles roughly 20% of all web traffic. When a company at that scale builds infrastructure specifically for serving content to AI agents, it validates what I've been saying: the shift from a human-only web to a human-plus-agent web is real and accelerating.

But markdown is just the content layer. It solves the "reading" problem, where agents can consume your content more efficiently. It doesn't solve the "doing" problem, where agents need to interact with your website: fill in forms, complete purchases. That's where standards like WebMCP come in.

Think of it as a stack:

  • Content layer: Markdown for Agents, llms.txt, structured data
  • Interaction layer: WebMCP, MCP, agentic browser APIs
  • Consent layer: Content Signals, robots.txt

We're watching each layer get built out in real time. If you've been following what we cover on the podcast and on this blog, none of this should be surprising. But the speed at which infrastructure providers are moving should be.

If you're on a Cloudflare Pro plan or above, enable Markdown for Agents today. It's free, it's in beta, and it takes one toggle. Your content is already being consumed by AI agents. You might as well make it easier for them.
