llms.txt is a plain-text Markdown file placed at the root of a website (yourdomain.com/llms.txt) that tells AI language models which pages are most relevant for training, indexing, and citation. It was proposed in 2024 by Jeremy Howard, co-founder of fast.ai and Answer.AI, and has since been adopted by hundreds of sites, including Cloudflare, Anthropic, and Perplexity.
The analogy to robots.txt is intentional but imprecise. robots.txt tells crawlers what not to crawl. llms.txt tells AI systems what to prioritise: it is an opt-in, affirmative signal, not a restriction mechanism. A site without an llms.txt is still crawlable; a site with a well-structured one gives AI systems a curated map that makes it more likely the right content is surfaced.
Which AI crawlers respect llms.txt
Adoption is growing but uneven. As of mid-2025, the reported status for each crawler is:
| Crawler | Operator | llms.txt support |
|---|---|---|
| ClaudeBot | Anthropic | Yes |
| OAI-SearchBot | OpenAI | Partial (robots.txt primary) |
| PerplexityBot | Perplexity AI | Yes |
| GoogleOther | Google | Under evaluation |
| Applebot-Extended | Apple | No |
| ChatGPT-User | OpenAI | Partial |
The llms.txt format
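The file is ordinary Markdown arranged in a fixed order. In the llmstxt.org specification only the H1 title is required; the blockquote summary, any free-form detail paragraphs, and the H2 sections of links are all optional. The building blocks are: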
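| Syntax | Meaning |
|---|---|
| `# Site name` | H1: the name of your site or product |
| `> Tagline` | Blockquote: one-sentence description used as context |
| `## Section` | H2: groups related pages together |
| `- [Title](URL): description` | Linked list items: pages with brief descriptions |
| `## Optional` | H2 with a reserved meaning: secondary links (docs, API reference, examples) that can be skipped when a shorter context is needed |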
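To make the structure concrete, here is a minimal parsing sketch in Python. It is not the reference parser from llmstxt.org: the `LlmsTxt` class, the `parse_llms_txt` function, and the sample file are illustrative names and data, and the sketch only handles the H1, the blockquote, and H2 link lists.

```python
import re
from dataclasses import dataclass, field

# Matches a link list item: "- [Title](URL)" with an optional ": description".
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # section name -> list of (title, url, description) tuples
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, blockquote, and H2 link lists; other lines are ignored."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    summary_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    doc.summary = " ".join(summary_lines)
    return doc


# Hypothetical sample file used only to exercise the parser.
sample = """# Example Co

> One-sentence description of the site.

## Docs

- [Quickstart](/docs/quickstart): Install and run in five minutes
- [API reference](/docs/api)
"""

parsed = parse_llms_txt(sample)
print(parsed.title)
print(parsed.sections["Docs"])
```

A parser like this is mostly useful as a lint step: if a section comes back empty, the link items in the file probably do not follow the `- [Title](URL): description` shape.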
A template for marketing sites
Here is the structure we use on client projects. The goal is to give AI systems enough context to understand what the company does, who it serves, and where the authoritative content lives — in under 150 lines.
    # Nous Frame

    > Independent web design studio. We design, build, and maintain
    > conversion-focused websites with editorial craft and technical precision.

    Nous Frame works with ambitious brands — primarily in tech, finance, and professional services — to ship websites that combine visual excellence with measurable commercial outcomes.

    ## Services

    - [Web Design & Development](/services): Custom-built marketing sites, landing pages, and web applications. No page builders.
    - [SEO & GEO Optimisation](/services#seo): Technical SEO, Core Web Vitals optimisation, and Generative Engine Optimisation for AI search.
    - [Ongoing Maintenance](/services#maintenance): Hosting, security, and iterative improvement post-launch.

    ## Resources

    - [What is GEO](/resources/geo-vs-seo): How Generative Engine Optimisation differs from classic SEO and why both are now required.
    - [Core Web Vitals 2026](/resources/core-web-vitals-2026): The three metrics Google ranks on and how to hit 90+ on mobile.
    - [Schema.org for marketing sites](/resources/schema-org-marketing): Minimum viable structured data graph for a professional services site.

    ## Optional

    - [llms.txt](/llms.txt): This file
    - [Sitemap](/sitemap.xml): Full site index
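If the file is generated at build time rather than written by hand, the 150-line target can be enforced automatically. The sketch below is one possible approach, not part of any specification: `render_llms_txt` and its parameters are hypothetical names, and the input is a shortened version of the template above.

```python
def render_llms_txt(name, summary, details, sections, max_lines=150):
    """Render an llms.txt string and fail if it exceeds the line budget."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    if details:
        lines += [details, ""]
    # Dicts keep insertion order, so sections appear as they were declared.
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    text = "\n".join(lines).rstrip() + "\n"
    line_count = text.count("\n")
    if line_count > max_lines:
        raise ValueError(f"llms.txt has {line_count} lines; budget is {max_lines}")
    return text


llms_txt = render_llms_txt(
    name="Nous Frame",
    summary="Independent web design studio.",
    details="",
    sections={
        "Services": [
            ("Web Design & Development", "/services", "Custom-built marketing sites.")
        ],
        "Optional": [("Sitemap", "/sitemap.xml", "Full site index")],
    },
)
print(llms_txt)
```

Raising an error instead of silently truncating keeps the decision about what to cut with a person, which matters because the file is meant to be a curated list.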
llms-full.txt: the extended variant
The specification also defines an llms-full.txt variant that includes the full text of key pages rather than just links. This is particularly useful for documentation-heavy sites where AI systems frequently need the full content of a reference page (API docs, technical specs, policy documents). For most marketing sites, the standard llms.txt is sufficient; an AI crawler can follow the links and retrieve the content itself.
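As a rough illustration of the difference, the sketch below concatenates page bodies into a single Markdown document. The layout (one H2 per page with a `Source:` line) is an assumption made for this example, not a format the specification mandates, and `build_llms_full` and the sample pages are made up.

```python
def build_llms_full(site_name, pages):
    """Concatenate page bodies into one Markdown document.

    `pages` is a list of (title, url, body) tuples. The layout used here is
    an assumed convention for illustration only.
    """
    parts = [f"# {site_name}", ""]
    for title, url, body in pages:
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical page content; a real build would read rendered Markdown files.
full_text = build_llms_full(
    "Example Docs",
    [
        ("Authentication", "/docs/auth", "Send the API key in the Authorization header.\n"),
        ("Rate limits", "/docs/limits", "Requests are limited per key."),
    ],
)
print(full_text)
```

Because the output grows with every page included, it is worth restricting the input to reference material that benefits from being read in full.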
What llms.txt does not do
Adding an llms.txt file does not guarantee your content will be cited. It is a discoverability signal, not a ranking guarantee. The quality, specificity, and authority of the content on the pages you list determine citation frequency. An llms.txt pointing to thin, vague content will not move the needle. Think of it as the index at the front of a textbook: it only helps if the chapters are worth reading.
It also does not prevent AI training on your content. To opt out of AI training crawls, you still need to use robots.txt directives targeting specific user agents (e.g., `User-agent: CCBot` followed on the next line by `Disallow: /` to block Common Crawl). These are separate mechanisms with separate purposes.
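One way to confirm that a robots.txt rule does what you expect is to test it with Python's standard-library `urllib.robotparser`. The rules and paths below are an illustrative example, not a recommended policy, and the parser only shows how a compliant crawler should read the file; it cannot show what any given crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block Common Crawl's CCBot, allow everyone else.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# CCBot is disallowed everywhere; other agents fall through to the "*" group.
print(parser.can_fetch("CCBot", "/resources/geo-vs-seo"))      # False
print(parser.can_fetch("ClaudeBot", "/resources/geo-vs-seo"))  # True
print(parser.can_fetch("ClaudeBot", "/llms.txt"))              # True
```

Note that nothing in this check involves llms.txt itself: the file stays reachable to any agent robots.txt allows, which is why the two mechanisms have to be configured separately.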
Sources
1. Howard, J. (2024). "A proposed standard for /llms.txt files." llmstxt.org.
2. Cloudflare Developers (2024). "Using llms.txt with Cloudflare AI Gateway." Cloudflare Documentation.
3. OpenAI (2024). "GPTBot and OAI-SearchBot." OpenAI Platform Documentation.
4. Anthropic (2024). "ClaudeBot and robots.txt / llms.txt support." Anthropic Help Center.