
Is Your Website Optimized For LLMs?

2026-05-22

Making Your Website Visible to AI: What Actually Works, and What Is Just Noise

Search is shrinking and AI assistants are becoming the front door to the web. Here is what genuinely makes your site visible to LLMs, the code to ship it, and what is just expensive hype.

The way people discover companies just changed, and most websites have not caught up.

When a founder wants a development studio, more of them open ChatGPT or Claude before they ever open Google. When a shopper wants the best cold brew ratio, they ask Perplexity. The AI reads the web, decides who is credible, and names two or three brands in its answer. If your name is in that answer, you just received the warmest introduction on earth. If it is not, you do not even know the conversation happened.

For thirty years, being found meant ranking on a page of blue links. That era is closing. Search volume is shrinking while AI assistants quietly become the front door to the web, and they do not hand out ten options. They hand out a short list, and they decide who makes it.

So the question every founder should be asking right now is simple. Can the robot find you, and when it does, does it understand you well enough to recommend you?

We have been living inside that question, for our own studio and for our clients, and we want to give you the honest version. Not the hype. The tested version, with the code to ship it.

First, a confession about the hype

Most of what gets published under the banner of "AI SEO" is guesswork wearing a strategy costume. Someone invents a meta tag, writes a blog post about it, and forty other blog posts cite that first post as proof. Nobody ever checks whether a single AI system actually reads the thing.

We are not going to do that to you. Everything below is either backed by a real experiment with real server logs, or we tell you plainly that it is a reasonable bet on the future. We picked up a lot of this discipline from a sharp piece by the engineering team at Evil Martians, "Making your site visible to LLMs", which tested fourteen techniques on their own site and found that only six earned their keep. Read it. It is excellent. Consider this article the build it yourself companion to it.

What actually moves the needle

The whole game comes down to one idea. Large language models understand clean, well structured text better than anything else. Every good move you can make is some version of getting your best content to an AI in the clearest possible form.

Three principles carry the weight. The rest is plumbing.

Principle one: do not lock the front door

This is step zero, it is boring, and it is the one that quietly bites people. Plenty of sites disallow crawlers like GPTBot and ClaudeBot in their robots.txt file without ever realizing it, and Cloudflare has been known to block AI bots by default on newer domains. If the door is locked, nothing else on this list matters.

Here is a robots.txt that welcomes the crawlers you want and tells them what they may do with what they find:

# robots.txt: allow AI assistants to read and cite you

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# Content-Signal is an emerging Cloudflare convention.
# search   = may appear in search results
# ai-input = may be used as live context for an AI answer
# ai-train = may be used to train a model
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=yes
Sitemap: https://yourdomain.com/sitemap.xml

Set those signals to match your own policy. If you do not want your content used for model training, write ai-train=no and keep the other two as yes. The point is to make that decision on purpose instead of by accident.
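To confirm the door is actually open, you can run a quick check against the text of your own robots.txt. What follows is a minimal sketch, not a full robots.txt parser: `blockedBots` is a hypothetical helper name, and it only detects a blanket `Disallow: /`, falling back to the `*` group when a bot has no group of its own. Path-specific rules, wildcards, and Allow versus Disallow precedence are ignored.

```javascript
// Hypothetical helper: report which bots are shut out of the whole site.
// Simplified on purpose: only a blanket "Disallow: /" counts as blocked.
function blockedBots(robotsTxt, bots) {
  const groups = new Map();   // lowercased user agent -> list of [directive, value]
  let current = [];           // user agents named by the group being read
  let readingAgents = false;  // true while consecutive User-agent lines stack up

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const key = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (key === 'user-agent') {
      if (!readingAgents) current = []; // a new group starts here
      readingAgents = true;
      const agent = value.toLowerCase();
      current.push(agent);
      if (!groups.has(agent)) groups.set(agent, []);
    } else {
      readingAgents = false;
      for (const agent of current) groups.get(agent).push([key, value]);
    }
  }

  return bots.filter((bot) => {
    const rules = groups.get(bot.toLowerCase()) || groups.get('*') || [];
    return rules.some(([key, value]) => key === 'disallow' && value === '/');
  });
}

// Made-up sample file that locks out one crawler without meaning to.
const sampleRobots = [
  'User-agent: GPTBot',
  'Disallow: /',
  '',
  'User-agent: *',
  'Allow: /',
].join('\n');

console.log(blockedBots(sampleRobots, ['GPTBot', 'ClaudeBot', 'PerplexityBot']));
// logs [ 'GPTBot' ]
```

An empty result means none of the listed bots hit a blanket block. It does not prove that a CDN or firewall in front of your site is letting them through, so treat it as a first pass only.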

Principle two: write content a model can trust

This is the principle with the most research behind it, and it is the one most teams skip because it is real work rather than a config change.

The foundational study on the subject came out of Princeton and IIT Delhi and was published at KDD 2024. The researchers tested nine content strategies across ten thousand queries, and the findings were not subtle:

Citing authoritative sources improved visibility for previously low ranked content by roughly 115 percent. Adding relevant statistics lifted it by about 33 percent. Including direct quotations raised it by about 43 percent.

Source: GEO: Generative Engine Optimization, Princeton and IIT Delhi, KDD 2024

Every strategy that worked was about enriching the visible words on the page. Substance, evidence, specifics. Vague marketing copy does not get cited. A page that states something concrete and backs it up does.

In practice that means writing pages that lead with a clear answer, then support it. Compare these two openings for the same page:

<!-- Weak: vague, unsourced, nothing for an AI to extract -->
## Our Approach to Web Performance

We are passionate about building fast, modern websites that
delight users and help your business grow.

<!-- Strong: definition first, concrete, sourced -->
## Web Performance

A site should reach interactive in under 2.5 seconds on a
mid range mobile device. Google found that the probability of
a bounce rises 32 percent as page load time goes from 1 to 3
seconds (Google / SOASTA research, 2017). We treat 2.5
seconds as a hard budget, not a goal.

The second version is the one an AI can lift cleanly into an answer and attribute to you. Write every important page that way.

Principle three: hand the AI a clean copy

A typical web page is mostly navigation, scripts, and clutter. The real content might be a fifth of what loads. The emerging convention is to also serve a plain text Markdown version of your important pages, and to keep a small file at your site root called llms.txt that acts as a curated map of your best content. Think of it as a README written for an AI instead of for a developer.

The format, from the llmstxt.org specification, is intentionally minimal: an H1 with your name, a blockquote summary, then H2 sections of annotated links.

# Cause of a Kind

> Cause of a Kind is a full stack product development and creative
> studio in New York. We help cool people build great products.

## Services

- [Product Development](/services/product): Idea to launched software
- [Fractional Leadership](/services/fractionals): Chief level expertise on demand
- [Creative and Marketing](/services/marketing): Brand, content, and SEO

## Writing

- [Making your site visible to AI](/blog/visible-to-ai): This article, in clean Markdown
- [Field notes on bootstrapping](/blog): What we learn building in public

## Contact

- [Start a project](/contact): How to work with us

That is the whole file. Five minutes to write. Save it at yourdomain.com/llms.txt and you have handed any AI a clean map of who you are.
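If your services and posts already live in a CMS or a config file, you may prefer to generate the file rather than hand edit it. Here is a minimal sketch, assuming a hypothetical `buildLlmsTxt` helper and an input shape we made up for illustration (`name`, `summary`, and `sections` keyed by heading). It follows the layout described above: H1, blockquote summary, then H2 sections of annotated links.

```javascript
// Hypothetical helper: build llms.txt text from a plain object.
// The input shape is an assumption for this sketch, not part of any spec.
function buildLlmsTxt({ name, summary, sections }) {
  const lines = [`# ${name}`, '', `> ${summary}`];
  for (const [heading, links] of Object.entries(sections)) {
    lines.push('', `## ${heading}`, '');
    for (const { title, url, note } of links) {
      lines.push(`- [${title}](${url}): ${note}`);
    }
  }
  return lines.join('\n') + '\n';
}

const llmsTxt = buildLlmsTxt({
  name: 'Cause of a Kind',
  summary: 'A full stack product development and creative studio in New York.',
  sections: {
    Services: [
      { title: 'Product Development', url: '/services/product', note: 'Idea to launched software' },
    ],
    Contact: [
      { title: 'Start a project', url: '/contact', note: 'How to work with us' },
    ],
  },
});

console.log(llmsTxt);
```

Write the returned string to `llms.txt` at your site root during your build, and the file stays in step with the rest of your content.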

The technical layer: copy, paste, ship

The llms.txt file points at clean content, so the next job is making sure that clean content actually exists and is easy to find. Three small pieces of plumbing do that, and none takes more than an hour.

Serve a Markdown twin of every page. The convention is the same URL with .md appended, so /blog/my-post has a twin at /blog/my-post.md. One route handler covers both the URL suffix and the Accept: text/markdown header that coding assistants like Claude Code and Cursor already send:

// One handler: Markdown when asked for, HTML otherwise
app.get('/blog/:slug', async (req, res) => {
  const isMarkdownUrl = req.params.slug.endsWith('.md');
  const slug = req.params.slug.replace(/\.md$/, '');

  const post = await getPost(slug);
  if (!post) return res.status(404).send('Not found');

  const accept = req.headers.accept || '';
  const wantsMarkdown = isMarkdownUrl || accept.includes('text/markdown');

  // Vary tells CDNs to cache the two formats separately
  res.set('Vary', 'Accept');

  if (wantsMarkdown) {
    res.set('Content-Type', 'text/markdown; charset=utf-8');
    return res.send(post.markdown);
  }
  return res.send(renderHtml(post));
});

Same URL, same content, different format, declared with Vary: Accept. That is ordinary HTTP doing what it has always done, not cloaking. One warning: if you maintain HTML and Markdown separately they will drift, so generate one from the other or serve from a single source.
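One subtlety in the handler above: a substring check on the Accept header also matches `text/markdown;q=0`, which in HTTP content negotiation means the client explicitly does not accept Markdown. A stricter check could look like the sketch below. `acceptsMarkdown` is a hypothetical helper name, and it is deliberately simplified: it reads only the q value for `text/markdown` and does not rank Markdown against the other types in the header or treat `*/*` as a request for Markdown.

```javascript
// Hypothetical helper: true only if the Accept header lists text/markdown
// with a quality value above zero (q defaults to 1 when omitted).
function acceptsMarkdown(acceptHeader) {
  for (const part of (acceptHeader || '').split(',')) {
    const [type, ...params] = part.split(';').map((s) => s.trim().toLowerCase());
    if (type !== 'text/markdown') continue;
    const qParam = params.find((p) => p.startsWith('q='));
    const q = qParam ? parseFloat(qParam.slice(2)) : 1;
    // Treat an unparseable q as the default rather than rejecting the client.
    return Number.isNaN(q) ? true : q > 0;
  }
  return false;
}

console.log(acceptsMarkdown('text/markdown, text/html;q=0.8')); // true
console.log(acceptsMarkdown('text/html, text/markdown;q=0'));   // false
console.log(acceptsMarkdown('text/html'));                      // false
```

In the handler, this would take the place of `accept.includes('text/markdown')`, with the `.md` suffix check left as it is.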

Advertise the Markdown version. Add one tag to the <head> of every page so any crawler reading your HTML knows the clean copy exists:

<link rel="alternate" type="text/markdown" href="/blog/my-post.md" />

Then send the same fact in an HTTP header, so agents that never parse your HTML body still see it:

Link: </blog/my-post.md>; rel="alternate"; type="text/markdown"
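One way to send that header from an Express app is a small piece of middleware. This is a sketch under assumptions: `advertiseMarkdown` is a hypothetical name, it only covers URLs shaped like `/blog/<slug>`, and it does not check that the post exists, so a real version would likely sit closer to your route logic. The stub objects at the bottom stand in for Express's `req` and `res` so the snippet runs on its own.

```javascript
// Hypothetical Express-style middleware: advertise the Markdown twin of a post.
function advertiseMarkdown(req, res, next) {
  const isBlogPost = /^\/blog\/[^/]+$/.test(req.path);
  // Skip requests that already target the Markdown twin itself.
  if (isBlogPost && !req.path.endsWith('.md')) {
    res.set('Link', `<${req.path}.md>; rel="alternate"; type="text/markdown"`);
  }
  next();
}

// In a real app, register it before the blog route: app.use(advertiseMarkdown);

// Demo with stub objects in place of Express's req and res.
const demoHeaders = {};
const demoRes = { set: (name, value) => { demoHeaders[name] = value; } };
advertiseMarkdown({ path: '/blog/my-post' }, demoRes, () => {});
console.log(demoHeaders.Link);
// logs </blog/my-post.md>; rel="alternate"; type="text/markdown"
```

Note that `res.set` replaces any Link header set earlier in the chain. If you already send Link headers for other purposes, appending is the safer choice.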

Leave a plain language hint for pasted URLs. When someone pastes your URL straight into ChatGPT, the model reads rendered page text. A visually hidden note tells it where the clean copy lives:

<div class="sr-only" aria-hidden="true">
  A Markdown version of this page is available at
  https://yourdomain.com/blog/my-post.md, optimized for AI tools.
</div>

.sr-only {
  position: absolute;
  width: 1px;
  height: 1px;
  margin: -1px;
  padding: 0;
  overflow: hidden;
  clip-path: inset(50%);
  white-space: nowrap;
}

Use aria-hidden="true" so screen readers skip it. This message is for machines, not for assistive technology.

What the experiments actually say about llms.txt

We like llms.txt as good housekeeping. We do not like how it is being sold.

Two independent experiments in early 2026 put it to the test. The agency Reboot published llms.txt files on two domains, set up so that the only way an AI bot could discover certain test pages was through that file, then watched the logs. Three months later, no AI bots had visited the llms.txt files at all, even while those same bots happily crawled other pages on the same sites. Their writeup is here: LLMs.txt GEO experiment.

OtterlyAI ran a ninety day study and reached the same place:

Only about a tenth of one percent of AI crawler requests touched llms.txt at all. The file received far fewer AI visits than an average content page. llms.txt is infrastructure for AI integrations, not a ranking factor for AI search.

Source: The llms.txt experiment, OtterlyAI, 2026

For balance, read the other side too. Mintlify, which auto generates these files for its customers, makes the optimistic case and points to Vercel reporting that ten percent of its signups now arrive from ChatGPT: The value of llms.txt, hype or real.

So where does that leave you? Ship the llms.txt file. It takes five minutes, it costs nothing, and it genuinely helps in the situation that happens constantly, where a human pastes your URL into ChatGPT and the tool follows your links. Just do not believe anyone who tells you it is the whole strategy. It is a polite handshake, not a growth engine.
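You do not have to take anyone's word for it, ours included. If you already keep request logs, you can compute the same kind of number for your own site. This is a minimal sketch: `llmsTxtShare` is a hypothetical helper, the `{ userAgent, path }` entry shape is an assumption about how your logs are parsed, and the sample entries are made up.

```javascript
// Hypothetical helper: what share of AI crawler requests touched /llms.txt?
// Returns null when the logs contain no AI crawler requests at all.
function llmsTxtShare(entries) {
  const crawlerHits = entries.filter((entry) =>
    /gptbot|claudebot|perplexitybot/i.test(entry.userAgent || ''));
  if (crawlerHits.length === 0) return null;
  const llmsHits = crawlerHits.filter((entry) => entry.path === '/llms.txt').length;
  return llmsHits / crawlerHits.length;
}

// Made-up log entries for illustration only.
const sampleEntries = [
  { userAgent: 'ExampleAgent (compatible; GPTBot)', path: '/blog/my-post' },
  { userAgent: 'ExampleAgent (compatible; ClaudeBot)', path: '/llms.txt' },
  { userAgent: 'ExampleAgent (compatible; PerplexityBot)', path: '/services/product' },
  { userAgent: 'ExampleAgent (compatible; GPTBot)', path: '/' },
  { userAgent: 'Mozilla/5.0 (ordinary browser)', path: '/llms.txt' },
];

console.log(llmsTxtShare(sampleEntries)); // 0.25
```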

The stuff to skip

Save your money on these. There is no evidence behind them.

Dedicated "AI info" pages, mystery meta tags, hidden HTML comments aimed at bots, and human versus AI toggle buttons. AI agents do not click buttons, most parsers strip comments before they ever read a page, and a well written normal page already does everything a special "AI page" claims to do. Even structured data, the schema markup that SEO teams have leaned on for years, has been shown in controlled tests to be mostly invisible to ChatGPT, Claude, and Perplexity, which treat it as plain text on a page. Keep your schema, because it still helps classic Google. Just do not expect it to win you AI citations on its own.

The part nobody wants to hear: you have to measure

Here is the uncomfortable truth. You do not actually know if any of this is working unless you watch the logs.

Traditional analytics will fail you here, because AI crawlers do not run JavaScript. You need server side request logs that capture the raw user agent, so you can see GPTBot, ClaudeBot, and PerplexityBot arriving, plus referrer data so you can catch real humans landing on your site from chatgpt.com or claude.ai. A small classifier in your request logging is enough to start:

// Tag AI crawler and AI referral traffic in your access logs
function classifyTraffic(req) {
  const ua = (req.headers['user-agent'] || '').toLowerCase();
  const ref = (req.headers.referer || '').toLowerCase();

  // Google-Extended is a robots.txt control token, not a crawler user agent,
  // so it never shows up in request logs and is left out of this pattern.
  const aiCrawler = /gptbot|claudebot|perplexitybot/.test(ua);
  const aiReferral = /chatgpt\.com|claude\.ai|perplexity\.ai/.test(ref);

  return { aiCrawler, aiReferral, path: req.path };
}

Log that, watch it weekly, and within a month you will know whether the crawlers are arriving and whether humans are landing on your site from AI answers.
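For the weekly check, a per crawler and per referrer tally is usually more useful than raw log lines. The sketch below assumes your logs can be read back as objects with `userAgent` and `referer` fields. That shape, the `summarizeAiTraffic` name, and the sample entries are all ours, made up for illustration.

```javascript
// Substrings to look for; extend these lists to match the traffic you care about.
const AI_CRAWLERS = ['gptbot', 'claudebot', 'perplexitybot'];
const AI_REFERRERS = ['chatgpt.com', 'claude.ai', 'perplexity.ai'];

// Hypothetical helper: count AI crawler visits and AI referrals in a batch of log entries.
function summarizeAiTraffic(entries) {
  const summary = { crawlers: {}, referrals: {} };
  for (const { userAgent = '', referer = '' } of entries) {
    const ua = userAgent.toLowerCase();
    const ref = referer.toLowerCase();
    const crawler = AI_CRAWLERS.find((name) => ua.includes(name));
    const source = AI_REFERRERS.find((host) => ref.includes(host));
    if (crawler) summary.crawlers[crawler] = (summary.crawlers[crawler] || 0) + 1;
    if (source) summary.referrals[source] = (summary.referrals[source] || 0) + 1;
  }
  return summary;
}

// Made-up entries for illustration only.
const weekOfLogs = [
  { userAgent: 'ExampleAgent (compatible; GPTBot)', referer: '' },
  { userAgent: 'ExampleAgent (compatible; GPTBot)', referer: '' },
  { userAgent: 'ExampleAgent (compatible; ClaudeBot)', referer: '' },
  { userAgent: 'Mozilla/5.0 (ordinary browser)', referer: 'https://chatgpt.com/' },
];

console.log(summarizeAiTraffic(weekOfLogs));
// logs { crawlers: { gptbot: 2, claudebot: 1 }, referrals: { 'chatgpt.com': 1 } }
```

Keep each week's summary and compare them. A single count tells you little, but the direction over several weeks tells you whether anything changed.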

You should also measure the output, not just the input. Search Engine Land published a clean playbook for this: pick five to ten queries that genuinely matter to your business, ask the major AIs those queries on a schedule, and log whether your brand showed up. Snapshots over time tell you the truth. Their guide is here: How to measure brand visibility in AI search, and their broader argument for why measurement is the foundation of this entire discipline is worth your time too: LLM optimization in 2026.
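The bookkeeping side of that playbook is small enough to sketch. The code below assumes you have already collected the answers, by hand or through whatever API access you have, and only handles the logging step. `recordMentions` and `mentionRate` are hypothetical helpers, and the sample answers are invented. A plain substring match is a crude test that will miss misspellings and paraphrases, so spot check the results.

```javascript
// Hypothetical helper: turn collected AI answers into dated mention records.
function recordMentions(brand, answers, date) {
  const needle = brand.toLowerCase();
  return answers.map(({ assistant, query, text }) => ({
    date,
    assistant,
    query,
    mentioned: text.toLowerCase().includes(needle),
  }));
}

// Share of records in which the brand appeared (0 when there are no records).
function mentionRate(records) {
  if (records.length === 0) return 0;
  return records.filter((record) => record.mentioned).length / records.length;
}

// Invented answers for illustration only.
const snapshot = recordMentions('Cause of a Kind', [
  { assistant: 'assistant-a', query: 'best product studio in New York', text: 'Options include Cause of a Kind and others.' },
  { assistant: 'assistant-b', query: 'best product studio in New York', text: 'Here are three studios to consider.' },
], '2026-05-22');

console.log(mentionRate(snapshot)); // 0.5
```

Append each run's records to a file or table, and the snapshots over time become the trend line the playbook is after.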

One last piece of grounding from people we trust. Search Engine Land also ran the numbers on the wider tactic landscape in 12 proven LLM visibility tactics, and the headline from Google belongs on every founder's wall:

Google's John Mueller has put it plainly: there is no GEO or AEO without SEO fundamentals. Tricks come out, they work for a short time, and companies that intend to last should bet on what is proven and stable instead.

Reported by Search Engine Land, 12 proven LLM visibility tactics, 2026

Your prescription for success

If you do nothing else, do these, in this order. Each line is a real task, not a vibe.

  1. Audit robots.txt this week. Confirm GPTBot, ClaudeBot, and PerplexityBot are allowed, and add a Content-Signal line that matches your policy. Ten minutes.
  2. Ship llms.txt today. One static Markdown file at your site root. Five minutes, zero risk.
  3. Rewrite your five most important pages. Lead with a clear answer, add a real statistic, cite a real source. This is the work the research says actually moves citations, and it is the step most teams skip.
  4. Serve .md twins with content negotiation. Same URL with .md appended, honoring Accept: text/markdown, with Vary: Accept set.
  5. Advertise the Markdown. Add the <link rel="alternate"> tag and the HTTP Link header. One template change, one middleware line.
  6. Add the visually hidden hint. One small component for the pasted URL case.
  7. Instrument your logs. Tag AI crawler and AI referral traffic so you can see what is real rather than guessing.
  8. Run the public scanners. Check your work at acceptmarkdown.com and isitagentready.com until they pass.

Steps one through three deliver most of the value and most teams can finish them in a single afternoon. Steps four through eight are how a real engineering team finishes the job properly.

The COAK take

Strip away the acronyms and here is what is really going on. The web got a new reader. It is patient, it is literal, it has zero tolerance for clutter, and it forms opinions about who is trustworthy. Your job is not to trick it. Your job is to be genuinely legible and genuinely credible, which, funny enough, is the same job you always had.

Be the brand that says something concrete. Be the page that backs a claim with a number and a source. Be the site that loads clean and does not bury its best thinking under scripts and popups. Do that, and you become the name the AI is comfortable handing to a stranger who needs exactly what you do.

That is the whole thing. Be useful, be clear, be trustworthy, and the introductions take care of themselves. The principles have not changed. The audience has, and this part of it now talks to millions of people every single day.

If you want a partner who treats this as engineering rather than fortune telling, that is what we do. Cause of a Kind is full stack, full service, on shore and in house. We help cool people build great products, and lately that includes making sure the robots know your name.

Forward to Extraordinary.

Cause of a Kind
causeofakind.com
