fix(docs): emit .md URLs in AI indexes + build-time cross-link validator #276

Merged: tonytlwu merged 1 commit into master from fix/ai-docs-md-urls on Jun 7, 2026

@tonytlwu tonytlwu commented Jun 7, 2026

Contributor

Summary

  • All AI discovery surfaces (llms.txt, llms-full.txt, agent-skills SKILL.md files, V3 library catalog docUrl) now emit raw .md URLs instead of .html. This lets Studio's fetch_fliplet_doc short-circuit take effect immediately: web_fetch refuses non-Google HTML, so every .html URL wasted a round-trip before falling back to .md anyway.
  • Adds validateCrossLinks() to bin/build-agent-indexes.mjs, a fence-aware build-time link-rot detector wired into --strict mode. It resolves ../, ./, /-rooted, bare, and .md/.html/extensionless links, and skips external hosts and asset extensions. Escape hatches: <!-- lint-ignore-link --> on the offending link, or LINK_ALLOWLIST for known exceptions.
  • Author docs updated in CONTRIBUTING.md ("Internal links" section) and CLAUDE.md (build pipeline note).
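
For reviewers who want a feel for what a fence-aware cross-link check involves, here is a minimal sketch. It is not the code from bin/build-agent-indexes.mjs: only the names validateCrossLinks, LINK_ALLOWLIST, and the lint-ignore-link marker come from this PR. The helper names, the `pages` map shape, the allowlist entry, the asset extension list, and the docs.invalid placeholder host are all assumptions made for illustration. As a simplification, the ignore marker here suppresses every link on its line.

```javascript
// Illustrative sketch only; the real validator may differ in every detail.
const LINK_ALLOWLIST = new Set(["/legacy/old-page"]); // hypothetical entry
const ASSET_EXT = /\.(png|jpe?g|gif|svg|css|js|pdf|zip)$/i; // assumed asset list
const IGNORE_MARKER = "<!-- lint-ignore-link -->";

// Collect [text](href) targets from Markdown, skipping fenced code blocks.
function extractLinks(markdown) {
  const links = [];
  let fence = null; // fence character that opened the current block, if any
  for (const line of markdown.split("\n")) {
    const m = line.match(/^\s*(`{3,}|~{3,})/);
    if (m) {
      if (fence === null) fence = m[1][0]; // opening fence
      else if (m[1][0] === fence) fence = null; // matching closing fence
      continue;
    }
    if (fence !== null || line.includes(IGNORE_MARKER)) continue;
    for (const match of line.matchAll(/\[[^\]]*\]\(([^)\s]+)\)/g)) {
      links.push(match[1]);
    }
  }
  return links;
}

// Resolve href relative to the page that contains it. Returns a site-rooted
// path without .md/.html, or null when the link should not be checked.
function resolveLink(href, fromPage) {
  if (/^[a-z][a-z0-9+.-]*:/i.test(href) || href.startsWith("//")) return null; // external
  if (href.startsWith("#")) return null; // same-page anchor
  const url = new URL(href, "https://docs.invalid" + fromPage); // placeholder host
  if (ASSET_EXT.test(url.pathname)) return null;
  return url.pathname.replace(/\.(md|html)$/i, "");
}

// pages: Map of site-rooted source path -> Markdown text (assumed shape).
function validateCrossLinks(pages) {
  const known = new Set(
    [...pages.keys()].map((p) => p.replace(/\.(md|html)$/i, ""))
  );
  const broken = [];
  for (const [page, markdown] of pages) {
    for (const href of extractLinks(markdown)) {
      if (LINK_ALLOWLIST.has(href)) continue;
      const target = resolveLink(href, page);
      if (target !== null && !known.has(target)) broken.push({ page, href });
    }
  }
  return broken;
}
```

Resolving through the WHATWG `URL` constructor handles ../, ./, /-rooted, and bare links uniformly, and stripping the extension lets .md, .html, and extensionless forms of the same target compare equal. A strict mode could then fail the build whenever the returned list is non-empty.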

Context

Part of a two-lever fix for Studio V3 doc-fetch inefficiency (see companion Studio PR). This PR is lever 2: once deployed to CF Pages, the index will serve .md URLs, so the short-circuit fires even for new builder sessions that haven't loaded updated Studio code yet.

Root cause: apps-ai-web-fetch.js:388 returns USE_WEB_SEARCH for all non-Google HTML — a server-side design decision. Every .html doc URL from llms.txt was hitting this wall, costing a wasted fetch + fallback on every single doc lookup.
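
The index-side fix amounts to rewriting each emitted doc URL so it points at the Markdown source. A minimal sketch of that rewrite follows; `toMarkdownUrl` is a hypothetical helper name and docs.example.com is a placeholder host, neither taken from this repository.

```javascript
// Hypothetical helper: point a docs URL at its raw Markdown source.
// Only a trailing .html in the path is rewritten; query and fragment survive.
function toMarkdownUrl(url) {
  const u = new URL(url);
  if (/\.html$/i.test(u.pathname)) {
    u.pathname = u.pathname.replace(/\.html$/i, ".md");
  }
  return u.toString();
}

// Example call with a placeholder URL.
console.log(toMarkdownUrl("https://docs.example.com/API/core/storage.html"));
```

Operating on the parsed pathname rather than the raw string avoids touching a `.html` that happens to appear in a query string or fragment.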

Verification: 157/157 tests pass (npm run test:unit). Regenerated .well-known/* artifacts have 0 .html entries.

Test plan

  • cd docs && npm run test:unit — 157/157 pass
  • node bin/build-agent-indexes.mjs — no errors, .well-known/llms.txt contains only .md URLs
  • node bin/build-agent-indexes.mjs --strict — passes on clean tree
  • CF Pages preview URL: verify /.well-known/llms.txt serves .md entries
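
The "contains only .md URLs" checks above could be automated with something like the sketch below. `findHtmlEntries` is a hypothetical name, and the sample text only approximates the shape of an llms.txt entry; it is not copied from the generated artifact.

```javascript
// Return every URL in an llms.txt-style body whose path ends in .html.
function findHtmlEntries(text) {
  const urls = text.match(/https?:\/\/[^\s)>\]]+/g) ?? [];
  return urls.filter((u) => /\.html(?:[?#]|$)/i.test(u));
}

// Assumed sample content for illustration.
const sample = [
  "- [Storage](https://docs.example.com/API/storage.md): key-value storage",
  "- [Old page](https://docs.example.com/old.html)",
].join("\n");

console.log(findHtmlEntries(sample));
```

Against a preview deployment, the same function could be applied to the body fetched from /.well-known/llms.txt, with an empty result meaning the check passes.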

🤖 Generated with Claude Code

All AI discovery surfaces (llms.txt, llms-full.txt, agent-skills SKILL.md
files, V3 library catalog docUrl) now emit raw .md URLs instead of .html.
This lets Studio's fetch_fliplet_doc short-circuit the doomed .html round-
trip: web_fetch refuses non-Google HTML (USE_WEB_SEARCH), so every .html
doc URL previously wasted a round-trip before falling back to .md anyway.

Also adds validateCrossLinks() to bin/build-agent-indexes.mjs — a
fence-aware build-time link rot detector wired into --strict mode. Resolves
../, ./, /-rooted, bare, and .md/.html/extensionless links; skips external
hosts and asset extensions. Escape hatches: <!-- lint-ignore-link --> on
the offending link, or LINK_ALLOWLIST for known exceptions. 157/157 tests
pass. Author docs updated in CONTRIBUTING.md and CLAUDE.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cloudflare-workers-and-pages

Deploying fliplet-cli with Cloudflare Pages

Latest commit: 3fba541
Status: ✅  Deploy successful!
Preview URL: https://9cf82b88.fliplet-cli.pages.dev
Branch Preview URL: https://fix-ai-docs-md-urls.fliplet-cli.pages.dev

@tonytlwu tonytlwu merged commit 7421d36 into master Jun 7, 2026
3 of 4 checks passed