30 changes: 29 additions & 1 deletion CHANGELOG.md
@@ -7,8 +7,30 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.4.0] - 2026-07-02

### Added

#### Generate Commands

- `generate llms` - Generate a per-project `llms.txt` file for progressive disclosure
  - Enumerates each project's current-version (and non-versioned) pages
  - Extracts the page title (H1) and `meta` `:description:` for each page
  - Resolves the production URL with `.md` appended
  - Writes `<output-dir>/<project>/llms.txt` and prints a per-project
    character-count summary (with and without descriptions), flagging files
    over the 50,000-character `llms.txt` guideline
  - Root landing pages use the `<root>/index.md` markdown form
  - Resolves snooty `{+name+}` substitutions in titles and descriptions from
    the project's `snooty.toml` `[constants]`
  - Excludes `includes/` and `code-examples/` directories and the deprecated
    `app-services` and `realm` projects
  - Flags:
    - `--output-dir` - Directory to write files into (default: `llms-output`)
    - `--for-project` - Limit generation to a single project
    - `--no-descriptions` - Omit `meta` descriptions from the written files
    - `--base-url` - Override the default base URL (default: `https://www.mongodb.com/docs`)

#### Resolve Commands

- `resolve url` - Resolve documentation source files to production URLs
@@ -21,7 +43,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - Flags:
    - `--base-url` - Override the default base URL (default: `https://www.mongodb.com/docs`)

## [0.3.0] - 2025-01-07
#### Internal Packages

- `internal/rst/meta_parser.go` - Extract the `:description:` field from a page's `.. meta::` directive
- `internal/rst/page_title.go` - Extract a page's H1 title (underline-only and overline+underline styles)
- `internal/snooty` - Parse `[constants]` and resolve `{+name+}` substitutions (`ResolveSubstitutions`)

## [0.3.0] - 2026-01-07

### Added

58 changes: 58 additions & 0 deletions README.md
@@ -14,6 +14,7 @@ A Go CLI tool for performing audit-related tasks in the MongoDB documentation mo
- [Count Commands](#count-commands)
- [Report Commands](#report-commands)
- [Resolve Commands](#resolve-commands)
- [Generate Commands](#generate-commands)
- [Development](#development)
- [Project Structure](#project-structure)
- [Adding New Commands](#adding-new-commands)
@@ -1832,6 +1833,63 @@ The command supports all projects defined in the documentation monorepo's table-
- Connectors (Kafka, Spark, BI Connector)
- And many more

### Generate Commands

#### `generate llms`

Generate a per-project `llms.txt` file for every documentation project.

This command supports an [`llms.txt`](https://llmstxt.org/) progressive-disclosure setup: a master `llms.txt` acts like a sitemap that links to each project's own `llms.txt`, which in turn lists that project's pages. For each project, this command enumerates the pages of its **current version** (or, for a non-versioned project, its only set of pages), extracts each page's title and `meta` description, resolves its production URL (with `.md` appended), and writes the project's `llms.txt`.

Each line follows the standard format:

```
- [Page Title](https://www.mongodb.com/docs/manual/core/document.md): Definition, structure, and limitations of documents in MongoDB.
```
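
As a rough illustration of that line format, the sketch below builds one entry and drops the trailing `: description` when a page has no description. The function name `formatEntry` is hypothetical and is not taken from the tool's source; it only mirrors the format shown above.

```go
package main

import "fmt"

// formatEntry builds one llms.txt list item (hypothetical helper, not the
// tool's actual code). When description is empty, the trailing
// ": description" part is omitted.
func formatEntry(title, url, description string) string {
	if description == "" {
		return fmt.Sprintf("- [%s](%s)", title, url)
	}
	return fmt.Sprintf("- [%s](%s): %s", title, url, description)
}

func main() {
	url := "https://www.mongodb.com/docs/manual/core/document.md"
	// Entry with a description.
	fmt.Println(formatEntry("Page Title", url, "Definition, structure, and limitations of documents in MongoDB."))
	// Entry for a page without a meta description.
	fmt.Println(formatEntry("Page Title", url, ""))
}
```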

**Behavior details:**

- **Version scope:** For versioned projects, only the current version is included; older versions and `upcoming` are skipped. Projects that are not versioned are also included.
- **Root landing pages:** A project's root landing page has no `<root>.md` markdown form; its markdown lives at `<root>/index.md`, so that form is emitted. Nested section index pages resolve to the normal `<section>.md` form.
- **Missing descriptions:** Pages without a `meta` `:description:` are emitted without the trailing `: description`.
- **Substitutions:** Snooty constant references (`{+name+}`) in titles and descriptions are resolved from the project's `snooty.toml` `[constants]`.
- **Excluded content:** Partial (`includes/`) and `code-examples/` directories are skipped since they are not standalone pages. The deprecated `app-services` and `realm` projects, along with non-project directories (`404`, `docs-platform`, `meta`, `table-of-contents`), are excluded.
- **Character-count summary:** After writing the files, the command prints a per-project table showing the character count both **with** and **without** descriptions, flagging any file that exceeds the 50,000-character `llms.txt` guideline. This helps decide whether descriptions fit for larger projects.
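
The substitution behavior described above could look roughly like the following sketch. It assumes the `[constants]` table has already been parsed into a `map[string]string`. The lowercase `resolveSubstitutions` here is a hypothetical stand-in, not the `ResolveSubstitutions` function in `internal/snooty`, and leaving unknown references untouched is an assumption rather than confirmed behavior. Constants that reference other constants are not handled.

```go
package main

import (
	"fmt"
	"regexp"
)

// substitutionPattern matches snooty constant references such as {+name+}.
var substitutionPattern = regexp.MustCompile(`\{\+([^+{}]+)\+\}`)

// resolveSubstitutions replaces each {+name+} reference in text with the
// matching value from constants. References with no matching constant are
// left unchanged (an assumption made for this sketch).
func resolveSubstitutions(text string, constants map[string]string) string {
	return substitutionPattern.ReplaceAllStringFunc(text, func(ref string) string {
		name := substitutionPattern.FindStringSubmatch(ref)[1]
		if value, ok := constants[name]; ok {
			return value
		}
		return ref
	})
}

func main() {
	// Hypothetical constants, standing in for a parsed snooty.toml table.
	constants := map[string]string{"service": "Atlas"}
	fmt.Println(resolveSubstitutions("Get Started with {+service+}", constants))
	fmt.Println(resolveSubstitutions("Uses {+unknown+}", constants))
}
```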

**Basic Usage:**

```bash
# Generate llms.txt for all projects (uses the configured monorepo path)
./audit-cli generate llms
# Writes files to ./llms-output/<project>/llms.txt and prints a summary

# Generate for a single project
./audit-cli generate llms --for-project atlas

# Omit descriptions (useful for oversized projects or while iterating on docs)
./audit-cli generate llms --for-project cloud-docs --no-descriptions

# Point at a specific monorepo and output directory
./audit-cli generate llms /path/to/docs-mongodb-internal --output-dir build/llms
```

**Flags:**

- `--output-dir <dir>` - Directory to write per-project `llms.txt` files into (default: `llms-output`)
- `--for-project <name>` - Limit generation to a single project (content directory name)
- `--no-descriptions` - Omit `meta` descriptions from the written files
- `--base-url <url>` - Base URL for production documentation (default: `https://www.mongodb.com/docs`)

**Output layout:**

```
llms-output/
  atlas/llms.txt
  manual/llms.txt
  node/llms.txt
  ...
```

## Development

### Project Structure
30 changes: 30 additions & 0 deletions commands/generate/generate.go
@@ -0,0 +1,30 @@
// Package generate provides the parent command for generating documentation artifacts.
//
// This package serves as the parent command for generation operations.
// Currently supports:
//   - llms: Generate per-project llms.txt files
package generate

import (
	"github.com/grove-platform/audit-cli/commands/generate/llms"
	"github.com/spf13/cobra"
)

// NewGenerateCommand creates the generate parent command.
//
// This command serves as a parent for various generation operations.
// It doesn't perform any operations itself but provides a namespace for subcommands.
func NewGenerateCommand() *cobra.Command {
	cmd := &cobra.Command{
		Use:   "generate",
		Short: "Generate documentation artifacts",
		Long: `Generate artifacts derived from the documentation monorepo.

Currently supports:
- llms: Generate per-project llms.txt files for progressive disclosure`,
	}

	cmd.AddCommand(llms.NewLLMSCommand())

	return cmd
}