feat: add AI discoverability layer (llms.txt, skill.md, MCP server) by avikalpg · Pull Request #17 · WildestAI/DiffGraph-CLI

avikalpg · 2026-06-11T20:49:31Z

What

Adds the full AI discoverability infrastructure for WildestAI, without any third-party SaaS dependency.

Files added

File	Purpose
`llms.txt`	Standard AI discovery file at repo root — compact context for LLMs crawling/cloning the repo
`skill.md`	Agent skill file — tells AI agents exactly how to install and use `wild` CLI
`mcp_server.py`	MCP server exposing `run_wild_diff`, `list_docs`, `get_docs`, `search_docs` tools

MCP server tools

run_wild_diff(repo_path, args) — runs wild diff on any repo and returns structured results
list_docs() — indexes all 9 documentation pages
get_docs(name) — fetches any doc by slug
search_docs(query) — full-text search across all docs

Also exposes a wildestai://llms.txt MCP resource.

Why

Replaces the need for DocsALot ($100/month) — same AI discoverability features built in-house.

Testing

python3 -c "import mcp_server; print('imports OK')"
python3 -c "from mcp_server import list_docs, search_docs; print(list_docs())"

Summary by CodeRabbit

New Features
- Added standardized JSON schema for DiffGraph v2.0 output format with strict validation rules
- Launched MCP server exposing tools for running diff analysis with configurable output, searching documentation, and retrieving docs
Documentation
- Added comprehensive getting started guide covering installation, configuration, and CLI usage examples
- Added LLM context documentation file for AI integration

Adds diffgraph/schema/diffgraph-v2.schema.json, the JSON Schema 2020-12 draft that operationalises the v2 output contract from design/JSON-SCHEMA.md.

Covers: FileEntry, SymbolEntry, RelationshipEntry, SummaryEntry, Evidence, Metadata, Warning, AnalysisSource. Required fields enforced; inferred claims must carry evidence + confidence. privacy_tier is a top-level required metadata field.

Consumers can use this for validation in CI, typed generation, and VS Code schema hints.

This satisfies one of the four schema ratification criteria in JSON-SCHEMA.md (the machine-readable file). Still needs: Avikalp sign-off on sub-questions, one end-to-end worked example validated, PR #11 updated to target this schema.

- llms.txt at repo root: compact context for LLMs crawling the repo
- skill.md: agent skill file so AI agents know how to use wild CLI
- mcp_server.py: MCP server exposing run_wild_diff, list_docs, get_docs, search_docs tools and a wildestai://llms.txt resource

Replaces the need for a paid DocsALot subscription; all AI discoverability infrastructure is built in-house.

coderabbitai · 2026-06-11T20:49:59Z

Walkthrough

This PR establishes the complete foundation for DiffGraph v2.0: a canonical JSON Schema specifying the artifact structure, an MCP server implementing programmatic access to diff analysis and documentation, and user guides for both CLI and agent-based usage.

Changes

DiffGraph v2.0 Infrastructure

Layer / File(s)	Summary
DiffGraph v2.0 Canonical Schema `diffgraph/schema/diffgraph-v2.schema.json`	Defines the JSON Schema for DiffGraph v2.0 artifacts, specifying top-level required fields (schema_version, generated_at, wild_version, diff_ref, files, symbols, relationships, metadata), shared type definitions (AnalysisSource, Evidence), and entity structures (FileEntry, SymbolEntry, RelationshipEntry, SummaryEntry) with conditional validation rules requiring evidence when analysis_source is inferred.
MCP Server Implementation `mcp_server.py`	Implements FastMCP server exposing run_wild_diff tool (validates repo, executes wild diff subprocess with optional output, enforces 120s timeout), list_docs tool (aggregates markdown pages from slug set and docs directory), get_docs tool (resolves document names via slug map or file path), search_docs tool (case-insensitive substring search with line context), and llms_txt resource (embedded or read from website directory).
User and Agent Documentation `llms.txt`, `skill.md`	Describes project purpose (wild diff wraps git for AI-powered semantic analysis), installation/quick start, available MCP tools, CLI command variants, output behavior (diffgraph.html generation), environment configuration (OPENAI_API_KEY), and usage notes for scoping diffs and performance considerations.
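
The search_docs behaviour summarised in the table (case-insensitive substring search with line context) could look roughly like the sketch below. This is not the code from mcp_server.py: the function name `search_markdown`, the `context` parameter, and the result keys are assumptions made for illustration.

```python
from pathlib import Path


def search_markdown(docs_dir: Path, query: str, context: int = 1) -> list[dict]:
    """Case-insensitive substring search over *.md files under docs_dir.

    Hypothetical helper: the real search_docs tool may differ in shape.
    """
    needle = query.lower()
    if not needle:
        return []
    matches = []
    for path in sorted(docs_dir.rglob("*.md")):
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (OSError, UnicodeDecodeError):
            continue  # skip unreadable files instead of failing the whole search
        for i, line in enumerate(lines):
            if needle in line.lower():
                start = max(0, i - context)
                end = min(len(lines), i + context + 1)
                matches.append({
                    "file": path.name,
                    "line": i + 1,  # 1-indexed for display
                    "context": lines[start:end],
                })
    return matches
```

Catching only OSError and UnicodeDecodeError keeps one bad file from aborting the search while still letting unexpected errors surface.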

Sequence Diagram

sequenceDiagram
  participant Client
  participant MCPServer
  participant FileSystem
  participant WildDiff as wild diff<br/>Subprocess
  Client->>MCPServer: run_wild_diff(repo_path, args)
  activate MCPServer
  MCPServer->>FileSystem: validate .git directory exists
  MCPServer->>WildDiff: execute wild diff --no-open
  activate WildDiff
  WildDiff->>FileSystem: generate diffgraph.html or custom output
  WildDiff-->>MCPServer: stdout/stderr, return code
  deactivate WildDiff
  MCPServer-->>Client: success, returncode, output_path
  deactivate MCPServer
  Client->>MCPServer: get_docs(name)
  activate MCPServer
  MCPServer->>FileSystem: resolve via slug map or file path
  MCPServer-->>Client: document content
  deactivate MCPServer
  Client->>MCPServer: search_docs(query)
  activate MCPServer
  MCPServer->>FileSystem: scan markdown files, search content
  MCPServer-->>Client: matches with line context
  deactivate MCPServer
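
The run_wild_diff leg of the diagram (check for .git, run the subprocess with a timeout, report the outcome) can be sketched as follows. The function name `run_diff` and the returned keys are assumptions, not the PR's implementation; the default command and the 120s timeout are taken from the walkthrough above.

```python
import subprocess
from pathlib import Path


def run_diff(repo_path: str, cmd=("wild", "diff", "--no-open"), timeout: int = 120) -> dict:
    """Validate the repo, run the command inside it, and report the outcome."""
    repo = Path(repo_path).expanduser().resolve()
    # .git is a directory in a normal clone and a file in a worktree, so use exists().
    if not (repo / ".git").exists():
        return {"success": False, "error": f"not a git repository: {repo}"}
    try:
        proc = subprocess.run(
            list(cmd), cwd=repo, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        return {"success": False, "error": f"command not found: {cmd[0]}"}
    except subprocess.TimeoutExpired:
        return {"success": False, "error": f"timed out after {timeout}s"}
    return {
        "success": proc.returncode == 0,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Passing the command as a list (never through a shell) means arguments are not re-parsed by a shell, although individual flags still need validating.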

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: adding AI discoverability infrastructure (llms.txt, skill.md, MCP server) that replaces paid SaaS tooling.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (2)

mcp_server.py (2)
131-137: 💤 Low value

External path references may fail in standalone deployments.

Lines 131-132 reference ../../wildestai/docs/DiffGraph-CLI/ which assumes a specific directory structure outside this repository. If the MCP server is deployed standalone or the monorepo structure differs, these docs will silently be unavailable.

Consider making external doc paths configurable via environment variables, or documenting the expected directory layout.
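
One way to make the external docs path configurable, as a minimal sketch. The environment variable name `WILD_EXTERNAL_DOCS_DIR` and the helper `resolve_external_docs` are hypothetical; only the default relative path comes from the comment above.

```python
import logging
import os
from pathlib import Path

logger = logging.getLogger(__name__)


def resolve_external_docs(repo_root: Path, env_var: str = "WILD_EXTERNAL_DOCS_DIR"):
    """Return the external docs directory, or None if it is unavailable.

    WILD_EXTERNAL_DOCS_DIR is a hypothetical name, not an existing setting.
    """
    configured = os.environ.get(env_var)
    if configured:
        candidate = Path(configured).expanduser()
    else:
        # Default mirrors the relative layout mentioned in the review comment.
        candidate = repo_root / "../../wildestai/docs/DiffGraph-CLI"
    candidate = candidate.resolve()
    if candidate.is_dir():
        return candidate
    # Warn instead of failing so a standalone deployment can still start.
    logger.warning("External docs not found at %s; serving bundled docs only", candidate)
    return None
```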
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp_server.py` around lines 131 - 137, The built-in pages list uses hardcoded
external relative paths (built_in entries and the loop that resolves full via
REPO_ROOT and full.is_relative_to) which will break in standalone deployments;
update the code to make these doc paths configurable (e.g., read from
environment variables or a config dict) and fall back to bundled/internal copies
if the external path does not exist, by replacing the hardcoded rel_path values
with configurable keys and checking env/config before resolving via REPO_ROOT;
ensure you update the path-resolution logic around REPO_ROOT, full.exists(), and
full.is_relative_to to prefer configured paths and log a warning if unavailable
so the server can run standalone.
46-50: 💤 Low value

Function signature includes output_file not documented in skill.md.

The documented contract in skill.md (line 69) specifies run_wild_diff(repo_path, args), but the implementation adds an output_file parameter. While the parameter has a default value making it backward compatible, the documentation should be updated to reflect the actual interface.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp_server.py` around lines 46 - 50, The implementation of run_wild_diff now
accepts an extra parameter output_file (see function run_wild_diff(repo_path:
str, args: str = "", output_file: str = "") in mcp_server.py) but skill.md still
documents run_wild_diff(repo_path, args); update skill.md to include the new
optional output_file parameter and its behavior (default value, effect when
provided) to match the actual function signature, or alternatively remove
output_file from the function if the documented contract must be
preserved—ensure the documentation and the run_wild_diff signature are
consistent.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@diffgraph/schema/diffgraph-v2.schema.json`:
- Around line 101-139: Add "additionalProperties": false to the Evidence object
and change the line number minima to match 1-indexing: set "minimum": 1 on the
"line_start" and "line_end" properties in the Evidence schema; update the
Evidence properties block (the "Evidence" object definition) to include
additionalProperties: false and change the "minimum" values for "line_start" and
"line_end" from 0 to 1 so validation is strict and consistent with the
description.
- Around line 341-366: SummaryEntry currently allows omission of the evidence
field even though analysis_source is const "inferred" and the description
requires evidence; update the SummaryEntry schema so "evidence" is included in
the required array (alongside "text" and "analysis_source") and ensure the
existing "evidence" property continues to reference "#/$defs/Evidence" so
inferred summaries must supply evidence entries (including at least one
llm_inference and one structural_basis per the Evidence definition).

In `@llms.txt`:
- Around line 35-37: Replace the external links in llms.txt: keep the GitHub
entry ("https://github.com/WildestAI/DiffGraph-CLI") as-is, change the Website
entry from "https://wildest.ai" to the official "https://wildestai.com", and
update the "Full context" URL from "https://wildest.ai/llms-full.txt" to the
repository's full-context resource (use the repo raw file URL, e.g.
"https://raw.githubusercontent.com/WildestAI/DiffGraph-CLI/main/llms-full.txt")
so all three links point to the correct official resources.

In `@mcp_server.py`:
- Around line 204-208: Replace the broad except Exception in get_docs and
search_docs with targeted exception handlers: catch OSError for filesystem/read
errors and UnicodeDecodeError for encoding issues when calling
target.read_text(), and use the module logger (or processLogger) to log the full
exception details (including stack/str) before returning the error payload; keep
the returned structure the same but populate "error" with the logged exception
message for easier debugging.
- Around line 76-80: The code appends unsanitized args to cmd (variable cmd)
allowing CLI injection; implement a sanitize_args(args: str) helper that
enforces an allowlist (e.g., ALLOWED_FLAGS and ALLOWED_PREFIXES) and rejects
unknown flags, then replace the direct args.split() usage with cmd +=
sanitize_args(args). Ensure sanitize_args disallows bare "--output" (or only
permits "--output=" form if desired) so the explicit output_file handling cannot
be overridden, and raise/return an error for disallowed parts instead of
appending them.
- Around line 185-194: The code builds filesystem paths from the user-controlled
variable name (used in candidate/candidate2) without ensuring the resolved path
remains inside allowed roots (REPO_ROOT or DOCS_DIR), enabling path traversal;
fix by resolving the candidate paths and explicitly checking that each resolved
path is a descendant of its allowed base before assigning to target — e.g.,
after computing candidate = (REPO_ROOT / name).resolve() verify
candidate.is_relative_to(REPO_ROOT.resolve()) (or fall back to comparing string
prefixes of resolved paths) and only accept it if true, and do the same for
candidate2 against DOCS_DIR; if neither check passes, reject the request or
return an error.
- Around line 77-78: The code appends user-supplied output_file directly to cmd,
allowing path traversal; validate and constrain output_file to a designated
output directory (or the repository root) before adding "--output". Implement:
define a base output dir (e.g., OUTPUT_DIR or use the existing repo path
variable), join output_file with that base using os.path.join, resolve with
os.path.abspath/os.path.realpath, and verify with os.path.commonpath that the
resolved path is inside the base; if not, raise/return an error. Ensure the code
in the block that builds cmd (where output_file is checked) replaces the raw
value with the sanitized/resolved path and creates parent directories
(os.makedirs(..., exist_ok=True)) before appending to cmd.

In `@skill.md`:
- Around line 74-86: skill.md currently states "Works with Python 3.8+" but
setup.py's python_requires=">=3.7"; update the documentation to match the
package metadata by changing the text in skill.md from "Works with Python 3.8+"
to "Works with Python 3.7+" (or alternatively, if you intend to require 3.8+,
change setup.py's python_requires to ">=3.8"); locate the string in skill.md and
the python_requires in setup.py to keep both consistent.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c0f994be-218f-4c85-a648-569d1b929022

📥 Commits

Reviewing files that changed from the base of the PR and between a43abac and b703935.

📒 Files selected for processing (4)

diffgraph/schema/diffgraph-v2.schema.json
llms.txt
mcp_server.py
skill.md

coderabbitai · 2026-06-11T21:04:14Z

+    "Evidence": {
+      "type": "object",
+      "description": "Pointer to what produced a claim. kind determines which fields are present.",
+      "required": ["kind"],
+      "properties": {
+        "kind": {
+          "type": "string",
+          "enum": [
+            "git_diff_stat",
+            "git_diff_name_status",
+            "path_pattern",
+            "ast_parse",
+            "import_statement",
+            "call_site",
+            "llm_inference",
+            "structural_basis"
+          ]
+        },
+        "file": { "type": "string", "description": "Relevant for ast_parse, import_statement, call_site." },
+        "line_start": { "type": "integer", "minimum": 0, "description": "1-indexed line number." },
+        "line_end": { "type": "integer", "minimum": 0 },
+        "snippet": { "type": "string", "description": "Short source excerpt (signature line or import statement)." },
+        "pattern": { "type": "string", "description": "Glob/regex pattern (kind=path_pattern)." },
+        "detail": { "type": "string", "description": "Free-text detail (kind=git_diff_stat/name_status)." },
+        "model": { "type": "string", "description": "LLM model id (kind=llm_inference)." },
+        "prompt_ref": { "type": "string", "description": "Internal prompt template reference (kind=llm_inference)." },
+        "temperature": { "type": "number", "minimum": 0, "maximum": 2, "description": "(kind=llm_inference)." },
+        "symbol_ids": {
+          "type": "array",
+          "items": { "type": "string" },
+          "description": "Symbol IDs that grounded this inferred claim (kind=structural_basis)."
+        },
+        "file_ids": {
+          "type": "array",
+          "items": { "type": "string" },
+          "description": "File IDs that grounded this inferred claim (kind=structural_basis)."
+        }
+      }
+    },


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Evidence lacks additionalProperties: false and has inconsistent line number constraints.

Two issues in the Evidence definition:

Unlike all other type definitions in this schema, Evidence does not specify additionalProperties: false. This breaks the strict validation pattern and allows arbitrary extra fields.

line_start and line_end have minimum: 0, but the description states "1-indexed line number". For 1-indexed values, minimum should be 1.
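
To illustrate what the two points change in practice, here is a small hand-rolled check that mimics only those rules (unknown keys rejected, line numbers at least 1, `kind` required). It is a sketch, not a JSON Schema validator; `evidence_problems` is a hypothetical helper and the key set is copied from the Evidence definition quoted above.

```python
EVIDENCE_KEYS = {
    "kind", "file", "line_start", "line_end", "snippet", "pattern",
    "detail", "model", "prompt_ref", "temperature", "symbol_ids", "file_ids",
}


def evidence_problems(evidence: dict) -> list[str]:
    """List the ways an Evidence object would fail the tightened rules."""
    problems = []
    if "kind" not in evidence:
        problems.append("missing required field: kind")
    # Equivalent of "additionalProperties": false
    for key in sorted(set(evidence) - EVIDENCE_KEYS):
        problems.append(f"unexpected field: {key}")
    # Equivalent of "minimum": 1 on the 1-indexed line fields
    for key in ("line_start", "line_end"):
        if key in evidence:
            value = evidence[key]
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                problems.append(f"{key} must be an integer >= 1")
    return problems
```

With `minimum: 0`, a value such as `"line_start": 0` passes validation even though no 1-indexed line 0 exists.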

Proposed fix

 "Evidence": {
   "type": "object",
   "description": "Pointer to what produced a claim. kind determines which fields are present.",
   "required": ["kind"],
+  "additionalProperties": false,
   "properties": {
     "kind": { "type": "string", ... },
     "file": { "type": "string", "description": "Relevant for ast_parse, import_statement, call_site." },
-    "line_start": { "type": "integer", "minimum": 0, "description": "1-indexed line number." },
-    "line_end": { "type": "integer", "minimum": 0 },
+    "line_start": { "type": "integer", "minimum": 1, "description": "1-indexed line number." },
+    "line_end": { "type": "integer", "minimum": 1 },

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"Evidence": {
  "type": "object",
  "description": "Pointer to what produced a claim. kind determines which fields are present.",
  "required": ["kind"],
  "properties": {
    "kind": {
      "type": "string",
      "enum": [
        "git_diff_stat",
        "git_diff_name_status",
        "path_pattern",
        "ast_parse",
        "import_statement",
        "call_site",
        "llm_inference",
        "structural_basis"
      ]
    },
    "file": { "type": "string", "description": "Relevant for ast_parse, import_statement, call_site." },
    "line_start": { "type": "integer", "minimum": 0, "description": "1-indexed line number." },
    "line_end": { "type": "integer", "minimum": 0 },
    "snippet": { "type": "string", "description": "Short source excerpt (signature line or import statement)." },
    "pattern": { "type": "string", "description": "Glob/regex pattern (kind=path_pattern)." },
    "detail": { "type": "string", "description": "Free-text detail (kind=git_diff_stat/name_status)." },
    "model": { "type": "string", "description": "LLM model id (kind=llm_inference)." },
    "prompt_ref": { "type": "string", "description": "Internal prompt template reference (kind=llm_inference)." },
    "temperature": { "type": "number", "minimum": 0, "maximum": 2, "description": "(kind=llm_inference)." },
    "symbol_ids": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Symbol IDs that grounded this inferred claim (kind=structural_basis)."
    },
    "file_ids": {
      "type": "array",
      "items": { "type": "string" },
      "description": "File IDs that grounded this inferred claim (kind=structural_basis)."
    }
  }
},

"Evidence": {
  "type": "object",
  "description": "Pointer to what produced a claim. kind determines which fields are present.",
  "required": ["kind"],
  "additionalProperties": false,
  "properties": {
    "kind": {
      "type": "string",
      "enum": [
        "git_diff_stat",
        "git_diff_name_status",
        "path_pattern",
        "ast_parse",
        "import_statement",
        "call_site",
        "llm_inference",
        "structural_basis"
      ]
    },
    "file": { "type": "string", "description": "Relevant for ast_parse, import_statement, call_site." },
    "line_start": { "type": "integer", "minimum": 1, "description": "1-indexed line number." },
    "line_end": { "type": "integer", "minimum": 1 },
    "snippet": { "type": "string", "description": "Short source excerpt (signature line or import statement)." },
    "pattern": { "type": "string", "description": "Glob/regex pattern (kind=path_pattern)." },
    "detail": { "type": "string", "description": "Free-text detail (kind=git_diff_stat/name_status)." },
    "model": { "type": "string", "description": "LLM model id (kind=llm_inference)." },
    "prompt_ref": { "type": "string", "description": "Internal prompt template reference (kind=llm_inference)." },
    "temperature": { "type": "number", "minimum": 0, "maximum": 2, "description": "(kind=llm_inference)." },
    "symbol_ids": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Symbol IDs that grounded this inferred claim (kind=structural_basis)."
    },
    "file_ids": {
      "type": "array",
      "items": { "type": "string" },
      "description": "File IDs that grounded this inferred claim (kind=structural_basis)."
    }
  }
},

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@diffgraph/schema/diffgraph-v2.schema.json` around lines 101 - 139, Add "additionalProperties": false to the Evidence object and change the line number minima to match 1-indexing: set "minimum": 1 on the "line_start" and "line_end" properties in the Evidence schema; update the Evidence properties block (the "Evidence" object definition) to include additionalProperties: false and change the "minimum" values for "line_start" and "line_end" from 0 to 1 so validation is strict and consistent with the description.

coderabbitai · 2026-06-11T21:04:15Z

+    "SummaryEntry": {
+      "type": "object",
+      "required": ["text", "analysis_source"],
+      "additionalProperties": false,
+      "properties": {
+        "text": {
+          "type": "string",
+          "description": "Human-readable summary of the change."
+        },
+        "analysis_source": {
+          "type": "string",
+          "const": "inferred",
+          "description": "Summaries are always inferred (require LLM interpretation)."
+        },
+        "confidence": {
+          "type": ["number", "null"],
+          "minimum": 0,
+          "maximum": 1
+        },
+        "evidence": {
+          "type": "array",
+          "items": { "$ref": "#/$defs/Evidence" },
+          "description": "Must include at least one llm_inference entry and one structural_basis entry."
+        }
+      }
+    },


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

SummaryEntry does not enforce required evidence despite the description.

SummaryEntry has analysis_source as a const "inferred", meaning it's always inferred. Per the pattern established by SymbolEntry and RelationshipEntry, inferred claims must have evidence. The description on line 363 states evidence "Must include at least one llm_inference entry", but evidence is not in the required array, so the schema allows SummaryEntry without any evidence.
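
Note that requiring the evidence field only guarantees it is present; the rule in the description (at least one llm_inference entry and one structural_basis entry) is a separate constraint. In JSON Schema 2020-12 it could be expressed with an `allOf` of two `contains` clauses. The same rule as a plain check, in a minimal sketch where `missing_evidence_kinds` is a hypothetical helper:

```python
REQUIRED_KINDS = {"llm_inference", "structural_basis"}


def missing_evidence_kinds(summary: dict) -> set[str]:
    """Return the evidence kinds a SummaryEntry still lacks (empty set means OK)."""
    evidence = summary.get("evidence") or []
    # Collect the kind of every well-formed evidence item.
    present = {item.get("kind") for item in evidence if isinstance(item, dict)}
    return REQUIRED_KINDS - present
```

A summary carrying only an llm_inference entry would pass a check for a non-empty evidence array but still be reported here as missing structural_basis.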

Proposed fix

 "SummaryEntry": {
   "type": "object",
-  "required": ["text", "analysis_source"],
+  "required": ["text", "analysis_source", "evidence"],
   "additionalProperties": false,
   "properties": {
     ...
     "evidence": {
       "type": "array",
       "items": { "$ref": "#/$defs/Evidence" },
-      "description": "Must include at least one llm_inference entry and one structural_basis entry."
+      "description": "Must include at least one llm_inference entry and one structural_basis entry.",
+      "minItems": 1
     }
   }
 },

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"SummaryEntry": {
  "type": "object",
  "required": ["text", "analysis_source"],
  "additionalProperties": false,
  "properties": {
    "text": {
      "type": "string",
      "description": "Human-readable summary of the change."
    },
    "analysis_source": {
      "type": "string",
      "const": "inferred",
      "description": "Summaries are always inferred (require LLM interpretation)."
    },
    "confidence": {
      "type": ["number", "null"],
      "minimum": 0,
      "maximum": 1
    },
    "evidence": {
      "type": "array",
      "items": { "$ref": "#/$defs/Evidence" },
      "description": "Must include at least one llm_inference entry and one structural_basis entry."
    }
  }
},

"SummaryEntry": {
  "type": "object",
  "required": ["text", "analysis_source", "evidence"],
  "additionalProperties": false,
  "properties": {
    "text": {
      "type": "string",
      "description": "Human-readable summary of the change."
    },
    "analysis_source": {
      "type": "string",
      "const": "inferred",
      "description": "Summaries are always inferred (require LLM interpretation)."
    },
    "confidence": {
      "type": ["number", "null"],
      "minimum": 0,
      "maximum": 1
    },
    "evidence": {
      "type": "array",
      "items": { "$ref": "#/$defs/Evidence" },
      "description": "Must include at least one llm_inference entry and one structural_basis entry.",
      "minItems": 1
    }
  }
},

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@diffgraph/schema/diffgraph-v2.schema.json` around lines 341 - 366, SummaryEntry currently allows omission of the evidence field even though analysis_source is const "inferred" and the description requires evidence; update the SummaryEntry schema so "evidence" is included in the required array (alongside "text" and "analysis_source") and ensure the existing "evidence" property continues to reference "#/$defs/Evidence" so inferred summaries must supply evidence entries (including at least one llm_inference and one structural_basis per the Evidence definition).

coderabbitai · 2026-06-11T21:04:15Z

+- GitHub: https://github.com/WildestAI/DiffGraph-CLI
+- Website: https://wildest.ai
+- Full context: https://wildest.ai/llms-full.txt


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Is the GitHub repository accessible at https://github.com/WildestAI/DiffGraph-CLI? Is the website accessible at https://wildest.ai? Does the endpoint https://wildest.ai/llms-full.txt exist and serve the full context for LLMs?

💡 Result:

Yes, the GitHub repository at https://github.com/WildestAI/DiffGraph-CLI is accessible [1][2][3]. Regarding the website https://wildest.ai, it is important to distinguish between potential naming collisions. The domain https://wildest.ai is not the official website for the entity "Wildest AI," which hosts the DiffGraph-CLI repository; the organization Wildest AI lists its website as https://wildestai.com [1]. The domain https://wilds.ai (plural) is a separate platform for AI games and characters [4][5]. The endpoint https://wildest.ai/llms-full.txt does not appear to exist or serve documentation for the Wildest AI organization. A file is accessible at https://wilds.ai/llms-full.txt, which serves content related to the "wilds.ai" platform [6], but this is a distinct entity from the Wildest AI organization associated with DiffGraph-CLI [1].

Citations:

1: https://github.com/WildestAI

2: https://github.com/WildestAI/DiffGraph-CLI/blob/main/.env.example

3: https://github.com/WildestAI/DiffGraph-CLI/issues

4: https://wilds.ai/

5: https://wilds.ai/about

6: https://wilds.ai/llms-full.txt

Fix external link URLs in llms.txt (lines 35–37).

GitHub link (https://github.com/WildestAI/DiffGraph-CLI) is reachable.

Website link should use the official Wildest AI domain: https://wildestai.com (the https://wildest.ai domain appears to be different/not official).

“Full context” link currently targets https://wildest.ai/llms-full.txt, which doesn’t appear to exist for the Wildest AI project—update it to the correct full-context URL for this repo.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@llms.txt` around lines 35 - 37, Replace the external links in llms.txt: keep the GitHub entry ("https://github.com/WildestAI/DiffGraph-CLI") as-is, change the Website entry from "https://wildest.ai" to the official "https://wildestai.com", and update the "Full context" URL from "https://wildest.ai/llms-full.txt" to the repository's full-context resource (use the repo raw file URL, e.g. "https://raw.githubusercontent.com/WildestAI/DiffGraph-CLI/main/llms-full.txt") so all three links point to the correct official resources.

coderabbitai · 2026-06-11T21:04:15Z

+    cmd = ["wild", "diff", "--no-open"]
+    if output_file:
+        cmd += ["--output", output_file]
+    if args:
+        cmd += args.split()


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Command argument injection via unsanitized args parameter.

The args parameter is split and appended directly to the subprocess command without validation. An attacker-controlled agent input could inject arbitrary CLI flags:

args="--output /etc/cron.d/malicious" could overwrite sensitive files (bypassing the output_file parameter entirely)

args="../../sensitive/repo" could manipulate path-based arguments

Consider implementing an allowlist of permitted argument patterns or parsing args to extract only known safe flags.

Proposed mitigation approach

# Define allowed arguments
ALLOWED_FLAGS = {"--staged", "--no-open", "--json"}
ALLOWED_PREFIXES = {"--output=", "--format="}

def sanitize_args(args: str) -> list[str]:
    """Parse and validate args, rejecting unknown flags."""
    if not args:
        return []
    parts = args.split()
    sanitized = []
    for part in parts:
        if part in ALLOWED_FLAGS:
            sanitized.append(part)
        elif any(part.startswith(p) for p in ALLOWED_PREFIXES):
            sanitized.append(part)
        elif not part.startswith("-"):
            # Likely a commit ref or path - validate further
            sanitized.append(part)
        else:
            raise ValueError(f"Disallowed argument: {part}")
    return sanitized
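
A stricter variant, as a sketch only: the allowlist above still admits the `--output=` prefix, so a caller could set the output path through args. The version below tokenises with `shlex.split` (which respects quoting, unlike `str.split`) and rejects every option not explicitly listed, including both `--output` forms. The flag names are taken from the snippet above and have not been checked against the wild CLI; treat them as placeholders.

```python
import shlex

# Hypothetical allowlist: verify against the flags the wild CLI really supports.
ALLOWED_FLAGS = {"--staged", "--json"}


def sanitize_args(args: str) -> list[str]:
    """Tokenise args and reject every option that is not explicitly allowed."""
    tokens = shlex.split(args)  # raises ValueError on unbalanced quotes
    for token in tokens:
        if token.startswith("-") and token not in ALLOWED_FLAGS:
            raise ValueError(f"Disallowed argument: {token}")
    return tokens
```

Positional tokens such as commit refs or paths still pass through unchanged here and would need their own validation.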

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@mcp_server.py` around lines 76 - 80, The code appends unsanitized args to cmd (variable cmd) allowing CLI injection; implement a sanitize_args(args: str) helper that enforces an allowlist (e.g., ALLOWED_FLAGS and ALLOWED_PREFIXES) and rejects unknown flags, then replace the direct args.split() usage with cmd += sanitize_args(args). Ensure sanitize_args disallows bare "--output" (or only permits "--output=" form if desired) so the explicit output_file handling cannot be overridden, and raise/return an error for disallowed parts instead of appending them.

Source: Linters/SAST tools

coderabbitai · 2026-06-11T21:04:15Z

+    if output_file:
+        cmd += ["--output", output_file]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

output_file allows arbitrary filesystem write (path traversal).

The output_file parameter is passed directly to --output without path validation. A malicious input like output_file="/etc/cron.d/backdoor" could write to sensitive system locations.

Constrain output_file to be within the target repository or a designated output directory.

Proposed fix

+def _validate_output_path(output_file: str, repo: Path) -> Path:
+    """Ensure output_file is within repo or use default."""
+    if not output_file:
+        return repo / "diffgraph.html"
+    out_path = Path(output_file).expanduser().resolve()
+    # Must be within the repo directory
+    if not out_path.is_relative_to(repo):
+        raise ValueError(f"output_file must be within repo: {repo}")
+    return out_path

 @mcp.tool()
 def run_wild_diff(...) -> dict:
     ...
+    try:
+        validated_output = _validate_output_path(output_file, repo)
+    except ValueError as e:
+        return {"success": False, "error": str(e)}
+
     cmd = ["wild", "diff", "--no-open"]
-    if output_file:
-        cmd += ["--output", output_file]
+    cmd += ["--output", str(validated_output)]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@mcp_server.py` around lines 77 - 78, The code appends user-supplied output_file directly to cmd, allowing path traversal; validate and constrain output_file to a designated output directory (or the repository root) before adding "--output". Implement: define a base output dir (e.g., OUTPUT_DIR or use the existing repo path variable), join output_file with that base using os.path.join, resolve with os.path.abspath/os.path.realpath, and verify with os.path.commonpath that the resolved path is inside the base; if not, raise/return an error. Ensure the code in the block that builds cmd (where output_file is checked) replaces the raw value with the sanitized/resolved path and creates parent directories (os.makedirs(..., exist_ok=True)) before appending to cmd.
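A minimal sketch of the `os.path`-based check described in the prompt, for comparison with the `pathlib` version in the proposed fix. One reason to prefer it: `Path.is_relative_to` only exists on Python 3.9+, whereas `os.path.commonpath` is available on the 3.7 floor that `setup.py` declares. The function name `resolve_output_path` and the `base_dir` parameter are hypothetical.

```python
import os


def resolve_output_path(output_file: str, base_dir: str) -> str:
    """Resolve output_file under base_dir, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    # An absolute output_file replaces base in the join, so it is caught
    # by the containment check below rather than silently accepted.
    candidate = os.path.realpath(os.path.join(base, output_file))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"output_file must be within {base}")
    # Create parent directories only after the path has been validated.
    os.makedirs(os.path.dirname(candidate), exist_ok=True)
    return candidate
```

Because `realpath` follows symlinks before the comparison, a symlink inside the base directory that points elsewhere is also rejected.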

coderabbitai · 2026-06-11T21:04:15Z

+    else:
+        # Try as relative path from repo root
+        candidate = (REPO_ROOT / name).resolve()
+        if candidate.exists():
+            target = candidate
+        else:
+            # Try docs subdir
+            candidate2 = (DOCS_DIR / name).resolve()
+            if candidate2.exists():
+                target = candidate2


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Path traversal vulnerability allows reading arbitrary files.

The name parameter is user-controlled and used directly to construct file paths without validating the resolved path stays within allowed directories. An input like name="../../../../etc/passwd" would resolve to /etc/passwd and be read.

Proposed fix

+def _is_safe_path(path: Path, allowed_roots: list[Path]) -> bool:
+    """Check if resolved path is within any allowed root."""
+    resolved = path.resolve()
+    return any(resolved.is_relative_to(root.resolve()) for root in allowed_roots)

 @mcp.tool()
 def get_docs(name: str) -> dict:
     ...
     if name in slug_map:
         target = slug_map[name]
     else:
         # Try as relative path from repo root
         candidate = (REPO_ROOT / name).resolve()
-        if candidate.exists():
+        if candidate.exists() and _is_safe_path(candidate, [REPO_ROOT]):
             target = candidate
         else:
             # Try docs subdir
             candidate2 = (DOCS_DIR / name).resolve()
-            if candidate2.exists():
+            if candidate2.exists() and _is_safe_path(candidate2, [DOCS_DIR]):
                 target = candidate2

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    else:
-        # Try as relative path from repo root
-        candidate = (REPO_ROOT / name).resolve()
-        if candidate.exists():
-            target = candidate
-        else:
-            # Try docs subdir
-            candidate2 = (DOCS_DIR / name).resolve()
-            if candidate2.exists():
-                target = candidate2
+    else:
+        # Try as relative path from repo root
+        candidate = (REPO_ROOT / name).resolve()
+        if candidate.exists() and _is_safe_path(candidate, [REPO_ROOT]):
+            target = candidate
+        else:
+            # Try docs subdir
+            candidate2 = (DOCS_DIR / name).resolve()
+            if candidate2.exists() and _is_safe_path(candidate2, [DOCS_DIR]):
+                target = candidate2

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@mcp_server.py` around lines 185 - 194, The code builds filesystem paths from the user-controlled variable name (used in candidate/candidate2) without ensuring the resolved path remains inside allowed roots (REPO_ROOT or DOCS_DIR), enabling path traversal; fix by resolving the candidate paths and explicitly checking that each resolved path is a descendant of its allowed base before assigning to target — e.g., after computing candidate = (REPO_ROOT / name).resolve() verify candidate.is_relative_to(REPO_ROOT.resolve()) (or fall back to comparing string prefixes of resolved paths) and only accept it if true, and do the same for candidate2 against DOCS_DIR; if neither check passes, reject the request or return an error.
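If the server has to run on interpreters older than 3.9, where `Path.is_relative_to` is unavailable, the containment check can be written with `Path.relative_to`, which raises `ValueError` when the path is not under the root. This is a sketch of that fallback rather than the string-prefix comparison the prompt mentions, since prefix matching would wrongly accept a sibling such as `/repo-other` for root `/repo`. The helper name `is_within` is hypothetical.

```python
from pathlib import Path


def is_within(path: Path, root: Path) -> bool:
    """Return True if path, once resolved, lies inside root."""
    try:
        # relative_to raises ValueError when path is not under root.
        path.resolve().relative_to(root.resolve())
    except ValueError:
        return False
    return True
```

In `get_docs`, each candidate would be accepted only when `is_within(candidate, REPO_ROOT)` (or `DOCS_DIR` for the second lookup) holds; otherwise the tool returns its usual not-found payload.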

coderabbitai · 2026-06-11T21:04:15Z

+    try:
+        content = target.read_text(encoding="utf-8")
+        return {"found": True, "name": name, "content": content, "error": ""}
+    except Exception as e:
+        return {"found": False, "name": name, "content": "", "error": str(e)}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Blind except Exception pattern in both get_docs and search_docs.

Both functions (get_docs at line 207, search_docs at line 250) catch a blanket except Exception, which masks the specific cause of a failure and makes debugging harder. Replace it with specific exception types (OSError, UnicodeDecodeError) and add logging.

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 207-207: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@mcp_server.py` around lines 204 - 208, Replace the broad except Exception in get_docs and search_docs with targeted exception handlers: catch OSError for filesystem/read errors and UnicodeDecodeError for encoding issues when calling target.read_text(), and use the module logger (or processLogger) to log the full exception details (including stack/str) before returning the error payload; keep the returned structure the same but populate "error" with the logged exception message for easier debugging.
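A sketch of the narrower handler, assuming a module-level logger created with the standard `logging` module; the helper name `read_doc` and the logger name are hypothetical, and the return payload keeps the shape shown in the diff above.

```python
import logging
from pathlib import Path

logger = logging.getLogger("mcp_server")  # assumed logger name


def read_doc(target: Path, name: str) -> dict:
    """Read a doc file, reporting only filesystem and decoding failures."""
    try:
        content = target.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as e:
        # logger.exception records the traceback alongside the message.
        logger.exception("Failed to read doc %r", name)
        return {"found": False, "name": name, "content": "", "error": str(e)}
    return {"found": True, "name": name, "content": content, "error": ""}
```

Any other exception type now propagates instead of being folded into the error payload, which is the behaviour the Ruff rule BLE001 is pushing toward.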

Source: Linters/SAST tools

coderabbitai · 2026-06-11T21:04:15Z

+## Configuration
+
+Environment variables:
+- `OPENAI_API_KEY` — required for AI analysis
+- Copy `.env.example` to `.env` to set locally
+
+## Notes
+
+- Works with Python 3.8+
+- Tested on macOS and Linux
+- The CLI wraps `git` — it must be run inside a git repository
+- Large diffs may be slow; consider scoping with file paths or commit ranges
+- The `.env` file is git-ignored — never commit API keys


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for .env.example file
if [ -f ".env.example" ]; then
  echo ".env.example found ✓"
else
  echo ".env.example NOT found ✗"
fi

# Check .gitignore for .env entry
if grep -q "^\.env$" .gitignore 2>/dev/null; then
  echo ".env in .gitignore ✓"
else
  echo ".env NOT in .gitignore ✗"
fi

# Look for Python version requirement
fd -t f '(pyproject.toml|setup.py|setup.cfg|.python-version)' && \
  echo "Checking Python version requirement:" && \
  rg -A2 "requires-python|python_requires|target.version" --type toml --type python || echo "No explicit Python version found"

Repository: WildestAI/DiffGraph-CLI

Length of output: 237

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== skill.md (lines 70-95) =="
if [ -f "skill.md" ]; then
  sed -n '70,95p' skill.md | nl -ba
else
  echo "skill.md not found"
fi

echo
echo "== Python version declarations (repo-wide) =="
rg -n --hidden --no-ignore-vcs -S \
  'requires-python|python_requires|target\.version|\.python-version|setup-python|actions/setup-python|Programming Language :: Python ::' .

echo
echo "== setup.py python_requires excerpt =="
if [ -f "setup.py" ]; then
  rg -n 'python_requires' setup.py || true
  rg -n --context 3 'classifiers\s*=|Programming Language :: Python ::' setup.py || true
else
  echo "setup.py not found"
fi

echo
echo "== pyproject.toml requires-python excerpt =="
if [ -f "pyproject.toml" ]; then
  rg -n 'requires-python' pyproject.toml || true
  rg -n --context 3 'tool\.poetry|project\.requires-python|project\s*\{' pyproject.toml || true
else
  echo "pyproject.toml not found"
fi

echo
echo "== .python-version file =="
if [ -f ".python-version" ]; then
  echo ".python-version exists: $(cat .python-version)"
else
  echo ".python-version not found"
fi

echo
echo "== CI (workflows) setup-python versions =="
if [ -d ".github/workflows" ]; then
  rg -n --hidden --no-ignore-vcs -S 'setup-python|python-version' .github/workflows || true
else
  echo ".github/workflows not found"
fi

Repository: WildestAI/DiffGraph-CLI

Length of output: 157

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== skill.md (lines 70-95) =="
if [ -f "skill.md" ]; then
  awk 'NR>=70 && NR<=95 {printf "%d:%s\n", NR, $0}' skill.md
else
  echo "skill.md not found"
fi

echo
echo "== .gitignore (.env entry) =="
if [ -f ".gitignore" ]; then
  rg -n '^\s*\.env\s*$' .gitignore || true
  echo "-- matches for '.env' in .gitignore:"
  rg -n '\.env' .gitignore || true
else
  echo ".gitignore not found"
fi

echo
echo "== Check for python minimum version declarations =="
for f in pyproject.toml setup.py setup.cfg .python-version .github/workflows/*.yml .github/workflows/*.yaml; do
  if [ -e "$f" ]; then
    echo "-- $f"
    rg -n 'requires-python|python_requires|\.python-version|setup-python|python-version|Programming Language :: Python ::' "$f" || true
  fi
done

echo
echo "== List likely python config files found =="
fd -t f -d 4 -g 'pyproject.toml' -g 'setup.py' -g 'setup.cfg' -g 'requirements*.txt' -g '.python-version' . 2>/dev/null || true
fd -t f -d 5 -g '*.yml' -g '*.yaml' .github/workflows 2>/dev/null || true

Repository: WildestAI/DiffGraph-CLI

Length of output: 1196

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Repo root python config presence =="
ls -la | sed -n '1,200p' | rg -n 'pyproject\.toml|setup\.py|setup\.cfg|\.python-version|requirements.*\.txt|Pipfile|poetry\.lock' || true

echo
echo "== Find pyproject/setup.cfg/.python-version =="
fd -t f -d 4 -g 'pyproject.toml' -g 'setup.cfg' -g '.python-version' . 2>/dev/null || true

echo
echo "== Find CI python versions (workflows) =="
fd -t f -d 6 -g '*.yml' -g '*.yaml' .github/workflows 2>/dev/null || true
rg -n 'setup-python|python-version|actions/setup-python' .github/workflows 2>/dev/null || true

echo
echo "== Repo-wide python version hints =="
rg -n 'requires-python|python_requires|\.python-version|setup-python|python-version|Programming Language :: Python ::' . || true

Repository: WildestAI/DiffGraph-CLI

Length of output: 724

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== .env.example existence =="
if [ -f ".env.example" ]; then
  echo ".env.example found"
  sed -n '1,120p' .env.example | awk '{print NR":"$0}'
else
  echo ".env.example NOT found"
fi

Repository: WildestAI/DiffGraph-CLI

Length of output: 261

Fix Python version requirement mismatch in skill.md

.env.example exists at the repo root, and .gitignore ignores .env (while keeping !.env.example), so that guidance is correct.

skill.md says “Works with Python 3.8+” (line 82), but setup.py declares python_requires=">=3.7" (line 15); align these by updating skill.md to 3.7+ or tightening setup.py to 3.8+.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skill.md` around lines 74 - 86, skill.md currently states "Works with Python 3.8+" but setup.py's python_requires=">=3.7"; update the documentation to match the package metadata by changing the text in skill.md from "Works with Python 3.8+" to "Works with Python 3.7+" (or alternatively, if you intend to require 3.8+, change setup.py's python_requires to ">=3.8"); locate the string in skill.md and the python_requires in setup.py to keep both consistent.

Nia (Avikalp's assistant) added 2 commits June 8, 2026 22:03

coderabbitai Bot requested changes Jun 11, 2026

avikalpg wants to merge 2 commits into main from nia/ai-discoverability
