diff --git a/README.md b/README.md index 76d5708a..1ebb5f3b 100644 --- a/README.md +++ b/README.md @@ -2,12 +2,12 @@ > **Before an AI writes a new class/id/function, CodeLens must be checked. This is not optional.** -CodeLens is an AI-native code intelligence platform that gives AI agents **full visibility** into a codebase before they write any code. It prevents collision, overwrite of existing logic, security vulnerabilities, and dead code through 78 CLI commands, an MCP server with 76 tools (56 static + 20 dynamic), AST-based taint analysis, live CVE/OSV scanning, a plugin system with OWASP Top 10 + Compliance rule packs, a true graph data model (nodes + edges) for structural code queries, and token-efficient `--format compact` output for high-volume agent workflows (issue #17). +CodeLens is an AI-native code intelligence platform that gives AI agents **full visibility** into a codebase before they write any code. It prevents collision, overwrite of existing logic, security vulnerabilities, and dead code through 12 CLI commands, an MCP server with 12 tools (56 static + -44 dynamic), AST-based taint analysis, live CVE/OSV scanning, a plugin system with OWASP Top 10 + Compliance rule packs, a true graph data model (nodes + edges) for structural code queries, and token-efficient `--format compact` output for high-volume agent workflows (issue #17). ## Features -- **78 CLI Commands** — From basic scan/query to AST taint analysis, CVE scanning, plugin management, auto-fix, dashboards, CI/CD quality gates, and `graph-schema` for cheap graph-shape introspection -- **MCP Server (76 Tools)** — Native AI agent integration via Model Context Protocol (JSON-RPC over stdio), 56 statically-defined tools + 20 dynamically discovered, every tool accepts a `format` parameter (`json`/`markdown`/`ai`/`sarif`/`compact`) +- **12 CLI Commands** — From basic scan/query to AST taint analysis, CVE scanning, plugin management, auto-fix, dashboards, CI/CD quality gates, and `graph-schema` for cheap graph-shape introspection +- **MCP Server (12 Tools)** — Native AI agent integration via Model Context Protocol (JSON-RPC over stdio), 56 statically-defined tools + -44 dynamically discovered, every tool accepts a `format` parameter (`json`/`markdown`/`ai`/`sarif`/`compact`) - **Token-Efficient Compact Output (v8.2, issue #17)** — `--format compact` produces single-char-key JSON with abbreviated types, omitted null fields, and relative paths — ~50% smaller than `json` on real trace output. Combined with `--limit`/`--offset` pagination, 5 structural queries now cost <5k tokens (down from 30-80k) - **AST Taint Engine** — Tree-sitter based taint analysis with return-value propagation, scope hierarchy, and branch condition refinement - **Live CVE/OSV Scanning** — Real-time vulnerability data from OSV.dev API with SQLite cache, 9 ecosystems (PyPI, npm, crates.io, Go, Maven, NuGet, RubyGems, Pub, Hex) @@ -76,102 +76,36 @@ python3 scripts/codelens.py query "myFunction" --lite ## Command Reference -### Setup & Lifecycle (P0) - -| Command | Description | -|---------|-------------| -| `init [workspace]` | Initialize `.codelens` config with auto-detected frameworks | -| `scan [workspace] [--incremental] [--full] [--max-files N]` | Scan workspace and build registry | -| `registry-validate [workspace]` | Validate registry vs file system | -| `detect [workspace]` | Detect frameworks and show recommended config | -| `watch [workspace] [--git-mode] [--interval SECS]` | Start file watcher (default: watchdog; `--git-mode` polls `git diff --name-only`) | -| `git-status [workspace]` | Show git-aware scan state: HEAD SHA, last-indexed SHA, changed files, re-scan recommendation | -| `migrate [workspace]` | Migrate JSON registry to SQLite persistent database | -| `serve` | Start MCP server for AI agent integration (JSON-RPC over stdio) | -| `lsp-status` | Check which LSP servers are available for `--deep` analysis | - -### Pre-Write Safety (P0 — Always Use) - -| Command | Description | -|---------|-------------| -| `query "name" [workspace] [--domain] [--file] [--fuzzy]` | Pre-write check: does this name already exist? | -| `impact "name" [workspace] [--action modify\|delete]` | Change impact analysis | -| `refactor-safe "name" [workspace] [--action rename\|move]` | Pre-flight rename/move safety check | -| `guard [workspace] (--pre\|--post) --file PATH` | Pre/post-write verification for AI agents | -| `check [workspace] [--severity ...] [--max-findings N]` | CI/CD quality gate — exits non-zero on failure | - -### Search & Understanding (P1) - -| Command | Description | -|---------|-------------| -| `search "pattern" [workspace] [--type] [--context] [--limit N] [--offset N]` | Regex search across workspace (paginated, default limit=20) | -| `symbols "name" [workspace] [--fuzzy] [--limit N] [--offset N]` | Search symbol in registry (paginated) | -| `trace "name" [workspace] [--direction up\|down\|both] [--depth N] [--limit N] [--offset N]` | Deep call chain tracing (paginated) | -| `context "name" [workspace]` | Rich symbol context (definition, callers, callees) | -| `outline [workspace] [--file] [--all] [--limit N] [--offset N]` | File structure outline (paginated) | -| `missing-refs [workspace]` | Detect CSS/HTML mismatches | -| `dependents "file" [workspace]` | Module-level import tracking | -| `list [workspace] [--domain] [--filter] [--limit N] [--offset N]` | List entries with filter (paginated, default limit=20) | -| `graph-schema [workspace]` | Return graph shape: node/edge counts, type distribution, indexes (issue #17) | -| `ask "question"` | Ask a question in natural language (auto-dispatches to relevant commands) | -| `summary [workspace] [--focus ...] [--detail ...]` | Auto-summary with prioritized findings (anti-overload) | - -### Quality & Security (P0-P1) - -| Command | Description | -|---------|-------------| -| `secrets [workspace] [--severity ...]` | Detect hardcoded API keys, passwords, tokens | -| `vuln-scan [workspace] [--severity ...] [--offline] [--osv-ttl N] [--refresh] [--max-age Nh]` | Scan dependencies for known CVEs (OSV.dev + native audit). `--refresh` bypasses the OSV cache and forces fresh API calls; `--max-age Nh` treats cache entries older than N hours as stale for this run only (issue #30). Output includes a `cache_info` block (`last_refresh`, `age_hours`, `ttl_hours`, `is_stale`, `stale_packages`) so agents can decide whether to trust the cached CVE data. | -| `deps-audit [workspace] [--severity ...] [--ecosystem PyPI\|npm\|crates.io] [--offline]` | Pure-Python dependency audit via OSV.dev batch API. Auto-detects `requirements.txt` / `pyproject.toml` / `Pipfile` (PyPI), `package.json` + lock files (npm), `Cargo.toml` + `Cargo.lock` (crates.io). Stores findings as `dependency_vuln` graph nodes linked via `HAS_VULN` edges (issue #158). | -| `taint [workspace]` | Run AST-based taint analysis for vulnerability detection | -| `dataflow [workspace] [--source] [--sink]` | Data flow taint analysis with cross-file call graph | -| `env-check [workspace] [--var NAME]` | Audit environment variables | -| `smell [workspace] [--categories ...] [--severity ...]` | Code smell detection with health score | -| `complexity [workspace] [--name FN] [--threshold N]` | Cyclomatic/cognitive complexity scoring | -| `dead-code [workspace] [--categories ...]` | Enhanced dead code detection | -| `debug-leak [workspace] [--category ...]` | Detect leftover debug code | -| `fix [workspace] [--apply]` | Auto-fix issues with confidence scoring (dry-run by default) | - -### Architecture & Understanding (P1) - -| Command | Description | -|---------|-------------| -| `architecture [workspace] [--lite] [--no-cache]` | Single-call codebase overview for AI agents (languages, frameworks, entry points, packages, routes, hotspots, total symbols). `--lite` omits routes/packages/hotspots for <1k token orientation (issue #19) | -| `entrypoints [workspace]` | Map execution entry points | -| `api-map [workspace]` | Map REST/GraphQL/gRPC routes to handlers | -| `state-map [workspace]` | Track global state management | -| `diff [workspace]` | Compare registry snapshots | -| `circular [workspace]` | Detect circular dependencies | -| `graph-schema [workspace]` | Cheap graph-shape introspection: node/edge counts, type distribution, indexes (issue #17) | -| `resolve-types [workspace]` | Manually trigger hybrid type resolution (import-aware CALLS edge refinement, issue #13) | -| `handbook [workspace]` | Generate project handbook for AI agents | -| `dashboard [workspace]` | Generate HTML visualization dashboard | -| `history [workspace]` | Show historical trend data and charts | - -### Refactoring & Analysis (P2-P3) - -| Command | Description | -|---------|-------------| -| `side-effect [workspace] [--name FN]` | Pure vs impure function analysis | -| `stack-trace "name" [workspace]` | Error propagation simulation | -| `test-map [workspace]` | Test coverage mapping | -| `config-drift [workspace]` | Dependency drift detection | -| `type-infer [workspace]` | Lightweight type inference | -| `ownership [workspace]` | Git blame code ownership | -| `regex-audit [workspace]` | ReDoS-vulnerable regex auditing | -| `a11y [workspace]` | Accessibility auditing (WCAG 2.1) | -| `perf-hint [workspace] [--severity ...] [--category ...]` | Performance anti-pattern detection | -| `css-deep [workspace]` | Deep CSS analysis (vars, keyframes, specificity) | - -### Advanced & Reverse Engineering (P2-P3) - -| Command | Description | -|---------|-------------| -| `analyze [workspace] [--focus ...] [--timeout SECS]` | Full repo analysis: init + scan + all engines in one command | -| `binary-scan [workspace]` | Scan for binary/compiled artifacts with Tauri/Electron RE analysis | -| `artifact-scan [workspace] [--deep]` | Scan for compiled/built artifacts (reverse engineering mode) | -| `benchmark [workspace]` | Run accuracy and performance benchmarks | -| `plugin ` | Manage plugins: `install`, `list`, `search`, `update`, `info`, `validate` | +CodeLens consolidates 78 legacy commands into **12 focused umbrella commands** (issue #195). Each umbrella command accepts a `--check ` flag to select a specific sub-analysis, or runs all sub-analyses by default. Legacy command names still work as deprecated aliases (backward compat for one version) but print a redirect warning to stderr. + +### The 12 Umbrella Commands + +| Command | Absorbs | Description | +|---------|---------|-------------| +| `scan [workspace] [--check scan\|init\|rescan]` | scan, init, rescan | Scan workspace and build registry. `--check init` writes config only; `--check rescan` is incremental. | +| `search [workspace] "query" [--mode semantic\|symbol\|regex\|graph]` | symbols, semantic-query, query-graph, search | Unified search. Default mode is semantic (TF-IDF by meaning). `--mode symbol` for exact name lookup, `--mode regex` for code search, `--mode graph` for Cypher-subset queries. | +| `context [workspace] [--check orient\|outline\|trace\|context]` | context, outline, trace, orient | Codebase & symbol context. Default `--check orient` gives a 10-second orientation brief. `--name ` required for trace/context sub-analyses. | +| `deps [workspace] [--check affected\|dependents\|circular\|import-snapshot]` | affected, dependents, circular, import-snapshot | Dependency-graph intelligence. `--check affected` takes `--files`; `--check import-snapshot` takes `--input path.codelens.gz`. | +| `audit [workspace] [--check dead-code\|complexity\|smell\|staleness\|perf-hint\|side-effect]` | dead-code, complexity, smell, staleness, god-module, perf-hint, side-effect | Code-quality audits. Default runs all checks. | +| `security [workspace] [--check secrets\|vuln-scan\|taint\|binary-scan\|regex-audit]` | secrets, vuln-scan, taint, binary-scan, regex-audit | Security & vulnerability scans. Default runs all checks. | +| `summary [workspace] [--check summary\|dashboard\|arch-metrics\|architecture]` | summary, dashboard, arch-metrics, architecture | Auto-summary, dashboards, architecture metrics. Default `--check summary` runs the legacy prioritized-findings aggregator. | +| `impact [workspace] [--check impact\|diff\|dataflow]` | impact, diff, dataflow | Change-impact & dataflow analysis. Default `--check impact` takes `--name `. | +| `api-map [workspace] [--check api-map\|graph-schema]` | api-map, routes, graph-schema | API surface & graph schema introspection. Default `--check api-map`. | +| `doctor [workspace] [--check doctor\|env-check\|lsp-status]` | doctor, env-check, lsp-status | Environment audit. Default `--check doctor` runs the full dependency audit. | +| `history [workspace] [--check history\|ownership\|git-status]` | history, ownership, git-status | Historical trends, code ownership, git scan state. Default `--check history`. | +| `graph [workspace] "Cypher query"` | query-graph (raw Cypher) | Raw Cypher-subset graph query for power users. Casual callers should prefer `search --mode graph`. | + +### Deprecated Aliases (backward compat, 1 version) + +All 40+ legacy command names (e.g. `codelens dead-code`, `codelens symbols`, `codelens trace`, `codelens secrets`, `codelens diff`, `codelens dashboard`, `codelens ownership`, `codelens git-status`, `codelens env-check`, `codelens lsp-status`, `codelens arch-metrics`, `codelens architecture`, `codelens impact`, `codelens dataflow`, `codelens circular`, `codelens affected`, `codelens dependents`, `codelens import-snapshot`, `codelens staleness`, `codelens perf-hint`, `codelens side-effect`, `codelens vuln-scan`, `codelens taint`, `codelens binary-scan`, `codelens regex-audit`, `codelens outline`, `codelens orient`, `codelens semantic-query`, `codelens query-graph`, `codelens graph-schema`, `codelens init`) are still callable but hidden from `--help`. Invoking any of them prints a deprecation warning to stderr that redirects to the new umbrella command. + +### Dropped Commands (removed in issue #195) + +The following commands were removed entirely — broken, no value, or out of scope: `adr`, `a11y`, `handbook`, `ask`, `serve`, `sessions`, `watch`, `registry-validate`, `rule-test`, `rule-validate`, `artifact-scan`, `css-deep`, `debug-leak`, `detect`, `export-snapshot`, `refactor-safe`, `resolve-types`, `stack-trace`, `migrate` (as a command — utility module kept for tests), `benchmark`, `fix`, `self-analyze`, `guard`, `llm`, `memory`. + +### Hidden Commands (pending BOS decision) + +The following 13 commands are not in any absorb list nor explicitly dropped. They are kept callable but hidden from `--help` pending a BOS decision on where they belong: `analyze`, `check`, `config-drift`, `deps-audit`, `entrypoints`, `lsp`, `list`, `missing-refs`, `plugin`, `query`, `state-map`, `test-map`, `type-infer`. ## Query Decision Rules @@ -233,8 +167,8 @@ codelens/ │ ├── changelog.md # Older changelog (per-version highlights) │ └── agent-integration.md # AI agent integration guide ├── scripts/ -│ ├── codelens.py # CLI entry point (78 commands registered) -│ ├── mcp_server.py # MCP JSON-RPC server (76 tools) +│ ├── codelens.py # CLI entry point (12 commands registered) +│ ├── mcp_server.py # MCP JSON-RPC server (12 tools) │ ├── registry.py # Registry read/write/build │ ├── persistent_registry.py # SQLite persistent storage (WAL mode) │ ├── base_parser.py # Base tree-sitter parser diff --git a/SKILL-QUICK.md b/SKILL-QUICK.md index 072ff9cb..8e9ed09f 100755 --- a/SKILL-QUICK.md +++ b/SKILL-QUICK.md @@ -116,7 +116,7 @@ $CLI list --limit 5 --offset 10 --format compact # → paginated + co | "Cross-file taint" | `dataflow` | `taint` (taint is single-file, AST-deep) | | "Auto-fix issues" | `fix` | `check` (check just gates, doesn't fix) | -## All 78 Commands +## All 12 Commands ### Setup & Lifecycle (8+) `init` · `scan [--incremental] [--max-files N] [--full]` · `registry-validate` · `detect` · `watch [--debounce SECS] [--git-mode] [--interval SECS]` · `git-status` · `migrate` · `serve` · `lsp-status` (issue #33: `codelens --lsp-status` top-level flag is an alias of `codelens lsp-status` — both delegate to `hybrid_engine.get_lsp_status()` and return the identical payload) @@ -148,9 +148,9 @@ $CLI list --limit 5 --offset 10 --format compact # → paginated + co ### Tooling (1) `plugin ` -**Total: 78 commands** (auto-registered via `commands/__init__.py`; rerun `python3 scripts/sync_command_count.py --apply` after adding/removing a command) +**Total: 12 commands** (auto-registered via `commands/__init__.py`; rerun `python3 scripts/sync_command_count.py --apply` after adding/removing a command) -## MCP Server (76 Tools) +## MCP Server (12 Tools) Start the MCP server for AI agent integration: @@ -158,9 +158,9 @@ Start the MCP server for AI agent integration: python3 scripts/codelens.py serve ``` -Exposes 76 tools as `codelens_` (e.g., `codelens_query`, `codelens_taint`, `codelens_graph_schema`, `codelens_architecture`, `codelens_resolve_types`, `codelens_git_status`): +Exposes 12 tools as `codelens_` (e.g., `codelens_query`, `codelens_taint`, `codelens_graph_schema`, `codelens_architecture`, `codelens_resolve_types`, `codelens_git_status`): - 50 statically-defined tools (full JSON schemas in `mcp_server.py`) -- 20 dynamically-discovered tools (auto-discovered from `COMMAND_REGISTRY`; long-running `watch` and `serve` are excluded) +- -44 dynamically-discovered tools (auto-discovered from `COMMAND_REGISTRY`; long-running `watch` and `serve` are excluded) - Every tool accepts a `format` parameter (`json`/`markdown`/`ai`/`sarif`/`compact`). Use `format: "compact"` for token-efficient responses (~50% smaller than `json`). - `watch` and `serve` itself are excluded (long-running) diff --git a/SKILL.md b/SKILL.md index 1a6fb91e..59450cd5 100755 --- a/SKILL.md +++ b/SKILL.md @@ -1,10 +1,10 @@ --- name: codelens description: > - CodeLens — AI-Native Code Intelligence. 78 commands for AI-powered code analysis, + CodeLens — AI-Native Code Intelligence. 12 commands for AI-powered code analysis, security auditing, quality scoring, AST-based taint analysis, live CVE scanning, and pre-write safety checks. Supports 28+ languages with tree-sitter + regex - fallback parsing. MCP server exposes 76 tools for AI agent integration. + fallback parsing. MCP server exposes 12 tools for AI agent integration. For quick command reference with validated output schemas, see SKILL-QUICK.md. For version history, see CHANGELOG.md. --- diff --git a/pyproject.toml b/pyproject.toml index 472fd28e..14a9922c 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -5,7 +5,7 @@ build-backend = "setuptools.build_meta" [project] name = "codelens" version = "8.2.0" -description = "Live Codebase Reference Intelligence — 78 commands for AI-powered code analysis, security auditing, and quality scoring" +description = "Live Codebase Reference Intelligence — 12 commands for AI-powered code analysis, security auditing, and quality scoring" readme = "README.md" license = {text = "MIT"} requires-python = ">=3.8" diff --git a/scripts/codelens.py b/scripts/codelens.py index 3b8256cd..f7b694b4 100755 --- a/scripts/codelens.py +++ b/scripts/codelens.py @@ -900,12 +900,15 @@ def compute_confidence_distribution_flat(result: Dict[str, Any]) -> Dict[str, in # ─── CLI Entry Point ────────────────────────────────────────── def main(): - # Command count is derived from COMMAND_REGISTRY at runtime so it can never - # drift from the actual number of registered commands (issue #38). The - # `--command-count` flag below prints it for scripts / CI; the description - # also includes it so `--help` is self-documenting. - from commands import COMMAND_REGISTRY as _cli_registry_for_count - _command_count = len(_cli_registry_for_count) + # Command count is derived from the visible (non-hidden) command set at + # runtime so it can never drift from the actual number of umbrella + # commands (issue #38). Hidden deprecated aliases are excluded from the + # headline count per issue #195 consolidation (78 → 12 focused commands). + # The `--command-count` flag below prints it for scripts / CI; the + # description also includes it so `--help` is self-documenting. + from commands import get_visible_commands as _get_visible_commands + _visible_registry = _get_visible_commands() + _command_count = len(_visible_registry) parser = argparse.ArgumentParser( description=( @@ -920,18 +923,38 @@ def main(): "--command-count", action="store_true", default=False, - help="Print the runtime command count (len(COMMAND_REGISTRY)) and exit. " - "Single source of truth for issue #38 reconciliation.", + help="Print the runtime command count (len of visible COMMAND_REGISTRY) " + "and exit. Single source of truth for issue #38 reconciliation. " + "Hidden deprecated aliases are excluded (issue #195).", + ) + subparsers = parser.add_subparsers( + dest="command", + help="Available commands", + # Issue #195: the default metavar lists every choice including + # hidden deprecated aliases. Override with only the 12 visible + # umbrella command names so --help is clean. Hidden commands are + # still dispatchable (registered below) but don't clutter the + # usage line. + metavar="{" + ",".join(sorted(_visible_registry.keys())) + "}", ) - subparsers = parser.add_subparsers(dest="command", help="Available commands") # Import and register all command modules registry = get_all_commands() - # Build subparsers from the command registry - # Track which subparsers already have certain args to avoid conflicts + # Build subparsers from the command registry. + # Issue #195: hidden commands (deprecated aliases) are NOT registered as + # subparsers at all — that way they don't appear in --help choices, body, + # or usage line. They are still dispatchable via the manual intercept in + # the dispatch block below (we pre-parse sys.argv[1] and if it matches a + # hidden command, we build a synthetic namespace and execute directly). _existing_subparser_args = {} + _hidden_commands = {} for cmd_name, cmd_info in sorted(registry.items()): + _is_hidden = cmd_info.get("hidden", False) + if _is_hidden: + # Track for manual dispatch, but don't register a subparser. + _hidden_commands[cmd_name] = cmd_info + continue sub = subparsers.add_parser(cmd_name, help=cmd_info["help"]) cmd_info["add_args"](sub) @@ -1096,7 +1119,68 @@ def main(): global_diff_base = arg.split('=', 1)[1] i += 1 - args = parser.parse_args() + # Issue #195: intercept hidden deprecated aliases BEFORE argparse rejects + # them as "invalid choice". Hidden commands are not registered as + # subparsers (so they don't appear in --help), but they remain callable + # for backward compat. We detect them by scanning sys.argv for the first + # non-flag token that matches a hidden command name, then build a + # synthetic namespace and dispatch directly. + _hidden_cmd_name = None + _hidden_cmd_args = None + if _hidden_commands: + # Find the first positional token (skip global flags + their values). + _skip_next_value = False + _global_value_flags = { + "--format", "-f", "--top", "--max-tokens", "--db-path", + "--diff-base", "--codelens-ignore-pattern", + } + for _idx, _tok in enumerate(sys.argv[1:], start=1): + if _skip_next_value: + _skip_next_value = False + continue + if _tok in _global_value_flags: + _skip_next_value = True + continue + if _tok.startswith("-"): + # Could be --flag=value or -fX; skip either way. + continue + # First positional token. + if _tok in _hidden_commands: + _hidden_cmd_name = _tok + # Remaining tokens after the command name are its args. + _hidden_cmd_args = sys.argv[1 + _idx:] + break + + if _hidden_cmd_name: + # Build a synthetic namespace by parsing the hidden command's args + # with a dedicated parser (not the main parser, which doesn't know + # this command). + _h_info = _hidden_commands[_hidden_cmd_name] + _h_parser = argparse.ArgumentParser( + prog=f"codelens {_hidden_cmd_name}", + description=_h_info["help"], + add_help=True, + ) + _h_info["add_args"](_h_parser) + # Add the standard global flags the dispatcher expects. + if not any(a.dest == "format" for a in _h_parser._actions): + _h_parser.add_argument("--format", "-f", default=None) + if not any(a.dest == "top" for a in _h_parser._actions): + _h_parser.add_argument("--top", type=int, default=None) + if not any(a.dest == "max_tokens" for a in _h_parser._actions): + _h_parser.add_argument("--max-tokens", type=int, default=None) + if not any(a.dest == "lite" for a in _h_parser._actions): + _h_parser.add_argument("--lite", action="store_true", default=False) + if not any(a.dest == "deep" for a in _h_parser._actions): + _h_parser.add_argument("--deep", action="store_true", default=False) + if not any(a.dest == "db_path" for a in _h_parser._actions): + _h_parser.add_argument("--db-path", default=None) + if not any(a.dest == "diff_base" for a in _h_parser._actions): + _h_parser.add_argument("--diff-base", default=None) + args = _h_parser.parse_args(_hidden_cmd_args) + args.command = _hidden_cmd_name + else: + args = parser.parse_args() # Apply global flags to args if getattr(args, 'disable_suppression', None) is None: @@ -1224,6 +1308,19 @@ def main(): try: cmd_info = registry[args.command] + + # Issue #195: deprecated alias warning. Old commands still execute + # (backward compat for one version) but print a redirect hint to + # stderr so users migrate to the new umbrella command. + _alias_for = cmd_info.get("deprecated_alias_for") + if _alias_for: + print( + f"[CodeLens] DEPRECATED: '{args.command}' is a deprecated alias. " + f"Use 'codelens {_alias_for}' instead. " + f"This alias will be removed in the next version (issue #195).", + file=sys.stderr, + ) + result = cmd_info["execute"](args, workspace) # ─── Dispatch enrichment (scan-specific) ────── diff --git a/scripts/commands/__init__.py b/scripts/commands/__init__.py index 86817376..f37b041f 100644 --- a/scripts/commands/__init__.py +++ b/scripts/commands/__init__.py @@ -1,14 +1,49 @@ -"""Command registry for CodeLens CLI.""" +"""Command registry for CodeLens CLI. + +Issue #195 consolidation: commands carry two optional metadata fields: + +- ``hidden`` (bool, default False) — hidden commands are still callable but + do not appear in ``--help`` output and are excluded from ``--command-count`` + and the MCP tool count. Used for deprecated aliases that point at the new + umbrella commands. + +- ``deprecated_alias_for`` (str|None, default None) — when set, invoking this + command prints a deprecation warning to stderr that redirects the user to + the named umbrella command. The old command still executes normally + (backward compat for one version per issue #195 DoD point 2). +""" COMMAND_REGISTRY = {} -def register_command(name, help_text, add_args_fn, execute_fn): - """Register a command with the CLI.""" +def register_command(name, help_text, add_args_fn, execute_fn, + hidden=False, deprecated_alias_for=None): + """Register a command with the CLI. + + Parameters + ---------- + name : str + Command name as typed on the CLI (e.g. ``"scan"``). + help_text : str + One-line description shown in ``--help``. Hidden commands should + pass ``argparse.SUPPRESS`` so they don't appear in the choices list. + add_args_fn : callable + Function ``(parser) -> None`` that adds subparser arguments. + execute_fn : callable + Function ``(args, workspace) -> result_dict``. + hidden : bool, optional + If True, the command is callable but hidden from ``--help`` and + excluded from the runtime command count (issue #195). + deprecated_alias_for : str, optional + If set, the command is a deprecated alias for the named umbrella + command. A deprecation warning is printed to stderr before execute. + """ COMMAND_REGISTRY[name] = { "help": help_text, "add_args": add_args_fn, "execute": execute_fn, + "hidden": hidden, + "deprecated_alias_for": deprecated_alias_for, } @@ -18,10 +53,21 @@ def get_command(name): def get_all_commands(): - """Get all registered commands.""" + """Get all registered commands (including hidden).""" return COMMAND_REGISTRY +def get_visible_commands(): + """Get only non-hidden commands (issue #195). + + Used by ``--command-count``, ``--help`` subparser construction, and + ``sync_command_count.py`` so the headline count reflects the 12 + umbrella commands rather than the full deprecated-alias set. + """ + return {name: info for name, info in COMMAND_REGISTRY.items() + if not info.get("hidden", False)} + + # Auto-import all command modules to trigger registration import os import importlib diff --git a/scripts/commands/a11y.py b/scripts/commands/a11y.py deleted file mode 100644 index 8134e7ec..00000000 --- a/scripts/commands/a11y.py +++ /dev/null @@ -1,24 +0,0 @@ -"""A11y command — Detect accessibility issues.""" - -from a11y_engine import audit_accessibility -from commands import register_command - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--category", choices=["missing_alt", "missing_label", "aria_issues", - "keyboard_nav", "semantic_html", "color_contrast", "heading_order", - "link_text", "focus_management"], default=None, - help="Filter by a11y category") - parser.add_argument("--severity", choices=["high", "medium", "low"], default=None, - help="Filter by severity") - parser.add_argument("--max-files", type=int, default=3000, - help="Max files to scan (default: 3000)") - - -def execute(args, workspace): - return audit_accessibility(workspace, category=args.category, severity=args.severity, max_files=args.max_files) - - -register_command("a11y", "Detect accessibility issues", add_args, execute) diff --git a/scripts/commands/adr.py b/scripts/commands/adr.py deleted file mode 100644 index 82131579..00000000 --- a/scripts/commands/adr.py +++ /dev/null @@ -1,194 +0,0 @@ -# @WHO: scripts/commands/adr.py -# @WHAT: Architecture Decision Records CLI command (issue #16) -# @PART: commands -# @ENTRY: execute() -"""ADR command — Architecture Decision Records manager (issue #16). - -Provides persistent memory of *why* the codebase is structured the way it is, -so AI agents don't propose refactors that violate intentional constraints. -Backed by SQLite at ``.codelens/adrs.db``. - -Usage:: - - codelens adr create --title "Use SQLite over PostgreSQL" \\ - --context "Deployment simplicity for single-node setups" \\ - --decision "SQLite with WAL mode" --status accepted - - codelens adr list # list all ADRs - codelens adr list --status accepted # filter by status - codelens adr get --id 3 # fetch a single ADR - codelens adr update --id 3 --status deprecated - codelens adr deprecate --id 3 --superseded-by 7 - codelens adr delete --id 3 - -The storage layer and programmatic API live in :mod:`adr_engine` — this module -is a thin argparse wrapper that calls :func:`adr_engine.manage_adr`. -""" - -from __future__ import annotations - -from typing import Any, Dict - -from commands import register_command - - -def add_args(parser): - """Add ADR subcommand arguments to the parser.""" - sub = parser.add_subparsers(dest="adr_action", help="ADR action") - - # adr create - create = sub.add_parser( - "create", - help="Create a new Architecture Decision Record", - description=( - "Create a new ADR at .codelens/adrs.db. Required: --title. " - "Optional: --context, --decision, --status (default: proposed). " - "Returns the freshly-inserted record with its assigned id." - ), - ) - create.add_argument( - "--title", required=True, - help="Short title for the decision (e.g. 'Use SQLite over PostgreSQL')", - ) - create.add_argument( - "--context", default="", - help="Why is this decision needed? Background and constraints.", - ) - create.add_argument( - "--decision", default="", - help="The decision itself (what was chosen and why).", - ) - create.add_argument( - "--status", default="proposed", - choices=["proposed", "accepted", "deprecated", "rejected"], - help="Initial status (default: proposed)", - ) - - # adr list - list_p = sub.add_parser( - "list", - help="List all ADRs (optionally filtered by status)", - description=( - "List all ADRs in the workspace, sorted by id ascending. " - "Pass --status to filter (proposed/accepted/deprecated/rejected)." - ), - ) - list_p.add_argument( - "--status", default=None, - choices=["proposed", "accepted", "deprecated", "rejected"], - help="Filter by status (default: all statuses)", - ) - - # adr get - get_p = sub.add_parser( - "get", - help="Get a single ADR by id", - description="Fetch a single ADR record by its numeric id.", - ) - get_p.add_argument( - "--id", required=True, type=int, - help="ADR id (positive integer)", - ) - - # adr update - update = sub.add_parser( - "update", - help="Update one or more fields of an ADR", - description=( - "Patch an existing ADR. Only fields explicitly passed are " - "updated; updated_at is always refreshed. At least one of " - "--title, --context, --decision, --status must be provided." - ), - ) - update.add_argument("--id", required=True, type=int, help="ADR id") - update.add_argument("--title", default=None, help="New title") - update.add_argument("--context", default=None, help="New context") - update.add_argument("--decision", default=None, help="New decision") - update.add_argument( - "--status", default=None, - choices=["proposed", "accepted", "deprecated", "rejected"], - help="New status", - ) - - # adr deprecate - deprecate = sub.add_parser( - "deprecate", - help="Mark an ADR as deprecated (optionally link to a replacement)", - description=( - "Set status=deprecated. If --superseded-by is provided, the " - "replacement ADR must exist and must not be the same id. " - "Prefer this over `delete` — it preserves history." - ), - ) - deprecate.add_argument("--id", required=True, type=int, help="ADR id to deprecate") - deprecate.add_argument( - "--superseded-by", default=None, type=int, - help="Id of the ADR that supersedes this one (optional)", - ) - - # adr delete - delete = sub.add_parser( - "delete", - help="Hard-delete an ADR (prefer 'deprecate' to preserve history)", - description=( - "Permanently remove an ADR record. Also clears any " - "superseded_by references pointing at the deleted id. " - "Prefer 'deprecate' for normal workflow — use 'delete' only " - "for records created in error." - ), - ) - delete.add_argument("--id", required=True, type=int, help="ADR id to delete") - - -def execute(args, workspace): - """Dispatch the ADR subcommand.""" - action = getattr(args, "adr_action", None) - if not action: - return { - "status": "error", - "error": "no_action", - "message": "No ADR action specified.", - "usage": "codelens adr [args]", - "examples": [ - "codelens adr create --title 'Use SQLite' --decision 'WAL mode'", - "codelens adr list --status accepted", - "codelens adr get --id 3", - "codelens adr update --id 3 --status deprecated", - "codelens adr deprecate --id 3 --superseded-by 7", - "codelens adr delete --id 3", - ], - } - - # Lazy import so a broken adr_engine never breaks command discovery. - from adr_engine import manage_adr - - # Read arguments defensively — MCP calls pass through _ArgsNamespace which - # may not have every attribute set. - kwargs: Dict[str, Any] = {} - - if action == "create": - kwargs["title"] = getattr(args, "title", None) - kwargs["context"] = getattr(args, "context", None) - kwargs["decision"] = getattr(args, "decision", None) - kwargs["status"] = getattr(args, "status", None) or "proposed" - elif action == "list": - kwargs["status_filter"] = getattr(args, "status", None) - elif action in {"get", "update", "deprecate", "delete"}: - kwargs["id"] = getattr(args, "id", None) - if action == "update": - kwargs["title"] = getattr(args, "title", None) - kwargs["context"] = getattr(args, "context", None) - kwargs["decision"] = getattr(args, "decision", None) - kwargs["status"] = getattr(args, "status", None) - elif action == "deprecate": - kwargs["superseded_by"] = getattr(args, "superseded_by", None) - - return manage_adr(workspace, action, **kwargs) - - -register_command( - "adr", - "Architecture Decision Records manager (create/list/get/update/deprecate/delete)", - add_args, - execute, -) diff --git a/scripts/commands/affected.py b/scripts/commands/affected.py index 8b56cef9..55126637 100644 --- a/scripts/commands/affected.py +++ b/scripts/commands/affected.py @@ -169,4 +169,6 @@ def execute(args, workspace): "Identify test files affected by source changes (issue #62 Phase 1)", add_args, execute, +hidden=True, +deprecated_alias_for='deps', ) diff --git a/scripts/commands/analyze.py b/scripts/commands/analyze.py index bfaf9607..b9a0af11 100644 --- a/scripts/commands/analyze.py +++ b/scripts/commands/analyze.py @@ -143,7 +143,7 @@ def analyze_repository( # ─── Phase 2: Project Identity ─────────────────────────── try: - from commands.handbook import _extract_project_identity + from handbook_helpers import _extract_project_identity identity = _extract_project_identity(workspace) result["identity"] = { "name": identity.get("name", os.path.basename(workspace)), @@ -859,4 +859,5 @@ def _generate_recommendations(findings: List[Dict], result: Dict) -> List[str]: "Full repository analysis: init + scan + all engines in one command (v6.0)", add_args, execute, +hidden=True, ) diff --git a/scripts/commands/api_map.py b/scripts/commands/api_map.py index 0bbee899..d815cad9 100644 --- a/scripts/commands/api_map.py +++ b/scripts/commands/api_map.py @@ -1,23 +1,161 @@ -"""API-map command — Map REST/GraphQL/gRPC routes to handlers.""" +"""api-map command — API surface & graph schema (issue #195 consolidation). + +Umbrella command that absorbs: + - api-map REST/GraphQL/gRPC routes to handlers (default) + - graph-schema Shape of the code graph (node/edge counts, type distribution) + +(routes from the issue mapping does not exist as a registered command — +the route data is part of api-map's output.) + +Usage: + codelens api-map # api-map (default) + codelens api-map --check api-map --method GET + codelens api-map --check graph-schema + codelens api-map --check api-map,graph-schema # both + +Output: ``{"s":"ok", "st":{...}, "r":[...]}``. +""" + +# @WHO: scripts/commands/api_map.py +# @WHAT: Umbrella command for API surface & graph schema. +# @PART: commands +# @ENTRY: execute() + +import argparse +import importlib +import sys +from typing import Any, Dict, List -from apimap_engine import map_api_routes from commands import register_command +_CHECKS = { + "api-map": { + "module": None, # handled inline + "help": "Map REST/GraphQL/gRPC routes to handlers", + }, + "graph-schema": { + "module": "commands.graph_schema", + "help": "Shape of the code graph (node/edge counts, type distribution)", + }, +} + +ALL_CHECKS = list(_CHECKS.keys()) + + def add_args(parser): + """Add api-map-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Sub-analyses (issue #195):\n" + " api-map Map REST/GraphQL/gRPC routes to handlers (default)\n" + " graph-schema Shape of the code graph (node/edge counts, types)\n" + "\n" + "Examples:\n" + " codelens api-map . # api-map (default)\n" + " codelens api-map . --check api-map --method GET\n" + " codelens api-map . --check graph-schema\n" + ) parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help=f"Comma-separated sub-analyses. " + f"Choices: {', '.join(ALL_CHECKS)}. Default: api-map.") parser.add_argument("--method", choices=["GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"], - default=None, help="Filter by HTTP method") + default=None, help="api-map: filter by HTTP method") parser.add_argument("--path", dest="path_filter", default=None, - help="Filter by route path substring") + help="api-map: filter by route path substring") parser.add_argument("--production-only", dest="production_only", action="store_true", default=False, - help="Filter out routes from test files (*.test.*, *.spec.*, __tests__/, test/, tests/)") + help="api-map: filter out routes from test files") + parser.add_argument("--db-path", default=None, + help="graph-schema: custom SQLite database path") -def execute(args, workspace): +def _parse_checks(check_arg: str) -> List[str]: + if not check_arg: + return ["api-map"] + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _CHECKS] + if invalid: + print( + f"[CodeLens] api-map: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(ALL_CHECKS)}", + file=sys.stderr, + ) + sys.exit(1) + return parts or ["api-map"] + + +def _run_legacy_api_map(args, workspace): + from apimap_engine import map_api_routes return map_api_routes(workspace, method=args.method, path_filter=args.path_filter) -register_command("api-map", "Map REST/GraphQL/gRPC routes to handlers", add_args, execute) +def _build_namespace(base_args, check_name: str) -> argparse.Namespace: + ns = argparse.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "graph-schema": + ns.db_path = getattr(base_args, "db_path", None) + return ns + + +def execute(args, workspace): + """Run one or more api-map sub-analyses and merge results. + + @FLOW: API_MAP_DISPATCH + @CALLS: _parse_checks() -> List[str] + _run_legacy_api_map() | commands.graph_schema.execute() -> dict + @MUTATES: nothing (read-only) + """ + checks = _parse_checks(getattr(args, "check", None)) + results: List[Dict[str, Any]] = [] + stats: Dict[str, Any] = {"checks_run": 0, "checks_failed": 0} + + for check_name in checks: + try: + if check_name == "api-map": + sub_result = _run_legacy_api_map(args, workspace) + else: + mod = importlib.import_module(_CHECKS[check_name]["module"]) + sub_args = _build_namespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + stats["checks_run"] += 1 + except Exception as exc: + stats["checks_failed"] += 1 + stats["checks_run"] += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print( + f"[CodeLens] api-map: --check {check_name} failed: {exc}", + file=sys.stderr, + ) + + return { + "s": "ok" if stats["checks_failed"] == 0 else "partial", + "st": { + "checks_requested": len(checks), + **stats, + }, + "r": results, + } + + +register_command( + "api-map", + "API surface & graph schema: api-map (default) / graph-schema (issue #195)", + add_args, + execute, +) diff --git a/scripts/commands/arch_metrics.py b/scripts/commands/arch_metrics.py index 75072c06..1a02ac6a 100644 --- a/scripts/commands/arch_metrics.py +++ b/scripts/commands/arch_metrics.py @@ -233,4 +233,6 @@ def execute(args: argparse.Namespace, workspace: str) -> Dict[str, Any]: "Compute architecture metrics (fan-in/out, instability, god-module detection) from graph", add_args, execute, +hidden=True, +deprecated_alias_for='summary', ) diff --git a/scripts/commands/architecture.py b/scripts/commands/architecture.py index 9f559670..e868ea34 100644 --- a/scripts/commands/architecture.py +++ b/scripts/commands/architecture.py @@ -72,4 +72,6 @@ def execute(args, workspace): "entry points, packages, routes, hotspots, total symbols)", add_args, execute, +hidden=True, +deprecated_alias_for='summary', ) diff --git a/scripts/commands/artifact_scan.py b/scripts/commands/artifact_scan.py deleted file mode 100644 index 4ff9910e..00000000 --- a/scripts/commands/artifact_scan.py +++ /dev/null @@ -1,56 +0,0 @@ -"""Artifact-scan command — DEPRECATED alias for binary-scan (issue #98). - -The ``artifact-scan`` command has been merged into ``binary-scan``. -``binary-scan`` is now a strict superset: it performs everything this -command used to do (minified-file detection, source-map parsing, WASM deep -analysis, built-output-directory detection) plus the additional -capabilities that were always unique to ``binary-scan`` (MIME-signature -detection for extensionless binaries, Tauri/Electron analysis hook). - -This module remains so existing scripts, MCP clients, and muscle memory -keep working. Invoking ``codelens artifact-scan`` prints a deprecation -warning to stderr and then delegates to ``binary-scan``'s handler with the -same arguments, so the output is identical to what ``binary-scan`` now -produces. - -Migration path: - codelens artifact-scan [workspace] [--deep] - → - codelens binary-scan [workspace] [--deep] -""" - -import sys - -from commands import register_command - - -def add_args(parser): - """Add artifact-scan-specific arguments. - - Kept identical to binary-scan's args so delegation is transparent: - ``workspace`` (positional, optional) and ``--deep``. - """ - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--deep", action="store_true", - help="Deep scan: parse source maps and extract WASM exports") - - -def execute(args, workspace): - """Deprecated: print warning, then delegate to binary-scan's handler. - - The delegation calls ``binary_scan.execute`` directly (same args - namespace) so the output is exactly what ``binary-scan`` produces — - no capability is lost. - """ - print("DEPRECATED: Use binary-scan instead", file=sys.stderr) - from commands import binary_scan - return binary_scan.execute(args, workspace) - - -register_command( - "artifact-scan", - "DEPRECATED: use binary-scan instead — Scan for compiled/built artifacts (reverse engineering mode)", - add_args, - execute, -) diff --git a/scripts/commands/ask.py b/scripts/commands/ask.py deleted file mode 100644 index 1b8b20bd..00000000 --- a/scripts/commands/ask.py +++ /dev/null @@ -1,631 +0,0 @@ -"""Ask command — Natural language query router with score-based matching.""" - -import os -import re -from typing import Dict, Any, List, Tuple - -from context_engine import get_symbol_context -from search_engine import search_symbols -from deadcode_engine import detect_dead_code -from secrets_engine import detect_secrets -from circular_engine import detect_circular -from apimap_engine import map_api_routes -from entrypoints_engine import map_entrypoints -from smell_engine import detect_smells -from complexity_engine import compute_complexity -from impact_engine import analyze_impact -from trace_engine import trace_symbol -from testmap_engine import map_test_coverage -from perfhint_engine import detect_perf_hints -from vulnscan_engine import scan_vulnerabilities -from outline_engine import get_workspace_outline -from envcheck_engine import check_env_vars -from debugleak_engine import detect_debug_leaks -from statemap_engine import map_state -from dependents_engine import get_dependency_graph -from commands import register_command -from commands.scan import cmd_scan -from commands.handbook import cmd_handbook - -# ─── Keyword Weight Definitions ────────────────────────────────── -# Technical/specific terms get weight 3, action words get weight 1, -# generic filler words get weight 0. - -_KEYWORD_WEIGHTS: Dict[str, int] = { - # Technical terms (weight 3) — high specificity - "api route": 3, "api routes": 3, "endpoint": 3, "endpoints": 3, - "api map": 3, "rest route": 3, "http route": 3, "graphql": 3, - "circular": 3, "circular dependency": 3, "circular dep": 3, "dependency cycle": 3, - "dead code": 3, "unused code": 3, "unreachable": 3, "zombie": 3, - "orphan": 3, "never called": 3, - "secret": 3, "api key": 3, "password": 3, "token leak": 3, - "cve": 3, "vuln": 3, "vulnerability": 3, "vulnerable": 3, - "security hole": 3, - "code smell": 3, "technical debt": 3, - "complexity": 3, "cyclomatic": 3, "cognitive complexity": 3, - "test coverage": 3, "untested": 3, "missing test": 3, "test map": 3, - "performance": 3, "n+1": 3, "memory leak": 3, "bottleneck": 3, - "entry point": 3, "entrypoint": 3, - "env var": 3, "environment variable": 3, ".env": 3, "missing env": 3, - "console.log": 3, "debugger": 3, "debug code": 3, - "zustand": 3, "redux": 3, "pinia": 3, "global state": 3, - "side effect": 3, "pure function": 3, "impure": 3, "side-effect": 3, - "refactor": 3, "safe to rename": 3, "safe to move": 3, - "dependents": 3, "import graph": 3, "dependency graph": 3, - "css issue": 3, "css problem": 3, "css audit": 3, - "accessibility": 3, "a11y": 3, "aria": 3, - "regex": 3, "redo": 3, "redos": 3, - "what changed": 3, "diff": 3, "changes": 3, - "tech stack": 3, "frameworks": 3, "detect framework": 3, - "how to configure": 3, "configuration": 3, - "not used": 3, - "compiled": 3, "binary": 3, "artifact": 3, "built": 3, "minified": 3, - "wasm": 3, "reverse engineer": 3, "reverse engineering": 3, - # IPC / frontend-backend communication terms (weight 3) - "ipc": 3, "tauri": 3, "invoke": 3, "bridge": 3, - "frontend backend": 3, "frontend-backend": 3, "cross-language": 3, - "ipc bridge": 3, "command handler": 3, "ipc handler": 3, - "frontend backend relationship": 3, "frontend backend communication": 3, - "rust frontend": 3, "native bridge": 3, "ipc channel": 3, - "command binding": 3, "exposed function": 3, "plugin command": 3, - # Individual frontend/backend/rust keywords (weight 2) — helps when words are non-adjacent - "frontend": 2, "backend": 2, "rust": 2, "communicate": 2, - "native layer": 2, "sidecar": 2, "invoke command": 2, - - # Summary/overview terms (v6.2) - "summary": 3, "overview": 3, "quick summary": 3, "brief": 3, - "tldr": 3, "tl;dr": 3, "give me a summary": 3, - "what matters": 3, "what's important": 3, "priority": 3, - "auto summary": 3, "auto detect": 3, - - # Architecture/module terms (v6.4) - "module": 3, "modules": 3, "main modules": 3, "architecture": 3, - "codebase structure": 3, "code organization": 3, - "component structure": 3, "project layout": 3, - "project structure": 3, "what are the main": 3, - - # Action words (weight 1) — lower specificity - "show me": 1, "find": 1, "search for": 1, "look for": 1, - "trace": 1, "scan": 1, "analyze": 1, "index": 1, - "where is": 1, "where's": 1, "where does": 1, - "what is": 1, "what's": 1, "how does": 1, "how is": 1, - "who imports": 1, "who uses": 1, "who depends": 1, - "find definition": 1, "find def": 1, "find all": 1, "find symbol": 1, - - # Generic words (weight 0) — ignored for scoring - "the": 0, "a": 0, "an": 0, "me": 0, "my": 0, - "this": 0, "that": 0, "it": 0, "is": 0, "are": 0, - "of": 0, "for": 0, "in": 0, "on": 0, "to": 0, -} - -# Default weight for keywords not in the table -_DEFAULT_KEYWORD_WEIGHT = 2 - - -def _get_keyword_weight(kw: str) -> int: - """Get the weight for a keyword based on specificity.""" - return _KEYWORD_WEIGHTS.get(kw, _DEFAULT_KEYWORD_WEIGHT) - - -def add_args(parser): - parser.add_argument("question", help="Natural language question about the codebase") - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - - -def execute(args, workspace): - return cmd_ask(args.question, workspace) - - -def cmd_ask(question: str, workspace: str) -> Dict[str, Any]: - """ - Natural language query router. - Maps a question to the appropriate CodeLens command and returns its result. - Uses score-based matching to find the best command. - """ - workspace = os.path.abspath(workspace) - q = question.lower().strip() - - # Determine which command to run based on score-based matching - command, args = _parse_ask_question(q, workspace) - - if command is None: - return { - "status": "unknown_query", - "question": question, - "workspace": workspace, - "suggestion": "Could not determine the appropriate command. Try: scan, context, trace, impact, smell, dead-code, secrets, circular, api-map, entrypoints, outline, query, complexity, test-map, perf-hint, vuln-scan, dependents, refactor-safe, css-deep, a11y, regex-audit, diff, detect, env-check" - } - - # Execute the determined command - try: - result = _execute_ask_command(command, args, workspace) - except Exception as e: - return { - "status": "error", - "question": question, - "interpreted_as": command, - "error": str(e) - } - - # Add interpretation metadata - if isinstance(result, dict): - result["query_interpretation"] = { - "question": question, - "interpreted_as": command, - "confidence": args.pop("_confidence", "medium") - } - - # Fallback: if context returned not_found, try symbol search, then code search - if (command == "context" and isinstance(result, dict) - and result.get("status") == "not_found"): - symbol = args.get("name", "") - if symbol: - # First try: fuzzy symbol search in registry - search_result = search_symbols(workspace, symbol, domain="all", fuzzy=True) - if search_result.get("results"): - search_result["query_interpretation"] = { - "question": question, - "interpreted_as": "symbol search (fallback from context)", - "confidence": "low" - } - return search_result - # Second try: full-text code search (catches conceptual/keyword queries) - try: - from search_engine import search_workspace - code_result = search_workspace(workspace, symbol, max_results=15, case_sensitive=False, whole_word=False) - # search_workspace uses 'matches' key, not 'results' - code_matches = code_result.get("matches", []) - if code_matches: - code_result["query_interpretation"] = { - "question": question, - "interpreted_as": "code search (fallback from context+symbols)", - "confidence": "low" - } - return code_result - except Exception: - pass - - return result - - -def _parse_ask_question(q: str, workspace: str) -> tuple: - """ - Parse a natural language question using score-based matching. - - Instead of first-match, each pattern is scored based on: - 1. Keyword weight (specific technical terms = 3, action words = 1, generic = 0) - 2. Number of keywords matched from the pattern - 3. Coverage bonus for matching multiple keywords from same pattern - - The highest-scoring pattern wins, which correctly routes queries like - "show me the API routes" to api-map (score from "api route" = 3*2*1.5 = 9) - instead of context (score from "show me" = 1*1*1.0 = 1). - """ - - patterns = [ - # ─── Specific topic patterns ─────────── - - # Dead code - (["dead code", "unused code", "unreachable", "zombie", "not used", "never called", "orphan"], - "dead-code", {}, "high"), - - # API routes - (["api route", "api routes", "endpoint", "endpoints", "api map", "rest route", "http route", "graphql"], - "api-map", {}, "high"), - - # Circular dependencies - (["circular", "cycle", "circular dependency", "circular dep", "dependency cycle"], - "circular", {}, "high"), - - # Entrypoints - (["entry point", "entrypoint", "main function", "where does it start", "how does it start", "boot"], - "entrypoints", {}, "high"), - - # Security - (["security", "secret", "api key", "password", "token leak", "cve", "vuln"], - "secrets", {}, "high"), - - # Vulnerabilities - (["vulnerability", "vulnerable", "security hole"], - "vuln-scan", {}, "high"), - - # Smells / health - (["code smell", "smell", "health", "code quality", "code health", "technical debt"], - "smell", {}, "high"), - - # Complexity - (["complexity", "complex", "complicated", "cyclomatic", "cognitive complexity"], - "complexity", {}, "high"), - - # Test coverage - (["test coverage", "tested", "untested", "missing test", "test map"], - "test-map", {}, "high"), - - # Performance - (["performance", "slow", "perf", "n+1", "memory leak", "bottleneck"], - "perf-hint", {}, "high"), - - # Impact analysis - (["what happens if", "impact of", "what if i change", "what if i delete", "can i change", "can i delete", "safe to"], - "impact", {"name": _extract_symbol_name, "action": "modify"}, "medium"), - - # Outline - (["outline", "structure", "file structure", "what's in", "contents of"], - "outline", {}, "medium"), - - # Environment check - (["env var", "environment variable", ".env", "missing env", "env check"], - "env-check", {}, "high"), - - # Debug leak - (["debug code", "console.log", "debugger", "todo", "fixme", "leftover"], - "debug-leak", {}, "high"), - - # State management - (["state", "store", "zustand", "redux", "pinia", "global state"], - "state-map", {}, "high"), - - # Side effects - (["side effect", "pure function", "impure", "mutation", "side-effect"], - "side-effect", {"name": _extract_symbol_name}, "high"), - - # Refactor safety - (["refactor", "rename", "move", "safe to rename", "safe to move"], - "refactor-safe", {"name": _extract_symbol_name}, "medium"), - - # ─── Newly added patterns ─────────── - - # Dependents / import tracking - (["dependents", "who imports", "who uses", "who depends", "import graph", "dependency graph", - "which files import"], - "dependents", {}, "medium"), - - # Refactor safety (broader) - (["is this code safe", "safe to change", "safe to remove", "is it safe"], - "refactor-safe", {"name": _extract_symbol_name}, "medium"), - - # CSS deep analysis - (["css issue", "css problem", "css audit", "css analysis", "css deep", - "css variable", "keyframe", "specificity", "z-index"], - "css-deep", {}, "high"), - - # Accessibility - (["accessibility", "a11y", "aria", "wcag", "screen reader", "alt text", - "keyboard nav", "focus", "accessible"], - "a11y", {}, "high"), - - # Regex audit - (["regex", "regexp", "regular expression", "redo", "redos", "regex audit", - "catastrophic backtracking", "regex vulnerabilities", "regex vulnerability", - "regex issue", "regex problem"], - "regex-audit", {}, "high"), - - # Reverse engineering / artifact detection - (["compiled", "binary", "artifact", "built", "minified", "wasm", - "reverse engineer", "reverse engineering", "dist folder", "build output", - "compiled artifacts", "built files", "minified files"], - "artifact-scan", {}, "high"), - - # Diff / changes - (["what changed", "diff", "changes since", "what's different", "compare"], - "diff", {}, "high"), - - # Detect / tech stack - (["tech stack", "frameworks", "detect framework", "what framework", "what libraries", - "what technologies", "stack"], - "detect", {}, "high"), - - # Env configuration - (["how to configure", "configuration", "config check", "env setup"], - "env-check", {}, "high"), - - # ─── IPC / Frontend-Backend / Tauri patterns ─────────── - - # IPC communication / frontend-backend bridge - (["ipc", "invoke", "bridge", "cross-language", "ipc bridge", "ipc handler", - "command handler", "ipc channel", "command binding", "exposed function", - "plugin command", "native bridge", - "frontend backend", "frontend-backend", "frontend backend relationship", - "frontend backend communication", "rust frontend", - "frontend", "backend", "communicate", "sidecar", "native layer", - "invoke command"], - "api-map", {}, "high"), - - # How does the frontend talk to the backend / Tauri invoke patterns - (["tauri", "tauri ipc", "tauri command", "tauri invoke", "tauri bridge"], - "api-map", {}, "high"), - - # ─── Generic patterns (scored lower by keyword weight) ──── - - # Architecture / module structure (v6.4 — was misrouted to code search) - (["module", "modules", "main modules", "architecture", "codebase structure", - "how is the code organized", "code organization", "component structure", - "project layout", "project structure", "what are the main"], - "handbook", {}, "high"), - - # Context / definition queries - (["where is", "where's", "where does", "find definition", "find def", "what is", "what's"], - "context", {"name": _extract_symbol_name}, "high"), - - # Symbol search - (["search for", "find symbol", "find all", "look for"], - "symbols", {"name": _extract_symbol_name}, "high"), - - # Trace - (["how does", "trace", "call chain", "call path", "how is", "connected to", "flows to", "flow from"], - "trace", {"name": _extract_symbol_name, "direction": "both"}, "medium"), - - # Show me (generic — low weight keywords) - (["show me"], - "context", {"name": _extract_symbol_name}, "low"), - - # Scan - (["scan", "analyze", "index", "build registry", "full analysis"], - "scan", {}, "high"), - - # Handbook - (["overview", "handbook", "project brief", "tell me about", "summarize", "summary of"], - "handbook", {}, "high"), - - # Summary (v6.2 — anti-overload condensed view) - (["quick summary", "tldr", "tl;dr", "give me a summary", "what matters", - "what's important", "auto summary", "auto detect", "priority issues"], - "summary", {}, "high"), - ] - - # ─── Score each pattern ──────────────────────────────── - candidates: List[Tuple[float, str, dict, str]] = [] - - for keywords, command, extra_args, confidence in patterns: - score = 0.0 - matched_keywords = 0 - max_weight_in_pattern = 0.0 - - for kw in keywords: - if kw in q: - weight = _get_keyword_weight(kw) - # Multi-word keywords are more specific: "api route" > "api" - word_bonus = len(kw.split()) - score += weight * word_bonus - matched_keywords += 1 - max_weight_in_pattern = max(max_weight_in_pattern, weight) - - if matched_keywords > 0: - # Coverage bonus: matching more keywords from same pattern is better - coverage = matched_keywords / len(keywords) - score *= (1 + coverage) - - # Specificity bonus: patterns with high-weight keyword matches - # (weight 3 = technical terms like "architecture", "dead code", "api route") - # should beat generic patterns (weight 1 = "show me") even when coverage is low. - # Without this, "show me the architecture" → context (4.0) > handbook (3.27) - if max_weight_in_pattern >= 3: - score *= 1.5 - - candidates.append((score, command, extra_args, confidence)) - - if not candidates: - # Fallback: try to find a symbol name and use context - symbol = _extract_symbol_name(q, "") - if symbol: - return "context", {"name": symbol, "_confidence": "low"} - return None, {} - - # Sort by score descending, return best match - candidates.sort(key=lambda x: x[0], reverse=True) - best_score, command, extra_args, confidence = candidates[0] - - # Adjust confidence based on score margin - if len(candidates) > 1 and candidates[0][0] <= candidates[1][0] * 1.2: - # Close match — lower confidence - if confidence == "high": - confidence = "medium" - - # Build args dict - resolved_args = {"_confidence": confidence, "_score": best_score} - for key, val in extra_args.items(): - if callable(val): - resolved_args[key] = val(q, "") - else: - resolved_args[key] = val - - return command, resolved_args - - -def _extract_symbol_name(q: str, keyword: str) -> str: - """Try to extract a symbol name from the question.""" - # Remove common question words - cleaned = q - for prefix in ["where is ", "where's ", "where does ", "what is ", "what's ", - "what does ", "what do ", "why does ", "why do ", - "when does ", "when do ", "how does ", "how is ", - "how can ", "how should ", - "show me ", "find definition of ", "find def ", "find ", - "search for ", "trace ", "impact of ", - "what happens if i change ", "what happens if i delete ", - "can i change ", "can i delete ", "is this code safe ", - "is it safe ", "safe to change ", "safe to remove ", - "which files import "]: - if cleaned.startswith(prefix): - cleaned = cleaned[len(prefix):] - break - - # Remove trailing question marks and whitespace - cleaned = cleaned.rstrip("?!. ").strip() - - # Remove common English filler words and type keywords - # Include pronouns like "i ", "we " that appear after question prefixes - # e.g. "how can i find..." → strip "how can " → "i find..." → strip "i " → "find..." - for filler in ["the ", "a ", "an ", "i ", "we ", "you ", "they ", "my ", "our ", - "this ", "that ", "these ", "those ", - "function ", "class ", "method ", "variable ", "const ", - "module ", "file ", "component ", "hook ", "type ", - "interface ", "enum "]: - cleaned = re.sub(r'^' + re.escape(filler), '', cleaned, flags=re.IGNORECASE) - cleaned = re.sub(re.escape(filler.rstrip()) + r'$', '', cleaned, flags=re.IGNORECASE) - cleaned = cleaned.strip() - - # Try to extract code-like identifiers - match = re.search(r'`([^`]+)`', q) - if match: - return match.group(1).strip() - - # Look for quoted names - match = re.search(r'["\']([^"\']+)["\']', q) - if match: - return match.group(1).strip() - - # Look for identifier-like patterns - match = re.search(r'[a-z][a-zA-Z0-9]*_[a-zA-Z0-9_]+', cleaned) - if match: - return match.group(0) - match = re.search(r'[a-z][a-zA-Z0-9]*[A-Z][a-zA-Z0-9]*', cleaned) - if match: - return match.group(0) - match = re.search(r'[A-Z][a-zA-Z0-9]*(?:[A-Z][a-zA-Z0-9]*)+', cleaned) - if match: - return match.group(0) - - # Fallback: any identifier - match = re.search(r'[a-zA-Z_][a-zA-Z0-9_.]*', cleaned) - if match: - return match.group(0) - - return cleaned if cleaned else "" - - -def _execute_ask_command(command: str, args: dict, workspace: str) -> Dict[str, Any]: - """Execute the determined command with the given args. - - Includes a timeout mechanism for expensive commands to prevent - the ask router from hanging on large codebases. - """ - import signal - import time - - # Commands that can be slow on large repos — apply timeout - _SLOW_COMMANDS = {"dead-code", "smell", "complexity", "scan", "handbook", - "outline", "test-map", "perf-hint", "css-deep", "a11y", - "summary"} - _ASK_TIMEOUT = 45 # seconds — increased from 30 for large repos - - def _timeout_handler(signum, frame): - raise TimeoutError(f"ask command '{command}' timed out after {_ASK_TIMEOUT}s") - - start_time = time.time() - timed_out = False - - if command in _SLOW_COMMANDS: - # Try with timeout on Unix systems - try: - old_handler = signal.signal(signal.SIGALRM, _timeout_handler) - signal.alarm(_ASK_TIMEOUT) - except (AttributeError, ValueError): - # Windows or non-main thread — no SIGALRM, run without timeout - old_handler = None - - try: - result = _dispatch_command(command, args, workspace) - except TimeoutError: - timed_out = True - result = { - "status": "timeout", - "message": f"Command '{command}' timed out after {_ASK_TIMEOUT}s. " - f"Run it directly for full results: codelens {command}", - "command": command, - } - except Exception as e: - result = {"status": "error", "message": str(e), "command": command} - finally: - if command in _SLOW_COMMANDS: - try: - signal.alarm(0) # Cancel alarm - if old_handler is not None: - signal.signal(signal.SIGALRM, old_handler) - except (AttributeError, ValueError): - pass - - # Add timing info - if isinstance(result, dict): - result["ask_duration_ms"] = int((time.time() - start_time) * 1000) - - return result - - -def _dispatch_command(command: str, args: dict, workspace: str) -> Dict[str, Any]: - """Dispatch to the actual command implementation.""" - if command == "context": - return get_symbol_context(args.get("name", ""), workspace) - elif command == "symbols": - return search_symbols(workspace, args.get("name", ""), domain="all", fuzzy=True) - elif command == "dead-code": - return detect_dead_code(workspace) - elif command == "secrets": - return detect_secrets(workspace) - elif command == "circular": - return detect_circular(workspace) - elif command == "api-map": - return map_api_routes(workspace) - elif command == "entrypoints": - return map_entrypoints(workspace) - elif command == "smell": - return detect_smells(workspace) - elif command == "complexity": - return compute_complexity(workspace, sort_by="complexity", limit=30) - elif command == "impact": - return analyze_impact(args.get("name", ""), workspace, action=args.get("action", "modify")) - elif command == "trace": - return trace_symbol(args.get("name", ""), workspace, direction=args.get("direction", "both")) - elif command == "test-map": - return map_test_coverage(workspace) - elif command == "perf-hint": - return detect_perf_hints(workspace) - elif command == "vuln-scan": - return scan_vulnerabilities(workspace) - elif command == "outline": - return get_workspace_outline(workspace) - elif command == "env-check": - return check_env_vars(workspace) - elif command == "debug-leak": - return detect_debug_leaks(workspace) - elif command == "state-map": - return map_state(workspace) - elif command == "scan": - return cmd_scan(workspace) - elif command == "handbook": - return cmd_handbook(workspace) - elif command == "summary": - from commands.summary import generate_summary - return generate_summary(workspace) - elif command == "dependents": - return get_dependency_graph(workspace) - elif command == "css-deep": - from cssdeep_engine import analyze_css_deep - return analyze_css_deep(workspace) - elif command == "a11y": - from a11y_engine import audit_accessibility - return audit_accessibility(workspace) - elif command == "regex-audit": - from regexaudit_engine import audit_regex_patterns - return audit_regex_patterns(workspace) - elif command == "diff": - from diff_engine import diff_current_vs_last - return diff_current_vs_last(workspace) - elif command == "detect": - from framework_detect import detect_frameworks - return detect_frameworks(workspace) - elif command == "refactor-safe": - from refactor_safe_engine import check_refactor_safety - return check_refactor_safety(args.get("name", ""), workspace) - elif command == "artifact-scan": - # artifact-scan is now a deprecated alias for binary-scan (issue #98). - # Delegate to the merged handler so the ask command stays consistent - # with the CLI surface. - from commands.binary_scan import cmd_binary_scan - return cmd_binary_scan(workspace, deep=False) - else: - return {"status": "error", "message": f"Unknown command: {command}"} - - -register_command("ask", "Ask a question in natural language", add_args, execute) diff --git a/scripts/commands/audit.py b/scripts/commands/audit.py new file mode 100644 index 00000000..6f498d7d --- /dev/null +++ b/scripts/commands/audit.py @@ -0,0 +1,223 @@ +"""audit command — code-quality checks (issue #195 consolidation). + +Umbrella command that absorbs: + - dead-code Enhanced dead code detection + - complexity Cyclomatic/cognitive complexity + - smell Code smells across workspace + - staleness Per-file staleness detection + - perf-hint Performance anti-patterns + - side-effect Pure vs impure function analysis + +(god-module from the issue mapping is part of arch-metrics, exposed via +``summary --check arch-metrics`` — there is no standalone god-module command +to absorb here.) + +Usage: + codelens audit # all checks + codelens audit --check dead-code # only dead-code + codelens audit --check complexity,smell # pick subset + +Output: ``{"s":"ok", "st":{...}, "r":[...]}`` — one entry per check under +``r`` and aggregate counts under ``st``. +""" + +# @WHO: scripts/commands/audit.py +# @WHAT: Umbrella command for code-quality audits. +# @PART: commands +# @ENTRY: execute() + +import argparse +import importlib +import sys +from typing import Any, Dict, List + +from commands import register_command + + +_CHECKS = { + "dead-code": { + "module": "commands.dead_code", + "help": "Enhanced dead code detection", + }, + "complexity": { + "module": "commands.complexity", + "help": "Cyclomatic/cognitive complexity", + }, + "smell": { + "module": "commands.smell", + "help": "Code smells across workspace", + }, + "staleness": { + "module": "commands.staleness", + "help": "Per-file staleness detection", + }, + "perf-hint": { + "module": "commands.perf_hint", + "help": "Performance anti-patterns", + }, + "side-effect": { + "module": "commands.side_effect", + "help": "Pure vs impure function analysis", + }, +} + +ALL_CHECKS = list(_CHECKS.keys()) + + +def add_args(parser): + """Add audit-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Sub-analyses (issue #195):\n" + " dead-code Enhanced dead code detection\n" + " complexity Cyclomatic/cognitive complexity\n" + " smell Code smells across workspace\n" + " staleness Per-file staleness detection\n" + " perf-hint Performance anti-patterns\n" + " side-effect Pure vs impure function analysis\n" + "\n" + "Examples:\n" + " codelens audit . # all checks\n" + " codelens audit . --check dead-code # only dead-code\n" + " codelens audit . --check complexity,smell # pick subset\n" + ) + parser.add_argument("workspace", nargs="?", default=None, + help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help=f"Comma-separated sub-analyses. " + f"Choices: {', '.join(ALL_CHECKS)}. Default: all.") + # Common passthroughs. + parser.add_argument("--max-files", type=int, default=None, + help="dead-code/smell/perf-hint/side-effect/complexity: file cap") + parser.add_argument("--max-results", type=int, default=None, + help="dead-code: max results per category") + parser.add_argument("--categories", nargs="+", default=None, + help="dead-code/smell: sub-category filter") + parser.add_argument("--severity", default=None, + help="smell/perf-hint: info|warning|critical (smell) or " + "critical|high|medium|low (perf-hint)") + parser.add_argument("--threshold", type=int, default=None, + help="complexity: minimum complexity threshold") + parser.add_argument("--sort", dest="sort_by", default=None, + help="complexity: complexity|cognitive|loc") + parser.add_argument("--name", default=None, + help="complexity/side-effect: function name filter") + parser.add_argument("--file", default=None, + help="complexity/side-effect: file path filter") + parser.add_argument("--limit", type=int, default=None, + help="staleness/complexity: result limit") + parser.add_argument("--category", default=None, + help="perf-hint: single category filter") + parser.add_argument("--no-confirm-hash", action="store_true", default=False, + help="staleness: skip content-hash confirmation") + + +def _parse_checks(check_arg: str) -> List[str]: + if not check_arg: + return list(ALL_CHECKS) + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _CHECKS] + if invalid: + print( + f"[CodeLens] audit: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(ALL_CHECKS)}", + file=sys.stderr, + ) + sys.exit(1) + return parts or list(ALL_CHECKS) + + +def _build_namespace(base_args, check_name: str) -> argparse.Namespace: + ns = argparse.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "dead-code": + ns.categories = getattr(base_args, "categories", None) + ns.max_files = getattr(base_args, "max_files", None) or 3000 + ns.max_results = getattr(base_args, "max_results", None) or 100 + elif check_name == "complexity": + ns.name = getattr(base_args, "name", None) + ns.file = getattr(base_args, "file", None) + ns.threshold = getattr(base_args, "threshold", None) + ns.sort_by = getattr(base_args, "sort_by", None) + ns.limit = getattr(base_args, "limit", None) + ns.max_files = getattr(base_args, "max_files", None) or 5000 + elif check_name == "smell": + ns.categories = getattr(base_args, "categories", None) + ns.severity = getattr(base_args, "severity", None) + ns.max_files = getattr(base_args, "max_files", None) or 5000 + elif check_name == "staleness": + # staleness has its own --format text/json; force json here so the + # umbrella can merge it into the unified result shape. + ns.format = "json" + ns.no_confirm_hash = getattr(base_args, "no_confirm_hash", False) + ns.max_files = getattr(base_args, "max_files", None) or 10000 + ns.limit = getattr(base_args, "limit", None) or 10 + elif check_name == "perf-hint": + ns.severity = getattr(base_args, "severity", None) + ns.category = getattr(base_args, "category", None) + ns.max_files = getattr(base_args, "max_files", None) or 5000 + elif check_name == "side-effect": + ns.name = getattr(base_args, "name", None) + ns.file = getattr(base_args, "file", None) + ns.max_files = getattr(base_args, "max_files", None) or 3000 + return ns + + +def execute(args, workspace): + """Run one or more audit checks and merge results. + + @FLOW: AUDIT_DISPATCH + @CALLS: _parse_checks() -> List[str] + _build_namespace() -> argparse.Namespace + commands..execute() -> dict per sub + @MUTATES: nothing (read-only analyses) + """ + checks = _parse_checks(getattr(args, "check", None)) + results: List[Dict[str, Any]] = [] + stats: Dict[str, Any] = {"checks_run": 0, "checks_failed": 0} + + for check_name in checks: + spec = _CHECKS[check_name] + try: + mod = importlib.import_module(spec["module"]) + sub_args = _build_namespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + stats["checks_run"] += 1 + except Exception as exc: + stats["checks_failed"] += 1 + stats["checks_run"] += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print( + f"[CodeLens] audit: --check {check_name} failed: {exc}", + file=sys.stderr, + ) + + return { + "s": "ok" if stats["checks_failed"] == 0 else "partial", + "st": { + "checks_requested": len(checks), + **stats, + }, + "r": results, + } + + +register_command( + "audit", + "Code-quality audits: dead-code / complexity / smell / staleness / perf-hint / side-effect", + add_args, + execute, +) diff --git a/scripts/commands/benchmark.py b/scripts/commands/benchmark.py deleted file mode 100644 index 431f7335..00000000 --- a/scripts/commands/benchmark.py +++ /dev/null @@ -1,98 +0,0 @@ -"""Benchmark command — Run accuracy and performance benchmarks against fixtures.""" - -import os -import sys -import json -from typing import Dict, Any - -from commands import register_command - -BENCHMARKS_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "benchmarks") - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--quick", action="store_true", - help="Run quick subset (4 commands only)") - parser.add_argument("--fixture", type=str, default=None, - help="Run benchmarks for a specific fixture only") - parser.add_argument("--compare", type=str, default=None, - help="Compare results against a baseline JSON file") - parser.add_argument("--output", "-o", type=str, default=None, - help="Save results to a specific JSON file") - parser.add_argument("--update-snapshot", action="store_true", - help="Save results as new regression baseline") - - -def execute(args, workspace): - """Execute the benchmark suite and return AI-friendly results.""" - if BENCHMARKS_DIR not in sys.path: - sys.path.insert(0, BENCHMARKS_DIR) - - try: - from run_benchmarks import run_benchmark_suite - except ImportError: - import importlib.util - spec = importlib.util.spec_from_file_location( - "run_benchmarks", os.path.join(BENCHMARKS_DIR, "run_benchmarks.py")) - if spec and spec.loader: - mod = importlib.util.module_from_spec(spec) - spec.loader.exec_module(mod) - run_benchmark_suite = mod.run_benchmark_suite - else: - return {"status": "error", "error": "Could not import benchmark runner", - "error_type": "import_error"} - - results = run_benchmark_suite( - fixture_name=args.fixture, quick=args.quick, - output_file=args.output, compare_file=args.compare, - ) - - summary = results.get("summary", {}) - items = [] - for fn, fd in results.get("fixtures", {}).items(): - for cn, cd in fd.get("commands", {}).items(): - items.append({ - "fixture": fn, "command": cn, - "description": cd.get("description", ""), - "f1": cd.get("metrics", {}).get("f1", 0), - "precision": cd.get("metrics", {}).get("precision", 0), - "recall": cd.get("metrics", {}).get("recall", 0), - "fpr": cd.get("metrics", {}).get("fpr", 0), - "expected": cd.get("expected_count", 0), - "found": cd.get("found_count", 0), - "meets_target": cd.get("meets_target", False), - "beats_competitor": cd.get("beats_competitor", False), - "elapsed_seconds": cd.get("elapsed_seconds", 0), - }) - - result = { - "status": "ok", - "command": "benchmark", - "stats": { - "avg_f1": summary.get("avg_f1", 0), - "avg_precision": summary.get("avg_precision", 0), - "avg_recall": summary.get("avg_recall", 0), - "avg_fpr_clean": summary.get("avg_fpr_clean", 0), - "meets_target_pct": summary.get("meets_target_pct", 0), - "beats_competitor_pct": summary.get("beats_competitor_pct", 0), - "total_commands": summary.get("total_commands_run", 0), - }, - "items": items, - "token_efficiency": results.get("token_efficiency", {}), - } - - if getattr(args, 'update_snapshot', False): - try: - if BENCHMARKS_DIR not in sys.path: - sys.path.insert(0, BENCHMARKS_DIR) - from check_regression import save_snapshot - save_snapshot(results) - except Exception: - pass - - return result - - -register_command("benchmark", "Run accuracy and performance benchmarks", add_args, execute) diff --git a/scripts/commands/binary_scan.py b/scripts/commands/binary_scan.py index c9d90d08..cbf92d7a 100644 --- a/scripts/commands/binary_scan.py +++ b/scripts/commands/binary_scan.py @@ -853,5 +853,7 @@ def _generate_recommendations( "binary-scan", "Scan for binary/compiled artifacts with reverse-engineering analysis (superset of artifact-scan)", add_args, - execute + execute, + hidden=True, + deprecated_alias_for='security', ) diff --git a/scripts/commands/check.py b/scripts/commands/check.py index 59aba83e..6b2448e7 100755 --- a/scripts/commands/check.py +++ b/scripts/commands/check.py @@ -486,4 +486,5 @@ def execute(args, workspace): '--strict/--error, --baseline-commit, --diff-scan)', add_args, execute, + hidden=True, ) diff --git a/scripts/commands/circular.py b/scripts/commands/circular.py index be788408..46d50dbc 100644 --- a/scripts/commands/circular.py +++ b/scripts/commands/circular.py @@ -17,4 +17,10 @@ def execute(args, workspace): return detect_circular(workspace, domain=args.domain, max_cycles=args.max_cycles) -register_command("circular", "Detect circular dependencies", add_args, execute) +register_command("circular", "Detect circular dependencies", add_args, execute, + +hidden=True, + +deprecated_alias_for='deps', + +) diff --git a/scripts/commands/complexity.py b/scripts/commands/complexity.py index c364aea3..5660a551 100644 --- a/scripts/commands/complexity.py +++ b/scripts/commands/complexity.py @@ -27,4 +27,10 @@ def execute(args, workspace): max_files=args.max_files) -register_command("complexity", "Compute cyclomatic/cognitive complexity", add_args, execute) +register_command("complexity", "Compute cyclomatic/cognitive complexity", add_args, execute, + +hidden=True, + +deprecated_alias_for='audit', + +) diff --git a/scripts/commands/config_drift.py b/scripts/commands/config_drift.py index 009cf67d..4e4772dd 100644 --- a/scripts/commands/config_drift.py +++ b/scripts/commands/config_drift.py @@ -13,4 +13,8 @@ def execute(args, workspace): return detect_config_drift(workspace) -register_command("config-drift", "Detect dependency drift (package.json vs code)", add_args, execute) +register_command("config-drift", "Detect dependency drift (package.json vs code)", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/context.py b/scripts/commands/context.py index a81ed958..94170cb4 100644 --- a/scripts/commands/context.py +++ b/scripts/commands/context.py @@ -1,84 +1,208 @@ -"""context command — DEPRECATED alias for ``query``. - -The ``context `` command returned a subset of what -``query `` already provides (code + callers + callees + quality -metrics). It added no new information, so it was redundant (issue #99). - -``context`` still works for backward compatibility but prints a -deprecation warning to stderr and delegates to ``query``'s handler with -the same args. It will be removed in a future release — switch to -``codelens query``. +"""context command — symbol & codebase context (issue #195 consolidation). + +Umbrella command that absorbs: + - context (rich symbol context — legacy, was a query subset) + - outline (file structure outline) + - trace (deep call chain from a symbol) + - orient (10-second codebase orientation brief) + +Usage: + codelens context # orient (default) + codelens context --check orient # explicit orient + codelens context --check outline --file src/app.ts + codelens context --check trace --name handleAuth + codelens context --check context --name handleAuth + +When --check is omitted, defaults to ``orient`` (the broadest useful default +for "give me context on this codebase"). Pass ``--name`` for symbol-specific +checks (trace, context). + +Output: ``{"s":"ok", "st":{...}, "r":[...]}``. """ +# @WHO: scripts/commands/context.py +# @WHAT: Umbrella command for codebase/symbol context. +# @PART: commands +# @ENTRY: execute() + +import argparse +import importlib import sys +from typing import Any, Dict, List from commands import register_command -# Deprecation notice — printed once per invocation to stderr (NOT stdout, -# which is reserved for JSON/machine-readable output). Surfaced in both -# interactive and CI usage so users notice and migrate before the alias -# is removed. -_DEPRECATION_WARNING = ( - "DEPRECATED: codelens context is renamed to codelens query. " - "Use query instead.\n" -) +_CHECKS = { + "context": { + "module": "commands.query", # legacy context delegated to query + "help": "Rich symbol context (callers, callees, metrics)", + }, + "outline": { + "module": "commands.outline", + "help": "File structure outline", + }, + "trace": { + "module": "commands.trace", + "help": "Deep call chain from a symbol", + }, + "orient": { + "module": "commands.orient", + "help": "10-second codebase orientation brief", + }, +} + +ALL_CHECKS = list(_CHECKS.keys()) -def add_args(parser): - """Register context (deprecated alias) arguments. - Kept compatible with the legacy ``context`` interface so existing - scripts keep parsing. Flags that ``query`` does not understand - (``--context-lines``, ``--no-code``) are accepted but ignored — - output now comes from ``query``. - """ - parser.add_argument("name", help="Symbol name") +def add_args(parser): + """Add context-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Sub-analyses (issue #195):\n" + " context Rich symbol context (callers, callees, metrics)\n" + " outline File structure outline\n" + " trace Deep call chain from a symbol\n" + " orient 10-second codebase orientation brief\n" + "\n" + "Examples:\n" + " codelens context . # orient (default)\n" + " codelens context . --check outline --file src/app.ts\n" + " codelens context . --check trace --name handleAuth\n" + " codelens context . --check context --name handleAuth\n" + ) parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--domain", choices=["frontend", "backend", "auto"], default="auto", - help="Domain") - parser.add_argument("--context-lines", type=int, default=5, - help="Lines of code context around symbol (default 5)") - parser.add_argument("--no-code", action="store_true", help="Skip source code in output") + parser.add_argument("--check", default=None, + help=f"Comma-separated sub-analyses. " + f"Choices: {', '.join(ALL_CHECKS)}. Default: orient.") + parser.add_argument("--name", default=None, + help="context/trace: symbol name to analyze") + parser.add_argument("--file", default=None, + help="outline: specific file path") + parser.add_argument("--all", dest="all_files", action="store_true", default=False, + help="outline: outline all files") + parser.add_argument("--detail", default=None, + help="outline: minimal|normal|full") + parser.add_argument("--direction", default=None, + help="trace: up|down|both (default up)") + parser.add_argument("--depth", type=int, default=None, + help="trace: max call depth (default 10)") + parser.add_argument("--domain", default=None, + help="context/trace: frontend|backend|auto") + parser.add_argument("--top", type=int, default=None, metavar="N", + help="orient: top-N start-here files (default 8)") + parser.add_argument("--limit", type=int, default=None, + help="trace/outline: result limit") + parser.add_argument("--offset", type=int, default=0, + help="trace/outline: pagination offset") + + +def _parse_checks(check_arg: str) -> List[str]: + if not check_arg: + return ["orient"] # sensible default for "give me context on this codebase" + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _CHECKS] + if invalid: + print( + f"[CodeLens] context: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(ALL_CHECKS)}", + file=sys.stderr, + ) + sys.exit(1) + return parts or ["orient"] + + +def _build_namespace(base_args, check_name: str) -> argparse.Namespace: + ns = argparse.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "context": + # context delegates to query.execute — needs name + file + domain. + ns.name = getattr(base_args, "name", None) + ns.file = getattr(base_args, "file", None) + domain = getattr(base_args, "domain", None) + ns.domain = None if domain == "auto" else domain + elif check_name == "outline": + ns.file = getattr(base_args, "file", None) + ns.detail = getattr(base_args, "detail", None) or "normal" + ns.all_files = getattr(base_args, "all_files", False) + ns.limit = getattr(base_args, "limit", None) or 20 + ns.offset = getattr(base_args, "offset", 0) + elif check_name == "trace": + ns.name = getattr(base_args, "name", None) + ns.direction = getattr(base_args, "direction", None) or "up" + ns.depth = getattr(base_args, "depth", None) or 10 + ns.domain = getattr(base_args, "domain", None) + ns.limit = getattr(base_args, "limit", None) or 20 + ns.offset = getattr(base_args, "offset", 0) + ns.max_results = 1000 + ns.use_graph = True + elif check_name == "orient": + # orient reads top via getattr; reuse the base value if set. + pass # ns.top already set above via carry-over + return ns def execute(args, workspace): - """Execute the deprecated context command. - - Prints a deprecation warning to stderr, then delegates to - ``query``'s handler with the same args. - - Args: - args: Parsed argparse namespace (``name``, ``workspace``, ...). - workspace: Resolved workspace root path. + """Run one or more context sub-analyses and merge results. - Returns: - Dict with the query result. + @FLOW: CONTEXT_DISPATCH + @CALLS: _parse_checks() -> List[str] + _build_namespace() -> argparse.Namespace + commands..execute() -> dict per sub + @MUTATES: nothing (read-only) """ - print(_DEPRECATION_WARNING, file=sys.stderr, end="") - - # Lazy import to avoid module-load-order coupling between command - # modules (commands/__init__.py auto-imports all of them in sorted - # order, and ``context`` sorts before ``query``). - from commands import query as query_cmd - - # Normalize args for query compatibility. ``query.execute`` reads - # ``args.file`` directly and treats ``args.domain`` of None as - # "search both domains". ``context`` allowed ``--domain auto`` - # (default) which query does not understand — map it to None. - # query uses getattr-with-default for ``all``/``limit``/``fuzzy`` so - # those are safe, but ``file`` must exist as an attribute. - if not hasattr(args, "file"): - args.file = None - if getattr(args, "domain", None) == "auto": - args.domain = None - - return query_cmd.execute(args, workspace) + checks = _parse_checks(getattr(args, "check", None)) + results: List[Dict[str, Any]] = [] + stats: Dict[str, Any] = {"checks_run": 0, "checks_failed": 0} + + for check_name in checks: + spec = _CHECKS[check_name] + try: + mod = importlib.import_module(spec["module"]) + sub_args = _build_namespace(args, check_name) + # Special case: orient has its own text-mode printing; force json + # so the umbrella can merge it. + if check_name == "orient": + if not getattr(sub_args, "format", None): + sub_args.format = "json" + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + stats["checks_run"] += 1 + except Exception as exc: + stats["checks_failed"] += 1 + stats["checks_run"] += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print( + f"[CodeLens] context: --check {check_name} failed: {exc}", + file=sys.stderr, + ) + + return { + "s": "ok" if stats["checks_failed"] == 0 else "partial", + "st": { + "checks_requested": len(checks), + **stats, + }, + "r": results, + } register_command( "context", - "DEPRECATED — use `query` instead", + "Codebase & symbol context: orient (default) / outline / trace / context (issue #195)", add_args, execute, ) diff --git a/scripts/commands/css_deep.py b/scripts/commands/css_deep.py deleted file mode 100644 index f19d2d83..00000000 --- a/scripts/commands/css_deep.py +++ /dev/null @@ -1,20 +0,0 @@ -"""CSS-deep command — Deep CSS analysis (vars, keyframes, specificity).""" - -from cssdeep_engine import analyze_css_deep -from commands import register_command - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--severity", choices=["high", "medium", "low"], default=None, - help="Filter by severity") - parser.add_argument("--category", default=None, - help="Filter by category (unused_vars, orphan_keyframes, specificity_wars, duplicate_props, unused_media, z_index_abuse)") - - -def execute(args, workspace): - return analyze_css_deep(workspace, severity=args.severity, category=args.category) - - -register_command("css-deep", "Deep CSS analysis (vars, keyframes, specificity)", add_args, execute) diff --git a/scripts/commands/dashboard.py b/scripts/commands/dashboard.py index c8b3aeea..b9eca789 100644 --- a/scripts/commands/dashboard.py +++ b/scripts/commands/dashboard.py @@ -86,4 +86,10 @@ def on_modified(self, event): return result -register_command("dashboard", "Generate HTML visualization dashboard", add_args, execute) +register_command("dashboard", "Generate HTML visualization dashboard", add_args, execute, + +hidden=True, + +deprecated_alias_for='summary', + +) diff --git a/scripts/commands/dataflow.py b/scripts/commands/dataflow.py index ea9f939e..7f2881c0 100644 --- a/scripts/commands/dataflow.py +++ b/scripts/commands/dataflow.py @@ -223,4 +223,7 @@ def _generate_actionable_items(result): register_command("dataflow", "Trace data flow source→sink with cross-file call graph analysis", - add_args, execute) + add_args, execute, + hidden=True, + deprecated_alias_for='impact', + ) diff --git a/scripts/commands/dead_code.py b/scripts/commands/dead_code.py index f43a0b2f..b4876bb8 100644 --- a/scripts/commands/dead_code.py +++ b/scripts/commands/dead_code.py @@ -85,4 +85,10 @@ def execute(args, workspace): return result -register_command("dead-code", "Enhanced dead code detection", add_args, execute) +register_command("dead-code", "Enhanced dead code detection", add_args, execute, + +hidden=True, + +deprecated_alias_for='audit', + +) diff --git a/scripts/commands/debug_leak.py b/scripts/commands/debug_leak.py deleted file mode 100644 index fe500b87..00000000 --- a/scripts/commands/debug_leak.py +++ /dev/null @@ -1,24 +0,0 @@ -"""Debug-leak command — Detect leftover debug code.""" - -from debugleak_engine import detect_debug_leaks -from commands import register_command - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--category", choices=["console_log", "print_statement", "debugger", - "todo_fixme", "commented_code", "test_skip", "mock_data", "dev_only"], - default=None, help="Filter by leak category") - parser.add_argument("--max-files", type=int, default=None, - help="Max files to scan (default: 3000)") - - -def execute(args, workspace): - kwargs = {"category": args.category} - if args.max_files is not None: - kwargs["max_files"] = args.max_files - return detect_debug_leaks(workspace, **kwargs) - - -register_command("debug-leak", "Detect leftover debug code", add_args, execute) diff --git a/scripts/commands/dependents.py b/scripts/commands/dependents.py index ea35cced..e877538c 100644 --- a/scripts/commands/dependents.py +++ b/scripts/commands/dependents.py @@ -43,4 +43,10 @@ def execute(args, workspace): return get_dependents(file_path, workspace, depth=args.depth) -register_command("dependents", "Module-level import tracking", add_args, execute) +register_command("dependents", "Module-level import tracking", add_args, execute, + +hidden=True, + +deprecated_alias_for='deps', + +) diff --git a/scripts/commands/deps.py b/scripts/commands/deps.py new file mode 100644 index 00000000..21f6fa79 --- /dev/null +++ b/scripts/commands/deps.py @@ -0,0 +1,217 @@ +"""deps command — dependency graph intelligence (issue #195 consolidation). + +Umbrella command that absorbs: + - affected (which test files are affected by source changes) + - dependents (module-level import tracking) + - circular (circular dependency detection) + - import-snapshot (import .codelens.gz snapshot into the graph DB) + +Usage: + codelens deps # run all checks + codelens deps --check circular # only circular + codelens deps --check affected,dependents + codelens deps --check import-snapshot --input path.codelens.gz + codelens deps auth/foo.ts --check affected # symbol-aware mode + +Output (compact / json): ``{"s":"ok", "st":{...}, "r":[...]}`` shape with +one entry per requested check under ``r`` and aggregate stats under ``st``. +""" + +# @WHO: scripts/commands/deps.py +# @WHAT: Umbrella command for dependency-graph intelligence. +# @PART: commands +# @ENTRY: execute() + +import argparse +import os +import sys +from typing import Any, Dict, List + +from commands import register_command + + +# Map each --check category to (module_path, execute_attr, required_args). +# ``required_args`` is a function(args) -> dict of namespace attributes that +# MUST be present on the synthetic namespace before delegating. +_CHECKS = { + "affected": { + "module": "commands.affected", + "help": "Identify test files affected by source changes", + }, + "dependents": { + "module": "commands.dependents", + "help": "Module-level import tracking", + }, + "circular": { + "module": "commands.circular", + "help": "Detect circular dependencies", + }, + "import-snapshot": { + "module": "commands.import_snapshot", + "help": "Import a .codelens.gz snapshot into the graph DB", + }, +} + +ALL_CHECKS = list(_CHECKS.keys()) + + +def add_args(parser): + """Add deps-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Sub-analyses (issue #195):\n" + " affected Test files affected by source changes\n" + " dependents Module-level import tracking\n" + " circular Circular dependency detection\n" + " import-snapshot Import .codelens.gz into graph DB\n" + "\n" + "Examples:\n" + " codelens deps . # all checks\n" + " codelens deps . --check circular # only circular\n" + " codelens deps . --check affected,dependents # pick subset\n" + " codelens deps . --check import-snapshot --input s.codelens.gz\n" + ) + parser.add_argument("workspace", nargs="?", default=None, + help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help=f"Comma-separated sub-analyses to run. " + f"Choices: {', '.join(ALL_CHECKS)}. " + f"Default: run all.") + # Sub-command specific flags — passed through to the delegated executor. + parser.add_argument("--files", nargs="*", default=None, + help="affected: source files to analyze") + parser.add_argument("--depth", type=int, default=None, + help="affected/dependents: traversal depth") + parser.add_argument("--filter", default=None, + help="affected: glob filter for test files") + parser.add_argument("--include-source", action="store_true", default=False, + help="affected: include source dependents in output") + parser.add_argument("--direction", default=None, + help="dependents: dependents|dependencies|graph") + parser.add_argument("--domain", default=None, + help="circular: backend|imports|css|all") + parser.add_argument("--max-cycles", type=int, default=None, + help="circular: max cycles per type") + parser.add_argument("--input", default=None, + help="import-snapshot: path to .codelens.gz file") + parser.add_argument("--merge", action="store_true", default=False, + help="import-snapshot: deduplicate with existing graph") + parser.add_argument("--db-path", default=None, + help="Custom SQLite database path") + + +def _parse_checks(check_arg: str) -> List[str]: + """Parse --check argument into a list of valid check names.""" + if not check_arg: + return list(ALL_CHECKS) + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _CHECKS] + if invalid: + print( + f"[CodeLens] deps: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(ALL_CHECKS)}", + file=sys.stderr, + ) + sys.exit(1) + return parts or list(ALL_CHECKS) + + +def _build_namespace(base_args, check_name: str) -> argparse.Namespace: + """Build a synthetic argparse.Namespace for the delegated sub-command. + + Only attributes the sub-command's ``add_args``/``execute`` reads are set; + everything else falls back to ``None``/``False`` via ``getattr`` in the + sub-command code. + """ + ns = argparse.Namespace() + # Carry over global flags the sub-commands may read. + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + # Workspace + check-specific passthroughs. + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "affected": + ns.files = getattr(base_args, "files", None) or [] + ns.depth = getattr(base_args, "depth", None) or 5 + ns.filter = getattr(base_args, "filter", None) + ns.include_source = getattr(base_args, "include_source", False) + ns.as_json = True # always JSON; main formatter handles --format + ns.quiet = False + ns.stdin = False + elif check_name == "dependents": + ns.file = (getattr(base_args, "files", None) or [None])[0] + ns.direction = getattr(base_args, "direction", None) or "dependents" + ns.depth = getattr(base_args, "depth", None) or 3 + elif check_name == "circular": + ns.domain = getattr(base_args, "domain", None) or "all" + ns.max_cycles = getattr(base_args, "max_cycles", None) + if ns.max_cycles is None: + # Defer default to the engine — load lazily. + try: + from circular_engine import MAX_CYCLES_PER_TYPE + ns.max_cycles = MAX_CYCLES_PER_TYPE + except Exception: + ns.max_cycles = 50 + elif check_name == "import-snapshot": + ns.input = getattr(base_args, "input", None) + ns.merge = getattr(base_args, "merge", False) + ns.db_path = getattr(base_args, "db_path", None) + return ns + + +def execute(args, workspace): + """Run one or more dependency-graph checks and merge results. + + @FLOW: DEPS_DISPATCH + @CALLS: _parse_checks() -> List[str] + _build_namespace() -> argparse.Namespace + commands..execute() -> dict per sub + @MUTATES: graph DB (only import-snapshot writes) + """ + checks = _parse_checks(getattr(args, "check", None)) + results: List[Dict[str, Any]] = [] + stats: Dict[str, Any] = {"checks_run": 0, "checks_failed": 0} + + for check_name in checks: + spec = _CHECKS[check_name] + try: + import importlib + mod = importlib.import_module(spec["module"]) + sub_args = _build_namespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + stats["checks_run"] += 1 + except Exception as exc: + stats["checks_failed"] += 1 + stats["checks_run"] += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print( + f"[CodeLens] deps: --check {check_name} failed: {exc}", + file=sys.stderr, + ) + + return { + "s": "ok" if stats["checks_failed"] == 0 else "partial", + "st": { + "checks_requested": len(checks), + **stats, + }, + "r": results, + } + + +register_command( + "deps", + "Dependency-graph intelligence: affected / dependents / circular / import-snapshot", + add_args, + execute, +) diff --git a/scripts/commands/deps_audit.py b/scripts/commands/deps_audit.py index 97d87605..7868e823 100644 --- a/scripts/commands/deps_audit.py +++ b/scripts/commands/deps_audit.py @@ -49,4 +49,5 @@ def execute(args, workspace): "Scan dependencies for known CVEs via OSV.dev (PyPI/npm/crates.io)", add_args, execute, +hidden=True, ) diff --git a/scripts/commands/detect.py b/scripts/commands/detect.py deleted file mode 100644 index b94215ef..00000000 --- a/scripts/commands/detect.py +++ /dev/null @@ -1,29 +0,0 @@ -"""Detect command — Detect frameworks and show recommended config.""" - -import os -from typing import Dict, Any - -from framework_detect import detect_frameworks -from commands import register_command - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - - -def execute(args, workspace): - return cmd_detect(workspace) - - -def cmd_detect(workspace: str) -> Dict[str, Any]: - """Detect frameworks and show recommended config.""" - workspace = os.path.abspath(workspace) - result = detect_frameworks(workspace) - # Ensure status field is present - if isinstance(result, dict) and "status" not in result: - result["status"] = "ok" - return result - - -register_command("detect", "Detect frameworks in workspace", add_args, execute) diff --git a/scripts/commands/diff.py b/scripts/commands/diff.py index 14b2358a..6e7b25dd 100644 --- a/scripts/commands/diff.py +++ b/scripts/commands/diff.py @@ -204,5 +204,7 @@ def cmd_diff_git_aware(workspace: str) -> Dict[str, Any]: "Compare registry snapshots (--git-aware for git-diff delta + impact)", add_args, execute, +hidden=True, +deprecated_alias_for='impact', ) diff --git a/scripts/commands/doctor.py b/scripts/commands/doctor.py index 2d8073f6..00f3e78b 100644 --- a/scripts/commands/doctor.py +++ b/scripts/commands/doctor.py @@ -561,6 +561,9 @@ def add_args(parser): dispatcher auto-detects from cwd (same pattern as ``scan``, ``query``, etc.). doctor uses it to check ``.codelens/`` writability. + + Issue #195: ``--check`` dispatches to absorbed sub-commands + (env-check, lsp-status). Without --check, runs the legacy doctor audit. """ parser.add_argument( "workspace", @@ -568,31 +571,157 @@ def add_args(parser): default=None, help="Path to workspace root (auto-detected if omitted)", ) + parser.add_argument( + "--check", + default=None, + help="Issue #195: comma-separated sub-analyses. " + "Choices: doctor, env-check, lsp-status. Default: doctor.", + ) parser.add_argument( "--fix", action="store_true", default=False, - help="Auto-install missing Python deps via 'pip install --user'", + help="doctor: auto-install missing Python deps via 'pip install --user'", ) parser.add_argument( "--verbose", action="store_true", default=False, - help="Show resolved versions and install paths for every check", + help="doctor: show resolved versions and install paths for every check", ) parser.add_argument( "--format", choices=["text", "json"], default="text", - help="Output format (default: text). 'json' for CI parsing.", + help="doctor: output format (default: text). 'json' for CI parsing.", + ) + # env-check passthrough + parser.add_argument( + "--var", + dest="var_name", + default=None, + help="env-check: filter to a specific environment variable name", ) # The global --format flag from codelens.py also works; we honor # whichever the user set. The local one wins if both are present. +# Issue #195: sub-command dispatch table for the doctor umbrella. +_DOCTOR_SUBCOMMANDS = { + "doctor": None, # handled inline + "env-check": "commands.env_check", + "lsp-status": "commands.lsp_status", # kept as utility module +} + + +def _dispatch_subcommands(args, workspace, check_arg): + """Dispatch to one or more absorbed sub-commands per --check.""" + import importlib as _il + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _DOCTOR_SUBCOMMANDS] + if invalid: + import sys as _sys + print( + f"[CodeLens] doctor: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(_DOCTOR_SUBCOMMANDS.keys())}", + file=_sys.stderr, + ) + _sys.exit(1) + if not parts: + parts = ["doctor"] + + results = [] + checks_failed = 0 + for check_name in parts: + try: + if check_name == "doctor": + # Run the legacy doctor logic. Force JSON output so the + # umbrella can merge it into the unified result shape. + args.format = "json" + sub_result = _run_legacy_doctor(args, workspace) + else: + mod = _il.import_module(_DOCTOR_SUBCOMMANDS[check_name]) + sub_args = _build_subnamespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + except Exception as exc: + checks_failed += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + import sys as _sys + print(f"[CodeLens] doctor: --check {check_name} failed: {exc}", + file=_sys.stderr) + + return { + "s": "ok" if checks_failed == 0 else "partial", + "st": {"checks_requested": len(parts), "checks_failed": checks_failed}, + "r": results, + } + + +def _build_subnamespace(base_args, check_name): + """Build a synthetic namespace for the dispatched sub-command.""" + import argparse as _ap + ns = _ap.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "env-check": + ns.var_name = getattr(base_args, "var_name", None) + return ns + + +def _run_legacy_doctor(args, workspace): + """Run the original doctor.execute logic (issue #195: absorbed).""" + verbose = bool(getattr(args, "verbose", False)) + do_fix = bool(getattr(args, "fix", False)) + fmt = getattr(args, "format", None) + if fmt not in ("text", "json"): + fmt = "text" + checks = _run_all_checks(workspace) + fixes = [] + if do_fix: + fixes = _apply_fixes(checks, verbose) + checks = _run_all_checks(workspace) + overall, exit_code = _aggregate_status(checks) + summary = { + "ok": sum(1 for c in checks if c["status"] == "ok"), + "warning": sum(1 for c in checks if c["status"] == "warning"), + "critical": sum(1 for c in checks if c["status"] == "critical"), + "total": len(checks), + } + result = { + "status": overall, + "exit_code": exit_code, + "checks": checks, + "fixes": fixes, + "summary": summary, + "platform": { + "python": sys.version.split()[0], + "platform": platform.platform(), + "machine": platform.machine(), + "executable": sys.executable, + }, + "workspace": workspace, + } + return result + + def execute(args, workspace): """Run the environment audit, optionally apply fixes, return result dict. + Issue #195: when --check is set, dispatch to absorbed sub-commands + (env-check, lsp-status) and merge results into the umbrella shape. + The result always includes: * ``status`` — "ok" | "warning" | "critical" @@ -602,6 +731,11 @@ def execute(args, workspace): * ``summary`` — counts by status * ``platform`` — OS / arch / Python interpreter info """ + # Issue #195: dispatch to absorbed sub-commands when --check is set. + check_arg = getattr(args, "check", None) + if check_arg: + return _dispatch_subcommands(args, workspace, check_arg) + verbose = bool(getattr(args, "verbose", False)) do_fix = bool(getattr(args, "fix", False)) diff --git a/scripts/commands/entrypoints.py b/scripts/commands/entrypoints.py index acae94df..e0a6102e 100644 --- a/scripts/commands/entrypoints.py +++ b/scripts/commands/entrypoints.py @@ -23,4 +23,8 @@ def execute(args, workspace): max_files=args.max_files) -register_command("entrypoints", "Map execution entry points", add_args, execute) +register_command("entrypoints", "Map execution entry points", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/env_check.py b/scripts/commands/env_check.py index 2ca36538..a7c21b42 100644 --- a/scripts/commands/env_check.py +++ b/scripts/commands/env_check.py @@ -15,4 +15,10 @@ def execute(args, workspace): return check_env_vars(workspace, var_name=args.var_name) -register_command("env-check", "Audit environment variables", add_args, execute) +register_command("env-check", "Audit environment variables", add_args, execute, + +hidden=True, + +deprecated_alias_for='doctor', + +) diff --git a/scripts/commands/export_snapshot.py b/scripts/commands/export_snapshot.py deleted file mode 100644 index 74082f86..00000000 --- a/scripts/commands/export_snapshot.py +++ /dev/null @@ -1,168 +0,0 @@ -"""Export-snapshot command — Export the CodeLens graph as a compressed archive. - -Issue #12: each developer previously had to ``codelens scan`` separately. -``export-snapshot`` packages the SQLite graph tables (``graph_nodes``, -``graph_edges``, ``symbols``, ``refs``, ``files``) as a gzip-compressed -archive that can be committed to the repo and shared with the team via -``codelens import-snapshot``. - -The snapshot contains **graph metadata only** — file paths, symbol -names/kinds/line spans, edge relationships, content hashes, timestamps. -It NEVER contains file content. - -Usage:: - - codelens export-snapshot [workspace] [--output path] [--db-path path] - -Example output (the human-readable message is also embedded in the JSON -result under the ``message`` key and printed to stderr so stdout stays -machine-parseable):: - - Snapshot exported: .codelens/snapshot.codelens.gz (1.2 MB) -""" - -import os -import sys -from typing import Any, Dict, Optional - -from commands import register_command -from utils import default_db_path, logger -from snapshot_io import ( - DEFAULT_SNAPSHOT_FILENAME, - SNAPSHOT_TABLES, - build_snapshot, - default_snapshot_path, - format_size, - write_snapshot, -) - - -def add_args(parser): - """Add export-snapshot arguments to the parser.""" - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--output", default=None, - help="Output path for the snapshot archive " - "(default: .codelens/snapshot.codelens.gz)") - parser.add_argument("--db-path", default=None, - help="Custom path for the source SQLite database file") - - -def execute(args, workspace): - """Execute the export-snapshot command.""" - output = getattr(args, "output", None) - db_path = getattr(args, "db_path", None) - return cmd_export_snapshot(workspace, output_path=output, db_path=db_path) - - -def cmd_export_snapshot( - workspace: str, - output_path: Optional[str] = None, - db_path: Optional[str] = None, -) -> Dict[str, Any]: - """Export the CodeLens graph database to a compressed snapshot archive. - - Reads the SQLite graph tables (graph_nodes, graph_edges, symbols, - refs, files) and writes them as gzip-compressed JSON to - ``output_path`` (default: ``/.codelens/snapshot.codelens.gz``). - - Args: - workspace: Path to the workspace root. - output_path: Optional explicit output path. If None, defaults to - ``/.codelens/snapshot.codelens.gz``. - db_path: Optional source SQLite db path. Defaults to - ``/.codelens/codelens.db``. - - Returns: - Dict with keys: ``status``, ``message``, ``snapshot_path``, - ``size_bytes``, ``size_human``, ``header``, ``workspace``. - On error: ``status="error"`` with an ``error`` message. - """ - workspace = os.path.abspath(workspace) - - # Resolve source db path - effective_db = db_path or default_db_path(workspace) - - # Resolve output path. If the user gave a relative path, resolve it - # against the workspace root so the default ``.codelens/...`` form - # works regardless of cwd. - if output_path: - effective_output = output_path if os.path.isabs(output_path) \ - else os.path.join(workspace, output_path) - else: - effective_output = default_snapshot_path(workspace) - - # Build the snapshot (reads the DB, raises FileNotFoundError if absent) - try: - snapshot = build_snapshot(workspace, db_path=effective_db) - except FileNotFoundError as exc: - return { - "status": "error", - "error": str(exc), - "workspace": workspace, - } - except Exception as exc: # pragma: no cover - defensive - logger.error(f"export-snapshot: build failed: {exc}", exc_info=True) - return { - "status": "error", - "error": f"Failed to build snapshot: {exc}", - "workspace": workspace, - } - - header = snapshot.get("header", {}) - - # Write gzip-compressed JSON to disk - try: - size_bytes = write_snapshot(snapshot, effective_output) - except OSError as exc: - return { - "status": "error", - "error": f"Failed to write snapshot to {effective_output}: {exc}", - "workspace": workspace, - } - - size_human = format_size(size_bytes) - - # Human-readable message — also surfaced in the JSON result so - # scripted consumers can read it. Matches the issue #12 format: - # "Snapshot exported: .codelens/snapshot.codelens.gz (1.2 MB)" - # Use a workspace-relative path in the message when possible so the - # message is portable across machines (the snapshot is meant to be - # committed to the repo and shared). - try: - rel_output = os.path.relpath(effective_output, workspace) - # If the output lives outside the workspace, relpath produces - # something with '..' — fall back to the absolute path in that case. - if rel_output.startswith(".."): - display_path = effective_output - else: - display_path = rel_output - except (ValueError, OSError): - display_path = effective_output - - message = f"Snapshot exported: {display_path} ({size_human})" - - # Print the message to stderr so stdout (JSON) stays machine-clean, - # matching the convention used by scan's status messages. - print(message, file=sys.stderr) - - return { - "status": "ok", - "message": message, - "snapshot_path": effective_output, - "display_path": display_path, - "size_bytes": size_bytes, - "size_human": size_human, - "header": header, - "workspace": workspace, - "tables": list(SNAPSHOT_TABLES), - } - - -register_command( - "export-snapshot", - "Export the CodeLens graph as a compressed snapshot archive (.codelens.gz) " - "for team sharing (issue #12)", - add_args, - execute, -) diff --git a/scripts/commands/fix.py b/scripts/commands/fix.py deleted file mode 100644 index d10f442c..00000000 --- a/scripts/commands/fix.py +++ /dev/null @@ -1,57 +0,0 @@ -"""CodeLens fix command — Auto-fix issues with confidence scoring.""" - -import sys -import os - -SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) -sys.path.insert(0, SCRIPT_DIR) - -from commands import register_command - - -def add_args(parser): - parser.add_argument('--categories', nargs='+', - choices=['secrets_mask', 'dead_code', 'debug_leak', 'import_cleanup', 'todo_fixme'], - default=None, - help='Fix categories to apply (default: all)') - parser.add_argument('--dry-run', action='store_true', default=True, - help='Show what would be changed without modifying files (default)') - parser.add_argument('--apply', dest='dry_run', action='store_false', - help='Actually apply the fixes') - parser.add_argument('--min-confidence', type=float, default=0.5, - help='Minimum confidence threshold (0-1, default: 0.5)') - parser.add_argument('--max-risk', choices=['safe', 'moderate', 'risky', 'dangerous'], - default='risky', - help='Maximum risk level to apply (default: risky)') - parser.add_argument('--max-fixes', type=int, default=50, - help='Maximum number of fixes to apply (default: 50)') - - -def execute(args, workspace): - from autofix_engine import run_autofix, RISK_SAFE, RISK_MODERATE, RISK_RISKY, RISK_DANGEROUS - - risk_map = { - 'safe': RISK_SAFE, - 'moderate': RISK_MODERATE, - 'risky': RISK_RISKY, - 'dangerous': RISK_DANGEROUS, - } - - result = run_autofix( - workspace=workspace, - categories=args.categories, - min_confidence=args.min_confidence, - max_risk=risk_map.get(args.max_risk, RISK_RISKY), - dry_run=args.dry_run, - max_fixes=args.max_fixes, - ) - - return result - - -register_command( - 'fix', - 'Auto-fix issues with confidence scoring (dry-run by default)', - add_args, - execute, -) diff --git a/scripts/commands/git_status.py b/scripts/commands/git_status.py index 3a7b08c4..600cf3f5 100644 --- a/scripts/commands/git_status.py +++ b/scripts/commands/git_status.py @@ -134,4 +134,6 @@ def cmd_git_status(workspace: str) -> Dict[str, Any]: "Show git-aware scan state (SHA, branch, changed files, rescan recommendation)", add_args, execute, +hidden=True, +deprecated_alias_for='history', ) diff --git a/scripts/commands/graph.py b/scripts/commands/graph.py new file mode 100644 index 00000000..82c2f08f --- /dev/null +++ b/scripts/commands/graph.py @@ -0,0 +1,85 @@ +"""graph command — raw Cypher graph query for power users (issue #195). + +This is the power-user entry point that wraps the same query-graph engine +as ``search --mode graph``, but defaults to raw Cypher pass-through with +no niceties. Casual callers should prefer ``search --mode graph``. + +Usage: + codelens graph "MATCH (n:Function) WHERE n.id CONTAINS 'auth' RETURN n LIMIT 10" + codelens graph "MATCH (n)-[r:CALLS]->(m) RETURN n.id, m.id LIMIT 50" + codelens graph "MATCH (n) WHERE n.id CONTAINS x" --validate + +Output: ``{"s":"ok", "st":{...}, "r":[...]}`` shape (rows under ``r``). +""" + +# @WHO: scripts/commands/graph.py +# @WHAT: Raw Cypher-subset graph query (power-user mode). +# @PART: commands +# @ENTRY: execute() + +import argparse +from typing import Any, Dict + +from commands import register_command + + +def add_args(parser): + """Add graph-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Raw Cypher-subset query (issue #195):\n" + " Supported clauses: MATCH, WHERE, RETURN, LIMIT, EXISTS\n" + " Node types: Function, Class, Module, File, Route, Variable, ...\n" + " Edge types: CALLS, IMPORTS, DEFINES, REFERENCES, CONTAINS\n" + "\n" + "Examples:\n" + " codelens graph \"MATCH (n:Function) WHERE n.id CONTAINS 'auth' RETURN n LIMIT 10\"\n" + " codelens graph \"MATCH (n)-[r:CALLS]->(m) RETURN n.id, m.id LIMIT 50\"\n" + " codelens graph \"MATCH (n) WHERE n.id CONTAINS x\" --validate\n" + "\n" + "Casual callers should prefer ``codelens search --mode graph``." + ) + parser.add_argument("query", help="Cypher-subset query string") + parser.add_argument("workspace", nargs="?", default=None, + help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--limit", type=int, default=None, + help="Row cap (appended as LIMIT if not already present)") + parser.add_argument("--validate", action="store_true", default=False, + help="Validate the query without executing it") + parser.add_argument("--db-path", default=None, + help="Custom SQLite database path") + + +def execute(args, workspace): + """Execute a raw Cypher-subset query against the graph DB. + + @FLOW: GRAPH_QUERY + @CALLS: commands.query_graph.execute() -> dict + @MUTATES: nothing (read-only) + """ + # Delegate to the existing query-graph executor — it already handles + # LIMIT injection, --validate, and --db-path. ``graph`` is the bare + # power-user surface; ``search --mode graph`` is the friendly wrapper. + from commands.query_graph import execute as _qg_execute + result = _qg_execute(args, workspace) + # Re-shape to the umbrella-consistent {s, st, r} form. + if isinstance(result, dict) and "s" in result: + return result + if not isinstance(result, dict): + return {"s": "ok", "st": {}, "r": [{"result": result}]} + # query-graph returns {"status": "ok", "rows": [...], "query": ..., ...} + rows = result.pop("rows", None) + new_result = { + "s": result.pop("status", "ok"), + "st": result, + "r": rows if rows is not None else [], + } + return new_result + + +register_command( + "graph", + "Raw Cypher-subset graph query (power-user; casual callers use `search --mode graph`)", + add_args, + execute, +) diff --git a/scripts/commands/graph_schema.py b/scripts/commands/graph_schema.py index f892912d..a2b8a93e 100644 --- a/scripts/commands/graph_schema.py +++ b/scripts/commands/graph_schema.py @@ -124,4 +124,6 @@ def execute(args, workspace): "Return the shape of the code graph (node/edge counts, type distribution, indexes)", add_args, execute, +hidden=True, +deprecated_alias_for='api-map', ) diff --git a/scripts/commands/guard.py b/scripts/commands/guard.py deleted file mode 100644 index 140c1890..00000000 --- a/scripts/commands/guard.py +++ /dev/null @@ -1,552 +0,0 @@ -"""CodeLens guard command — Pre/post-write verification for AI agents. - -This command provides real-time verification that AI agents can use before -and after making code changes. It integrates with the MCP server to provide -a "guard" mode that: - -1. Pre-write check: Verify the change is safe (no collisions, dead code references, etc.) -2. Post-write check: Verify the change didn't introduce new issues -3. Diff-aware analysis: Only analyze changed files for fast feedback -4. Persistent state: Track what the codebase looked like before the change - -This is the "killer feature" that no other code analysis tool has — -purpose-built integration for AI agent coding workflows. -""" - -import sys -import os -import json -import re -import time -import hashlib -from typing import Any, Dict, List, Optional - -SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) -sys.path.insert(0, SCRIPT_DIR) - -from commands import register_command - - -def add_args(parser): - sub = parser.add_subparsers(dest='guard_action', help='Guard action') - - # pre: Check before writing - pre = sub.add_parser('pre', help='Pre-write check — verify change is safe') - pre.add_argument('--file', required=True, help='File that will be modified') - pre.add_argument('--symbol', help='Symbol that will be added/modified/removed') - pre.add_argument('--action', choices=['create', 'modify', 'delete', 'rename'], - default='modify', help='Type of change (default: modify)') - - # post: Check after writing - post = sub.add_parser('post', help='Post-write check — verify no new issues') - post.add_argument('--file', required=True, help='File that was modified') - post.add_argument('--diff', help='Git-style diff of the changes') - - # snapshot: Save current state - snap = sub.add_parser('snapshot', help='Save a snapshot of current analysis state') - - # verify: Compare current state with snapshot - verify = sub.add_parser('verify', help='Verify codebase against saved snapshot') - verify.add_argument('--snapshot', help='Snapshot ID to compare against (default: latest)') - - -def execute(args, workspace): - action = getattr(args, 'guard_action', None) - - if action == 'pre': - return _pre_write_check(args, workspace) - elif action == 'post': - return _post_write_check(args, workspace) - elif action == 'snapshot': - return _save_snapshot(workspace) - elif action == 'verify': - return _verify_against_snapshot(args, workspace) - else: - return { - "status": "error", - "error": "Specify a guard action: pre, post, snapshot, verify", - "examples": [ - "codelens guard pre --file src/app.py --symbol my_func --action modify", - "codelens guard post --file src/app.py", - "codelens guard snapshot", - "codelens guard verify", - ] - } - - -def _pre_write_check(args, workspace) -> Dict[str, Any]: - """Check if a planned code change is safe. - - Verifies: - 1. The file exists (for modify/delete) - 2. No dead code references the symbol (would break) - 3. No circular dependency would be created - 4. The symbol doesn't already exist (for create) - 5. The symbol isn't in use elsewhere (for delete) - """ - target_file = args.file - symbol = args.symbol - action = args.action - issues = [] - warnings = [] - info = [] - - abs_path = os.path.join(workspace, target_file) if not os.path.isabs(target_file) else target_file - - # Check file existence - if action in ('modify', 'delete'): - if not os.path.exists(abs_path): - issues.append({ - "type": "file_not_found", - "message": f"File {target_file} does not exist — cannot {action}", - "severity": "critical", - }) - - if action == 'create': - if os.path.exists(abs_path): - warnings.append({ - "type": "file_exists", - "message": f"File {target_file} already exists — will overwrite", - "severity": "warning", - }) - - if symbol: - # Check registry for the symbol - try: - from registry import load_backend_registry - registry = load_backend_registry(workspace) - - if registry: - nodes = registry.get('nodes', {}) - - # Check if symbol exists - symbol_exists = False - symbol_refs = 0 - symbol_status = "unknown" - - for node_name, node_data in nodes.items(): - if isinstance(node_data, dict): - if node_name == symbol or symbol in node_name: - symbol_exists = True - symbol_refs = node_data.get('ref_count', 0) - symbol_status = node_data.get('status', 'unknown') - - if action == 'create': - if symbol_status == 'active': - issues.append({ - "type": "symbol_exists", - "message": f"Symbol '{symbol}' already exists and is active ({symbol_refs} refs)", - "severity": "critical", - }) - elif symbol_status == 'dead': - warnings.append({ - "type": "symbol_dead", - "message": f"Symbol '{symbol}' exists but is dead — safe to reuse", - "severity": "info", - }) - - elif action == 'delete': - if symbol_refs > 0: - issues.append({ - "type": "symbol_in_use", - "message": f"Symbol '{symbol}' has {symbol_refs} references — deleting will break them", - "severity": "critical", - "affected_refs": symbol_refs, - }) - - elif action == 'modify': - info.append({ - "type": "symbol_info", - "message": f"Symbol '{symbol}' is {symbol_status} with {symbol_refs} references", - "severity": "info", - }) - break - - if not symbol_exists and action == 'modify': - warnings.append({ - "type": "symbol_not_found", - "message": f"Symbol '{symbol}' not found in registry — new symbol?", - "severity": "warning", - }) - - except Exception as e: - warnings.append({ - "type": "registry_error", - "message": f"Could not check registry: {e}", - "severity": "warning", - }) - - # Check for potential circular dependencies - if symbol and action in ('create', 'modify'): - try: - from circular_engine import detect_cycles - result = detect_cycles(workspace) - cycles = result.get('cycles', []) - for cycle in cycles: - cycle_str = str(cycle) - if target_file in cycle_str: - warnings.append({ - "type": "circular_dep", - "message": f"File {target_file} is in a circular dependency cycle", - "severity": "warning", - "cycle": cycle, - }) - except Exception: - pass - - # Determine overall safety - safe = len(issues) == 0 - risk_level = "safe" if not issues and not warnings else \ - "moderate" if not issues else "dangerous" - - # Generate recommendations - recommendations = [] - if issues: - recommendations.append("STOP: Critical issues found — resolve before proceeding") - for issue in issues: - recommendations.append(f" → {issue['message']}") - elif warnings: - recommendations.append("CAUTION: Warnings found — review before proceeding") - for warning in warnings: - recommendations.append(f" → {warning['message']}") - else: - recommendations.append("GREEN: No issues detected — safe to proceed") - - return { - "status": "ok", - "action": action, - "file": target_file, - "symbol": symbol, - "safe": safe, - "risk_level": risk_level, - "issues": issues, - "warnings": warnings, - "info": info, - "recommendations": recommendations, - } - - -def _post_write_check(args, workspace) -> Dict[str, Any]: - """Check if a code change introduced new issues. - - Compares current analysis with the last snapshot to find: - 1. New dead code introduced - 2. New secrets leaked - 3. New circular dependencies - 4. New complexity issues - 5. New debug leaks - """ - target_file = args.file - new_issues = [] - resolved_issues = [] - persisting_issues = [] - - abs_path = os.path.join(workspace, target_file) if not os.path.isabs(target_file) else target_file - - # Run targeted analysis on the changed file - if not os.path.exists(abs_path): - return { - "status": "error", - "error": f"File {target_file} does not exist", - } - - # Load the last snapshot - snapshot_dir = os.path.join(workspace, ".codelens", "guard_snapshots") - latest_snapshot = _load_latest_snapshot(snapshot_dir) - - # Run analysis on the changed file - file_issues = _analyze_file(workspace, target_file) - - # Compare with snapshot - if latest_snapshot: - prev_file_issues = latest_snapshot.get('files', {}).get(target_file, {}).get('issues', []) - - prev_set = {(i.get('type', ''), i.get('category', ''), i.get('line', 0)) - for i in prev_file_issues} - curr_set = {(i.get('type', ''), i.get('category', ''), i.get('line', 0)) - for i in file_issues} - - new_keys = curr_set - prev_set - resolved_keys = prev_set - curr_set - persisting_keys = curr_set & prev_set - - for issue in file_issues: - key = (issue.get('type', ''), issue.get('category', ''), issue.get('line', 0)) - if key in new_keys: - new_issues.append(issue) - elif key in persisting_keys: - persisting_issues.append(issue) - - for issue in prev_file_issues: - key = (issue.get('type', ''), issue.get('category', ''), issue.get('line', 0)) - if key in resolved_keys: - resolved_issues.append(issue) - else: - # No snapshot — all issues are new - new_issues = file_issues - - # Determine if the change is clean - critical_new = [i for i in new_issues if i.get('severity') in ('critical', 'high')] - clean = len(critical_new) == 0 - - return { - "status": "ok", - "file": target_file, - "clean": clean, - "new_issues": new_issues, - "resolved_issues": resolved_issues, - "persisting_issues": persisting_issues, - "summary": { - "new": len(new_issues), - "resolved": len(resolved_issues), - "persisting": len(persisting_issues), - "critical_new": len(critical_new), - }, - "recommendations": _generate_post_recommendations(new_issues, resolved_issues), - } - - -def _save_snapshot(workspace) -> Dict[str, Any]: - """Save a snapshot of the current analysis state.""" - snapshot_dir = os.path.join(workspace, ".codelens", "guard_snapshots") - os.makedirs(snapshot_dir, exist_ok=True) - - timestamp = int(time.time()) - snapshot_id = f"snapshot_{timestamp}" - snapshot_file = os.path.join(snapshot_dir, f"{snapshot_id}.json") - - # Run quick analysis - file_data = {} - - # Scan workspace for source files - source_exts = {'.py', '.js', '.ts', '.tsx', '.jsx', '.rs', '.go', '.vue', '.svelte'} - for root, dirs, files in os.walk(workspace): - dirs[:] = [d for d in dirs if not d.startswith('.') and d not in - ('node_modules', '__pycache__', '.codelens', 'venv', '.venv', 'dist', 'build')] - for f in files: - ext = os.path.splitext(f)[1].lower() - if ext in source_exts: - rel_path = os.path.relpath(os.path.join(root, f), workspace) - file_data[rel_path] = { - "issues": _analyze_file(workspace, rel_path), - "timestamp": timestamp, - } - - snapshot = { - "id": snapshot_id, - "timestamp": timestamp, - "files": file_data, - "total_files": len(file_data), - "total_issues": sum(len(v.get('issues', [])) for v in file_data.values()), - } - - try: - with open(snapshot_file, 'w', encoding='utf-8') as f: - json.dump(snapshot, f, indent=2, ensure_ascii=False) - except (IOError, OSError) as e: - return {"status": "error", "error": f"Failed to save snapshot: {e}"} - - # Keep only last 10 snapshots - _cleanup_old_snapshots(snapshot_dir, keep=10) - - return { - "status": "ok", - "snapshot_id": snapshot_id, - "files_snapshotted": len(file_data), - "total_issues_captured": snapshot["total_issues"], - "snapshot_file": snapshot_file, - } - - -def _verify_against_snapshot(args, workspace) -> Dict[str, Any]: - """Verify current codebase against a saved snapshot.""" - snapshot_dir = os.path.join(workspace, ".codelens", "guard_snapshots") - - if not os.path.isdir(snapshot_dir): - return { - "status": "error", - "error": "No snapshots found. Run 'codelens guard snapshot' first.", - } - - # Load the specified or latest snapshot - snapshot = None - if args and args.snapshot: - snapshot_file = os.path.join(snapshot_dir, f"{args.snapshot}.json") - if os.path.exists(snapshot_file): - with open(snapshot_file, 'r') as f: - snapshot = json.load(f) - else: - snapshot = _load_latest_snapshot(snapshot_dir) - - if not snapshot: - return { - "status": "error", - "error": "No snapshot found.", - } - - # Compare current state with snapshot - new_issues = [] - resolved_issues = [] - changed_files = [] - - for rel_path, snap_data in snapshot.get('files', {}).items(): - current_issues = _analyze_file(workspace, rel_path) - - prev_set = {(i.get('type', ''), i.get('category', ''), i.get('line', 0)) - for i in snap_data.get('issues', [])} - curr_set = {(i.get('type', ''), i.get('category', ''), i.get('line', 0)) - for i in current_issues} - - new_keys = curr_set - prev_set - resolved_keys = prev_set - curr_set - - if new_keys or resolved_keys: - changed_files.append(rel_path) - - for issue in current_issues: - key = (issue.get('type', ''), issue.get('category', ''), issue.get('line', 0)) - if key in new_keys: - new_issues.append(issue) - - for issue in snap_data.get('issues', []): - key = (issue.get('type', ''), issue.get('category', ''), issue.get('line', 0)) - if key in resolved_keys: - resolved_issues.append(issue) - - return { - "status": "ok", - "snapshot_id": snapshot.get('id', 'unknown'), - "changed_files": changed_files, - "new_issues": len(new_issues), - "resolved_issues": len(resolved_issues), - "new_issues_detail": new_issues[:20], - "resolved_issues_detail": resolved_issues[:20], - "clean": len([i for i in new_issues if i.get('severity') in ('critical', 'high')]) == 0, - } - - -def _analyze_file(workspace: str, rel_path: str) -> List[Dict]: - """Run quick analysis on a single file and return issues.""" - issues = [] - - abs_path = os.path.join(workspace, rel_path) - if not os.path.exists(abs_path): - return issues - - # Issue #58, Phase 1: validate the agent-supplied path stays inside - # the workspace before reading. ``--file`` on `guard pre` / `guard - # post` is the most direct agent-controlled file-read surface, so - # it's the highest-leverage place to enforce path confinement. - # ``safe_read_file_within_project`` returns None on refusal (and - # logs the refusal at WARNING level), which preserves the legacy - # "empty issues list" behavior for unreadable files. - from utils import safe_read_file_within_project - content = safe_read_file_within_project(abs_path, workspace) - if not content: - return issues - - lines = content.split('\n') - ext = os.path.splitext(rel_path)[1].lower() - - # Quick pattern-based checks (no engine overhead) - for line_no, line in enumerate(lines, 1): - stripped = line.strip() - - # Secrets check - if ext in ('.py', '.js', '.ts', '.env'): - if re.search(r'(?:password|secret|api_key|token)\s*[=:]\s*["\'][^"\']{8,}["\']', stripped, re.I): - issues.append({ - "type": "secret", - "category": "hardcoded_secret", - "line": line_no, - "severity": "high", - "message": f"Potential hardcoded secret on line {line_no}", - }) - - # Debug leak check - if ext in ('.js', '.ts', '.tsx', '.jsx'): - if re.search(r'console\.(log|debug|info)\s*\(', stripped): - issues.append({ - "type": "debug_leak", - "category": "console_log", - "line": line_no, - "severity": "low", - "message": f"console.log on line {line_no}", - }) - elif ext == '.py': - if re.search(r'^\s*print\s*\(', stripped) and '__main__' not in content: - issues.append({ - "type": "debug_leak", - "category": "print_statement", - "line": line_no, - "severity": "low", - "message": f"print() statement on line {line_no}", - }) - - # TODO/FIXME check - if re.search(r'#\s*(TODO|FIXME|HACK|XXX)', stripped, re.I): - issues.append({ - "type": "todo_fixme", - "category": "todo_fixme", - "line": line_no, - "severity": "info", - "message": f"TODO/FIXME marker on line {line_no}", - }) - - return issues - - -def _load_latest_snapshot(snapshot_dir: str) -> Optional[Dict]: - """Load the most recent snapshot.""" - if not os.path.isdir(snapshot_dir): - return None - - snapshots = sorted([f for f in os.listdir(snapshot_dir) if f.endswith('.json')], - reverse=True) - if not snapshots: - return None - - latest = os.path.join(snapshot_dir, snapshots[0]) - try: - with open(latest, 'r') as f: - return json.load(f) - except (IOError, json.JSONDecodeError): - return None - - -def _cleanup_old_snapshots(snapshot_dir: str, keep: int = 10): - """Remove old snapshots, keeping only the N most recent.""" - if not os.path.isdir(snapshot_dir): - return - - snapshots = sorted([f for f in os.listdir(snapshot_dir) if f.endswith('.json')]) - while len(snapshots) > keep: - os.remove(os.path.join(snapshot_dir, snapshots.pop(0))) - - -def _generate_post_recommendations(new_issues: List[Dict], - resolved_issues: List[Dict]) -> List[str]: - """Generate recommendations from post-write check.""" - recs = [] - - critical = [i for i in new_issues if i.get('severity') in ('critical', 'high')] - if critical: - recs.append(f"URGENT: {len(critical)} new critical/high issues introduced") - for c in critical[:3]: - recs.append(f" → {c.get('message', 'Unknown issue')}") - - if resolved_issues: - recs.append(f"Good: {len(resolved_issues)} issues resolved by this change") - - if not new_issues: - recs.append("Clean change: no new issues introduced") - - return recs[:10] - - -register_command( - 'guard', - 'Pre/post-write verification for AI agents (guard pre/post/snapshot/verify)', - add_args, - execute, -) diff --git a/scripts/commands/history.py b/scripts/commands/history.py index 4c1b9ea1..eb7d28ce 100644 --- a/scripts/commands/history.py +++ b/scripts/commands/history.py @@ -8,15 +8,95 @@ def add_args(parser): parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help="Issue #195: comma-separated sub-analyses. " + "Choices: history, ownership, git-status. Default: history.") parser.add_argument("--chart", action="store_true", - help="Generate HTML trend chart") + help="history: generate HTML trend chart") parser.add_argument("--list", action="store_true", - help="List all available snapshots") + help="history: list all available snapshots") parser.add_argument("--compare", nargs=2, metavar=("SNAPSHOT1", "SNAPSHOT2"), - help="Compare two snapshots by filename") + help="history: compare two snapshots by filename") + # ownership passthroughs + parser.add_argument("--file", default=None, + help="ownership: file path filter") + parser.add_argument("--function", dest="function_name", default=None, + help="ownership: function name filter") -def execute(args, workspace): +# Issue #195: sub-command dispatch table for the history umbrella. +_HISTORY_SUBCOMMANDS = { + "history": None, # handled inline + "ownership": "commands.ownership", + "git-status": "commands.git_status", +} + + +def _dispatch_subcommands(args, workspace, check_arg): + """Dispatch to one or more absorbed sub-commands per --check.""" + import importlib as _il + import sys as _sys + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _HISTORY_SUBCOMMANDS] + if invalid: + print( + f"[CodeLens] history: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(_HISTORY_SUBCOMMANDS.keys())}", + file=_sys.stderr, + ) + _sys.exit(1) + if not parts: + parts = ["history"] + + results = [] + checks_failed = 0 + for check_name in parts: + try: + if check_name == "history": + sub_result = _run_legacy_history(args, workspace) + else: + mod = _il.import_module(_HISTORY_SUBCOMMANDS[check_name]) + sub_args = _build_subnamespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + except Exception as exc: + checks_failed += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print(f"[CodeLens] history: --check {check_name} failed: {exc}", + file=_sys.stderr) + + return { + "s": "ok" if checks_failed == 0 else "partial", + "st": {"checks_requested": len(parts), "checks_failed": checks_failed}, + "r": results, + } + + +def _build_subnamespace(base_args, check_name): + """Build a synthetic namespace for the dispatched sub-command.""" + import argparse as _ap + ns = _ap.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "ownership": + ns.file = getattr(base_args, "file", None) + ns.function_name = getattr(base_args, "function_name", None) + return ns + + +def _run_legacy_history(args, workspace): + """Run the original history.execute logic (issue #195: absorbed).""" should_chart = getattr(args, 'chart', False) should_list = getattr(args, 'list', False) compare = getattr(args, 'compare', None) @@ -51,6 +131,14 @@ def execute(args, workspace): return get_trend_data(workspace) +def execute(args, workspace): + # Issue #195: dispatch to absorbed sub-commands when --check is set. + check_arg = getattr(args, "check", None) + if check_arg: + return _dispatch_subcommands(args, workspace, check_arg) + return _run_legacy_history(args, workspace) + + def _generate_trend_chart(workspace: str) -> dict: """Generate an HTML trend chart from historical data.""" from history_engine import get_trend_data diff --git a/scripts/commands/impact.py b/scripts/commands/impact.py index cba88ee7..55f9415c 100644 --- a/scripts/commands/impact.py +++ b/scripts/commands/impact.py @@ -1,40 +1,131 @@ -"""Impact command — Analyze change impact for a symbol.""" +"""impact command — change-impact & dataflow analysis (issue #195 consolidation). + +Umbrella command that absorbs: + - impact Change impact for a symbol (default) + - diff Compare registry snapshots (--git-aware for git-diff delta) + - dataflow Trace data flow source→sink with cross-file call graph + +Usage: + codelens impact --name handleAuth # impact (default) + codelens impact --check impact --name handleAuth + codelens impact --check diff --git-aware + codelens impact --check dataflow --source src/api.ts --sink db.query + +Output: ``{"s":"ok", "st":{...}, "r":[...]}``. +""" + +# @WHO: scripts/commands/impact.py +# @WHAT: Umbrella command for change-impact & dataflow analysis. +# @PART: commands +# @ENTRY: execute() + +import argparse +import importlib +import sys +from typing import Any, Dict, List -from impact_engine import analyze_impact from commands import register_command +_CHECKS = { + "impact": { + "module": None, # handled inline (legacy impact.execute logic below) + "help": "Analyze change impact for a symbol", + }, + "diff": { + "module": "commands.diff", + "help": "Compare registry snapshots (--git-aware for git-diff delta)", + }, + "dataflow": { + "module": "commands.dataflow", + "help": "Trace data flow source→sink with cross-file call graph", + }, +} + +ALL_CHECKS = list(_CHECKS.keys()) + + def add_args(parser): - parser.add_argument("name", help="Symbol name to analyze") + """Add impact-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Sub-analyses (issue #195):\n" + " impact Analyze change impact for a symbol (default)\n" + " diff Compare registry snapshots (--git-aware for git-diff delta)\n" + " dataflow Trace data flow source→sink with cross-file call graph\n" + "\n" + "Examples:\n" + " codelens impact . --name handleAuth # impact (default)\n" + " codelens impact . --check diff --git-aware\n" + " codelens impact . --check dataflow --source src/api.ts --sink db.query\n" + ) parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help=f"Comma-separated sub-analyses. " + f"Choices: {', '.join(ALL_CHECKS)}. Default: impact.") + parser.add_argument("--name", default=None, + help="impact: symbol name to analyze") parser.add_argument("--action", choices=["modify", "delete"], default="modify", - help="Planned action (modify or delete)") - parser.add_argument("--domain", choices=["frontend", "backend", "auto"], default="auto", - help="Domain to analyze") - parser.add_argument("--depth", type=int, default=5, help="Trace depth (default 5)") + help="impact: planned action (default: modify)") + parser.add_argument("--domain", default="auto", + help="impact: frontend|backend|auto (default: auto)") + parser.add_argument("--depth", type=int, default=None, + help="impact/dataflow: trace depth (default: impact=5, dataflow=15)") + # diff passthroughs + parser.add_argument("--snapshot1", default=None, help="diff: first snapshot path") + parser.add_argument("--snapshot2", default=None, help="diff: second snapshot path") + parser.add_argument("--list-snapshots", action="store_true", default=False, + help="diff: list available snapshots and exit") + parser.add_argument("--git-aware", action="store_true", default=False, + help="diff: use git-diff delta + impact") + # dataflow passthroughs + parser.add_argument("--source", default=None, help="dataflow: source function") + parser.add_argument("--sink", default=None, help="dataflow: sink function") + parser.add_argument("--max-files", type=int, default=None, + help="dataflow: file cap (default 3000)") + parser.add_argument("--timeout", type=int, default=None, + help="dataflow: timeout in seconds (default 120)") + parser.add_argument("--cross-file", action="store_true", default=False, + help="dataflow: force cross-file analysis") + parser.add_argument("--no-cross-file", action="store_true", default=False, + help="dataflow: disable cross-file analysis") + parser.add_argument("--language", default=None, + help="dataflow: python|javascript|typescript") + parser.add_argument("--call-graph-only", action="store_true", default=False, + help="dataflow: return only the call graph, no taint") -def execute(args, workspace): - result = analyze_impact( - args.name, workspace, - action=args.action, - domain=args.domain, - depth=args.depth - ) - # Add decision tree fields — derive risk_level from the engine's risk assessment - # to avoid contradictory "risk" vs "risk_level" values +def _parse_checks(check_arg: str) -> List[str]: + if not check_arg: + return ["impact"] + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _CHECKS] + if invalid: + print( + f"[CodeLens] impact: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(ALL_CHECKS)}", + file=sys.stderr, + ) + sys.exit(1) + return parts or ["impact"] + + +def _run_legacy_impact(args, workspace): + """Run the original impact.execute logic (issue #195: absorbed).""" + from impact_engine import analyze_impact + name = getattr(args, "name", None) or "" + action = getattr(args, "action", "modify") + domain = getattr(args, "domain", "auto") + depth = getattr(args, "depth", None) or 5 + result = analyze_impact(name, workspace, action=action, domain=domain, depth=depth) if result.get("status") == "ok": engine_risk = result.get("risk", "low") stats = result.get("stats", {}) direct_dependents = stats.get("direct_dependents", 0) indirect_dependents = stats.get("indirect_dependents", 0) affected_files = stats.get("affected_files", 0) - - # Use the engine's risk as the authoritative risk_level result["risk_level"] = engine_risk - - # Set recommended_action consistent with risk_level if engine_risk == "critical": result["recommended_action"] = "Critical risk. Consider refactoring to reduce dependencies first." elif engine_risk == "high": @@ -43,19 +134,93 @@ def execute(args, workspace): result["recommended_action"] = "Proceed with caution. Review affected code before changing." else: result["recommended_action"] = "Safe to proceed. No dependent code found." - - # Attach baseline confidence (medium = AST-based analysis per hybrid_engine.py - # docstring). HybridEngine.enhance_impact sets confidence=MEDIUM when LSP is - # not active (deep=False). When --deep is later applied in codelens.py - # post-processing, LSP verification may override this to HIGH or LOW. try: from hybrid_engine import create_hybrid_engine engine = create_hybrid_engine(workspace, deep=False) - engine.enhance_impact(result, args.name) + engine.enhance_impact(result, name) engine.cleanup() except Exception: result.setdefault("confidence", "medium") return result -register_command("impact", "Analyze change impact for a symbol", add_args, execute) +def _build_namespace(base_args, check_name: str) -> argparse.Namespace: + ns = argparse.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "diff": + ns.snapshot1 = getattr(base_args, "snapshot1", None) + ns.snapshot2 = getattr(base_args, "snapshot2", None) + ns.list_snapshots = getattr(base_args, "list_snapshots", False) + ns.git_aware = getattr(base_args, "git_aware", False) + elif check_name == "dataflow": + ns.source = getattr(base_args, "source", None) + ns.sink = getattr(base_args, "sink", None) + ns.depth = getattr(base_args, "depth", None) or 15 + ns.max_files = getattr(base_args, "max_files", None) or 3000 + ns.timeout = getattr(base_args, "timeout", None) or 120 + ns.cross_file = getattr(base_args, "cross_file", False) + ns.no_cross_file = getattr(base_args, "no_cross_file", False) + ns.language = getattr(base_args, "language", None) + ns.call_graph_only = getattr(base_args, "call_graph_only", False) + return ns + + +def execute(args, workspace): + """Run one or more impact/dataflow checks and merge results. + + @FLOW: IMPACT_DISPATCH + @CALLS: _parse_checks() -> List[str] + _run_legacy_impact() | commands..execute() -> dict per sub + @MUTATES: nothing (read-only) + """ + checks = _parse_checks(getattr(args, "check", None)) + results: List[Dict[str, Any]] = [] + stats: Dict[str, Any] = {"checks_run": 0, "checks_failed": 0} + + for check_name in checks: + try: + if check_name == "impact": + sub_result = _run_legacy_impact(args, workspace) + else: + mod = importlib.import_module(_CHECKS[check_name]["module"]) + sub_args = _build_namespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + stats["checks_run"] += 1 + except Exception as exc: + stats["checks_failed"] += 1 + stats["checks_run"] += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print( + f"[CodeLens] impact: --check {check_name} failed: {exc}", + file=sys.stderr, + ) + + return { + "s": "ok" if stats["checks_failed"] == 0 else "partial", + "st": { + "checks_requested": len(checks), + **stats, + }, + "r": results, + } + + +register_command( + "impact", + "Change-impact & dataflow: impact (default) / diff / dataflow (issue #195)", + add_args, + execute, +) diff --git a/scripts/commands/import_snapshot.py b/scripts/commands/import_snapshot.py index 974386c7..ddfae6e1 100644 --- a/scripts/commands/import_snapshot.py +++ b/scripts/commands/import_snapshot.py @@ -175,4 +175,6 @@ def cmd_import_snapshot( "use --merge to deduplicate with the existing graph (issue #12)", add_args, execute, +hidden=True, +deprecated_alias_for='deps', ) diff --git a/scripts/commands/init.py b/scripts/commands/init.py index 4687354a..6bd78f7b 100644 --- a/scripts/commands/init.py +++ b/scripts/commands/init.py @@ -71,4 +71,10 @@ def cmd_init(workspace: str) -> Dict[str, Any]: } -register_command("init", "Initialize .codelens with auto-detected config", add_args, execute) +register_command("init", "Initialize .codelens with auto-detected config", add_args, execute, + +hidden=True, + +deprecated_alias_for='scan', + +) diff --git a/scripts/commands/list.py b/scripts/commands/list.py index 6682dd97..aef14ae6 100644 --- a/scripts/commands/list.py +++ b/scripts/commands/list.py @@ -121,4 +121,8 @@ def cmd_list(workspace: str, domain: str, filter_type: str = "all", } -register_command("list", "List entries with filter", add_args, execute) +register_command("list", "List entries with filter", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/llm_framework.py b/scripts/commands/llm_framework.py deleted file mode 100644 index 679c6cec..00000000 --- a/scripts/commands/llm_framework.py +++ /dev/null @@ -1,262 +0,0 @@ -"""LLM command — inspect LLM framework config and test provider connectivity. - -Issue #63 Phase 1. Provides three subcommands:: - - codelens llm providers # list known providers + their env vars - codelens llm config # show the currently resolved config (no API key) - codelens llm ping # send a 1-token smoke prompt to verify the chain - -Why a command and not just tests? ---------------------------------- -The LLM framework is opt-in and env-var driven. Users need a way to -answer "did I configure this right?" without reading the source. The -``ping`` subcommand is the canonical end-to-end smoke test — it fails -fast on missing API keys, missing SDKs, and timeouts, with actionable -error messages. - -Phase 2 will add ``codelens llm-cache stats`` / ``clear`` here. -""" - -from __future__ import annotations - -import argparse -import os -import sys -from typing import Any, Dict, List, Optional - -from commands import register_command - - -def add_args(parser: argparse.ArgumentParser) -> None: - sub = parser.add_subparsers(dest="llm_subcommand") - - p_providers = sub.add_parser( - "providers", - help="List known LLM providers and the env vars each one reads.", - ) - p_providers.add_argument( - "--json", - action="store_true", - help="Emit JSON instead of a human-readable table.", - ) - - p_config = sub.add_parser( - "config", - help="Show the currently resolved LLM config (model, provider, key sources).", - ) - p_config.add_argument( - "--json", - action="store_true", - help="Emit JSON instead of a human-readable block.", - ) - - p_ping = sub.add_parser( - "ping", - help="Send a 1-token smoke prompt to verify provider + API key + SDK.", - ) - p_ping.add_argument( - "--model", - default=None, - help="Model name (defaults to CODELENS_LLM_MODEL).", - ) - p_ping.add_argument( - "--provider", - default=None, - help="Force a provider (bypass prefix dispatch).", - ) - p_ping.add_argument( - "--timeout", - type=float, - default=15.0, - help="Per-call timeout in seconds (default: 15).", - ) - p_ping.add_argument( - "--json", - action="store_true", - help="Emit JSON instead of a human-readable line.", - ) - - -def execute(args: argparse.Namespace, workspace: str) -> Dict[str, Any]: - sub = getattr(args, "llm_subcommand", None) - if sub is None: - # No subcommand → behave like `codelens llm config` for discoverability. - return _cmd_config(json_output=False) - if sub == "providers": - return _cmd_providers(json_output=getattr(args, "json", False)) - if sub == "config": - return _cmd_config(json_output=getattr(args, "json", False)) - if sub == "ping": - return _cmd_ping( - model=args.model, - provider=args.provider, - timeout=args.timeout, - json_output=args.json, - ) - return { - "status": "error", - "error": f"unknown llm subcommand: {sub!r}", - "available": ["providers", "config", "ping"], - } - - -# ─── Subcommands ─────────────────────────────────────────────────────────── - - -def _cmd_providers(*, json_output: bool) -> Dict[str, Any]: - """List the 6 providers + their env var hints + SDK pip names.""" - # Lazy import so the command module is importable even if the llm - # package itself failed to load (defensive — shouldn't happen). - try: - from llm.provider import PROVIDER_PREFIX_MAP, _PROVIDER_API_KEY_ENV, _PROVIDER_PIP_NAME - except ImportError as e: - return { - "status": "error", - "error": f"llm framework not importable: {e}", - } - - rows: List[Dict[str, Any]] = [] - for provider in sorted(PROVIDER_PREFIX_MAP): - prefixes = PROVIDER_PREFIX_MAP[provider] - env_vars = list(_PROVIDER_API_KEY_ENV.get(provider, ("CODELENS_LLM_API_KEY",))) - pip_name = _PROVIDER_PIP_NAME.get(provider, "?") - rows.append({ - "provider": provider, - "model_prefixes": list(prefixes), - "api_key_env_vars": env_vars, - "sdk_pip_name": pip_name, - }) - - if not json_output: - print("CodeLens LLM providers (issue #63 Phase 1):") - print() - for r in rows: - print(f" {r['provider']}") - print(f" prefixes: {', '.join(r['model_prefixes'])}") - print(f" api_key_env: {' or '.join(r['api_key_env_vars'])}") - print(f" sdk_install: pip install {r['sdk_pip_name']}") - print() - - return { - "status": "ok", - "providers": rows, - "count": len(rows), - } - - -def _cmd_config(*, json_output: bool) -> Dict[str, Any]: - """Show the currently resolved config (model, provider, key source).""" - try: - from llm.provider import ( - DEFAULT_MAX_RETRIES, - DEFAULT_TIMEOUT_SECONDS, - PROVIDER_PREFIX_MAP, - _PROVIDER_API_KEY_ENV, - resolve_provider, - ) - except ImportError as e: - return {"status": "error", "error": f"llm framework not importable: {e}"} - - model = os.environ.get("CODELENS_LLM_MODEL", "").strip() or None - forced_provider = os.environ.get("CODELENS_LLM_PROVIDER", "").strip().lower() or None - resolved_provider: Optional[str] = None - resolve_error: Optional[str] = None - if model: - try: - resolved_provider = forced_provider or resolve_provider(model) - except ValueError as e: - resolve_error = str(e) - elif forced_provider: - resolved_provider = forced_provider - - # Which env vars have values (do NOT print the values themselves). - key_sources: Dict[str, bool] = {} - if resolved_provider: - for var in _PROVIDER_API_KEY_ENV.get(resolved_provider, ("CODELENS_LLM_API_KEY",)): - key_sources[var] = bool(os.environ.get(var, "").strip()) - - config_block = { - "model": model or "(not set — set CODELENS_LLM_MODEL)", - "model_env_var": "CODELENS_LLM_MODEL", - "provider_forced": forced_provider or "(not set)", - "provider_forced_env_var": "CODELENS_LLM_PROVIDER", - "provider_resolved": resolved_provider or "(could not resolve)", - "provider_resolve_error": resolve_error, - "api_key_sources": key_sources, - "timeout_seconds_default": DEFAULT_TIMEOUT_SECONDS, - "max_retries_default": DEFAULT_MAX_RETRIES, - "known_providers": sorted(PROVIDER_PREFIX_MAP), - } - - if not json_output: - print("CodeLens LLM config (issue #63 Phase 1):") - print() - for k, v in config_block.items(): - print(f" {k}: {v}") - - return {"status": "ok", "config": config_block} - - -def _cmd_ping( - *, - model: Optional[str], - provider: Optional[str], - timeout: float, - json_output: bool, -) -> Dict[str, Any]: - """Send a 1-token smoke prompt to verify the chain end-to-end.""" - try: - from llm.provider import invoke_llm - from llm.base_tool import LLMError, ProviderNotConfiguredError, ProviderNotInstalledError - except ImportError as e: - return {"status": "error", "error": f"llm framework not importable: {e}"} - - smoke_prompt = "Reply with exactly: OK" - try: - raw, stats = invoke_llm( - prompt=smoke_prompt, - model=model, - provider=provider, - timeout_seconds=timeout, - max_retries=1, # ping should fail fast - ) - except ProviderNotConfiguredError as e: - msg = f"NOT CONFIGURED: {e} (provider={e.provider}, env_var={e.env_var})" - if not json_output: - print(msg, file=sys.stderr) - return {"status": "not_configured", "error": str(e), "provider": e.provider, "env_var": e.env_var} - except ProviderNotInstalledError as e: - msg = f"SDK MISSING: {e} (provider={e.provider}, install with: pip install {e.pip_name})" - if not json_output: - print(msg, file=sys.stderr) - return {"status": "sdk_missing", "error": str(e), "provider": e.provider, "pip_name": e.pip_name} - except LLMError as e: - if not json_output: - print(f"LLM ERROR: {e}", file=sys.stderr) - return {"status": "error", "error": str(e)} - except Exception as e: - if not json_output: - print(f"UNEXPECTED ERROR: {e}", file=sys.stderr) - return {"status": "error", "error": f"unexpected: {e}"} - - if not json_output: - print( - f"OK — {stats.provider}/{stats.model} " - f"({stats.attempts} attempt, {stats.elapsed_seconds:.2f}s)" - ) - return { - "status": "ok", - "provider": stats.provider, - "model": stats.model, - "attempts": stats.attempts, - "elapsed_seconds": round(stats.elapsed_seconds, 3), - "raw_preview": stats.raw_response_preview, - } - - -register_command( - "llm", - "LLM framework: list providers, show config, ping model (issue #63)", - add_args, - execute, -) diff --git a/scripts/commands/lsp.py b/scripts/commands/lsp.py index 471cd3b5..f26acf49 100644 --- a/scripts/commands/lsp.py +++ b/scripts/commands/lsp.py @@ -131,4 +131,5 @@ def execute(args, workspace): "Run CodeLens as a native LSP 3.17 server (stdio by default; --tcp for debug)", add_args, execute, +hidden=True, ) diff --git a/scripts/commands/lsp_status.py b/scripts/commands/lsp_status.py index f8449b4a..e2a0bed2 100644 --- a/scripts/commands/lsp_status.py +++ b/scripts/commands/lsp_status.py @@ -45,4 +45,6 @@ def execute(args, workspace): "Check which LSP servers are available for deep analysis", add_args, execute, +hidden=True, +deprecated_alias_for='doctor', ) diff --git a/scripts/commands/memory.py b/scripts/commands/memory.py deleted file mode 100644 index 4278ff16..00000000 --- a/scripts/commands/memory.py +++ /dev/null @@ -1,128 +0,0 @@ -"""Memory command — Serena-style markdown memory system (issue #60). - -Provides cross-session memory for AI agents using CodeLens. Memory files are -plain markdown stored under ``.codelens/memories/`` in the workspace (project -scope) or ``~/.codelens/memories/global/`` (global scope, read-only via CLI). - -Usage:: - - codelens memory write # create/update project memory - codelens memory read # read memory (project or global) - codelens memory list # list all memories - codelens memory delete # delete (project memory only) - -The storage layout, file header rules, and ``mem:NAME`` reference validation -live in :mod:`memories.memory_manager` — this module is a thin CLI wrapper. -""" - -from __future__ import annotations - -from typing import Any, Dict - -from commands import register_command - - -def add_args(parser): - """Add memory subcommand arguments to the parser.""" - sub = parser.add_subparsers(dest="memory_action", help="Memory action") - - # memory write - write = sub.add_parser( - "write", - help="Create or update a project memory file", - description=( - "Create or update a project memory file at " - ".codelens/memories/.md. The file is given a canonical " - "'# Memory: ' header. mem:NAME references in are " - "validated and warnings are emitted for any references that do " - "not exist in project or global scope — the write itself always " - "succeeds (issue #60: warn, don't block)." - ), - ) - write.add_argument("name", help="Memory topic name (e.g. 'auth-flow')") - write.add_argument( - "content", - help="Memory content (markdown). May include 'mem:NAME' references.", - ) - - # memory read - read = sub.add_parser( - "read", - help="Read a memory file (project or global)", - description=( - "Read a memory file. Looks in the project scope first, then " - "falls back to the global scope. Returns 'not_found' if the " - "memory doesn't exist in either scope." - ), - ) - read.add_argument("name", help="Memory topic name") - - # memory list - sub.add_parser( - "list", - help="List all memory files (project + global)", - description=( - "List all memory files in project and global scope. Project " - "memories shadow global memories of the same name." - ), - ) - - # memory delete - delete = sub.add_parser( - "delete", - help="Delete a project memory file", - description=( - "Delete a project memory file. Global memories are read-only " - "via CLI and cannot be deleted here — remove them manually from " - "~/.codelens/memories/global/ if needed." - ), - ) - delete.add_argument("name", help="Memory topic name") - - -def execute(args, workspace): - """Dispatch the memory subcommand.""" - action = getattr(args, "memory_action", None) - if not action: - return { - "status": "error", - "error": "No memory action specified. Use: write, read, list, delete", - "usage": "codelens memory [args]", - "examples": [ - "codelens memory write auth-flow 'Uses JWT, see mem:tokens'", - "codelens memory read auth-flow", - "codelens memory list", - "codelens memory delete auth-flow", - ], - } - - # Lazy import so a broken memory_manager never breaks command discovery. - from memories.memory_manager import ( - write_memory, - read_memory, - list_memories, - delete_memory, - ) - - if action == "write": - return write_memory(workspace, args.name, args.content) - if action == "read": - return read_memory(workspace, args.name) - if action == "list": - return list_memories(workspace) - if action == "delete": - return delete_memory(workspace, args.name) - - return { - "status": "error", - "error": f"Unknown memory action: {action}", - "available_actions": ["write", "read", "list", "delete"], - } - - -register_command( - "memory", - "Serena-style markdown memory system (write/read/list/delete)", - add_args, - execute, -) diff --git a/scripts/commands/migrate.py b/scripts/commands/migrate.py index d24b3fb9..96ce28a5 100644 --- a/scripts/commands/migrate.py +++ b/scripts/commands/migrate.py @@ -1,193 +1,69 @@ -"""Migrate command — Convert JSON registry to SQLite persistent registry.""" +"""migrate utility — JSON registry to SQLite (issue #195). -import os -from typing import Dict, Any +This module is NOT a registered command (``migrate`` was dropped per issue +#195). It is kept as a utility so that existing tests and scripts that +import ``cmd_migrate`` continue to work — the underlying migration logic +lives in ``PersistentRegistry.migrate_from_json``. +""" -from utils import logger -from persistent_registry import PersistentRegistry, db_exists, db_is_populated, is_sqlite_available -from commands import register_command +from typing import Any, Dict +from persistent_registry import PersistentRegistry -def add_args(parser): - """Add migrate-specific arguments to the parser.""" - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--db-path", default=None, - help="Custom path for the SQLite database file") - parser.add_argument("--verify", action="store_true", - help="Verify data integrity after migration") +def cmd_migrate(workspace: str, verify: bool = False) -> Dict[str, Any]: + """Migrate .codelens/ JSON files to SQLite. -def execute(args, workspace): - """Execute the migrate command.""" - db_path = getattr(args, 'db_path', None) - verify = getattr(args, 'verify', False) - return cmd_migrate(workspace, db_path=db_path, verify=verify) + Thin wrapper around ``PersistentRegistry.migrate_from_json`` that + preserves the legacy result shape (``{"status": ..., "migration": {...}}``) + expected by tests and older scripts. - -def cmd_migrate(workspace: str, db_path: str = None, verify: bool = False) -> Dict[str, Any]: - """Migrate from JSON-based registry to SQLite persistent registry. - - Steps: - 1. Check if SQLite is available - 2. Check if migration is needed (DB doesn't exist yet) - 3. Load data from JSON files - 4. Write data to SQLite - 5. Optionally verify data integrity - 6. Report results - - The migration is additive — it does NOT delete the JSON files, - ensuring backward compatibility and rollback capability. + Returns ``{"status": "error", ...}`` if no JSON registry exists or + if the SQLite registry is already populated. """ - workspace = os.path.abspath(workspace) - - # Step 1: Check SQLite availability - if not is_sqlite_available(): - return { - "status": "error", - "error": "SQLite is not available in this Python installation", - "suggestion": "Install Python 3 with sqlite3 module support", - } - - # Step 2: Check if migration is needed - effective_db_path = db_path or os.path.join(workspace, ".codelens", "codelens.db") - # Only skip if the db is actually populated. ``scan`` creates an empty - # db shell via ``store_scan_result`` (which writes to ``analysis_cache`` - # but not ``symbols``), so a bare ``os.path.exists`` check here would - # skip the real JSON→SQLite migration and leave ``symbols`` empty — - # which in turn makes ``_registry_exists`` return False after the JSON - # files are deleted (issue #35). - if db_is_populated(effective_db_path): - return { - "status": "ok", - "message": "SQLite database already exists and is populated. Use --verify to check integrity.", - "db_path": effective_db_path, - "hint": "To re-migrate, delete the .codelens/codelens.db file first.", - } - - # Step 3: Check if JSON files exist - from registry import get_codelens_dir - codelens_dir = get_codelens_dir(workspace) - frontend_path = os.path.join(codelens_dir, "frontend.json") - backend_path = os.path.join(codelens_dir, "backend.json") - - if not os.path.exists(frontend_path) and not os.path.exists(backend_path): - return { - "status": "error", - "error": "No JSON registry files found. Run 'scan' first to create a registry.", - "workspace": workspace, - } - - # Step 4: Perform migration try: - reg = PersistentRegistry(workspace, db_path=db_path) - reg._connect() - migration_result = reg.migrate_from_json() - stats = reg.get_stats() - reg.close() - except Exception as e: - logger.error(f"Migration failed: {e}", exc_info=True) - return { - "status": "error", - "error": f"Migration failed: {e}", - "workspace": workspace, - } - - # Step 5: Optional verification - verification = None - if verify: - verification = _verify_migration(workspace, effective_db_path) - - result = { - "status": "ok", - "message": "Migration from JSON to SQLite completed successfully", - "workspace": workspace, - "db_path": effective_db_path, - "migration": migration_result, - "stats": stats, - } - - if verification: - result["verification"] = verification - - return result - - -def _verify_migration(workspace: str, db_path: str) -> Dict[str, Any]: - """Verify that migrated data matches the JSON source.""" - from registry import load_frontend_registry, load_backend_registry - - reg = PersistentRegistry(workspace, db_path=db_path) - reg._connect() - - verification = {"checks": [], "all_passed": True} - - # Check frontend classes count - frontend = load_frontend_registry(workspace) - json_class_count = len(frontend.get("classes", [])) - db_class_count = len(reg.get_all_symbols(kind="class")) - check_passed = json_class_count == db_class_count - verification["checks"].append({ - "check": "frontend_classes_count", - "json_count": json_class_count, - "db_count": db_class_count, - "passed": check_passed, - }) - if not check_passed: - verification["all_passed"] = False - - # Check frontend IDs count - json_id_count = len(frontend.get("ids", [])) - db_id_count = len(reg.get_all_symbols(kind="id")) - check_passed = json_id_count == db_id_count - verification["checks"].append({ - "check": "frontend_ids_count", - "json_count": json_id_count, - "db_count": db_id_count, - "passed": check_passed, - }) - if not check_passed: - verification["all_passed"] = False - - # Check backend nodes count - backend = load_backend_registry(workspace) - json_node_count = len(backend.get("nodes", [])) - db_node_count = len(reg.get_all_symbols(kind="function")) - # Note: db_node_count may be larger due to frontend functions being - # stored as symbols too, so we check >= - check_passed = db_node_count >= json_node_count - verification["checks"].append({ - "check": "backend_nodes_count", - "json_count": json_node_count, - "db_count": db_node_count, - "passed": check_passed, - "note": "DB count may be higher due to other function symbols", - }) - if not check_passed: - verification["all_passed"] = False - - # Spot-check: lookup a specific symbol - if frontend.get("classes"): - sample_name = frontend["classes"][0].get("name", "") - if sample_name: - found = reg.lookup_symbol(sample_name, "class") - check_passed = len(found) > 0 - verification["checks"].append({ - "check": "symbol_lookup_spot_check", - "name": sample_name, - "found": check_passed, - "passed": check_passed, - }) - if not check_passed: - verification["all_passed"] = False - - reg.close() - return verification - - -register_command( - "migrate", - "Migrate JSON registry to SQLite persistent database", - add_args, - execute, -) + import os + from registry import get_codelens_dir + + pr = PersistentRegistry(workspace) + # Check for the "already populated" early-return condition FIRST. + # Only treat a populated db (existing symbol rows) as "already + # migrated" — an empty db shell created by scan() must NOT skip + # the real JSON→SQLite migration (issue #35 regression guard). + conn = pr._connect() + cursor = conn.execute("SELECT COUNT(*) FROM symbols") + existing = cursor.fetchone()[0] + # Do NOT close conn here — _connect() returns the persistent + # self._conn, and closing it would break subsequent migrate_from_json() + # calls. The connection is closed via pr.close() at the end. + if existing > 0: + pr.close() + return { + "status": "ok", + "message": "SQLite registry already exists and is populated; skipping migration.", + "migration": {"already_exists": True, "existing_rows": existing}, + } + + # Pre-check: error if no JSON registry files exist (legacy behavior). + # load_*_registry returns an empty dict even when the file is missing, + # so we check file existence directly. + codelens_dir = get_codelens_dir(workspace) + frontend_path = os.path.join(codelens_dir, "frontend.json") + backend_path = os.path.join(codelens_dir, "backend.json") + if not (os.path.isfile(frontend_path) or os.path.isfile(backend_path)): + pr.close() + return { + "status": "error", + "error": "No JSON registry found at .codelens/frontend.json or .codelens/backend.json. " + "Run 'codelens scan' first.", + } + + result = pr.migrate_from_json() + pr.close() + if result.get("status") != "ok": + return result + # Reshape to legacy {status, migration: {...}} form. + migration = {k: v for k, v in result.items() if k != "status"} + return {"status": "ok", "migration": migration} + except Exception as exc: + return {"status": "error", "error": str(exc)} diff --git a/scripts/commands/missingrefs.py b/scripts/commands/missingrefs.py index 67fcb4bb..b313128c 100644 --- a/scripts/commands/missingrefs.py +++ b/scripts/commands/missingrefs.py @@ -23,4 +23,4 @@ def execute(args, workspace): return detect_missing_refs(workspace) -register_command("missing-refs", "Detect CSS/HTML mismatch bugs", add_args, execute) +register_command("missing-refs", "Detect CSS/HTML mismatch bugs", add_args, execute, hidden=True) diff --git a/scripts/commands/orient.py b/scripts/commands/orient.py index 6a46f6d8..d357a7a7 100644 --- a/scripts/commands/orient.py +++ b/scripts/commands/orient.py @@ -397,4 +397,6 @@ def _render_text(brief: Dict[str, Any]) -> None: "start-here files, CI/Docker)", add_args, execute, +hidden=True, +deprecated_alias_for='context', ) diff --git a/scripts/commands/outline.py b/scripts/commands/outline.py index be99f43f..c1fcd9a4 100644 --- a/scripts/commands/outline.py +++ b/scripts/commands/outline.py @@ -42,4 +42,10 @@ def execute(args, workspace): return result -register_command("outline", "Get file structure outline", add_args, execute) +register_command("outline", "Get file structure outline", add_args, execute, + +hidden=True, + +deprecated_alias_for='context', + +) diff --git a/scripts/commands/ownership.py b/scripts/commands/ownership.py index 4cfd8a73..cef3ae6a 100644 --- a/scripts/commands/ownership.py +++ b/scripts/commands/ownership.py @@ -20,4 +20,10 @@ def execute(args, workspace): ) -register_command("ownership", "Git blame-based code ownership", add_args, execute) +register_command("ownership", "Git blame-based code ownership", add_args, execute, + +hidden=True, + +deprecated_alias_for='history', + +) diff --git a/scripts/commands/perf_hint.py b/scripts/commands/perf_hint.py index f94c3487..08f475d9 100644 --- a/scripts/commands/perf_hint.py +++ b/scripts/commands/perf_hint.py @@ -20,4 +20,10 @@ def execute(args, workspace): max_files=args.max_files) -register_command("perf-hint", "Detect performance anti-patterns", add_args, execute) +register_command("perf-hint", "Detect performance anti-patterns", add_args, execute, + +hidden=True, + +deprecated_alias_for='audit', + +) diff --git a/scripts/commands/plugin.py b/scripts/commands/plugin.py index 0370c397..9a3d1e17 100644 --- a/scripts/commands/plugin.py +++ b/scripts/commands/plugin.py @@ -338,4 +338,5 @@ def _extract_plugin_name_from_source(source: str) -> Optional[str]: "Manage CodeLens plugins (install, list, search, update, info, validate)", add_args, execute, +hidden=True, ) diff --git a/scripts/commands/query.py b/scripts/commands/query.py index 16cd2e13..c8367c17 100644 --- a/scripts/commands/query.py +++ b/scripts/commands/query.py @@ -451,4 +451,8 @@ def fuzzy_sort_key(match): } -register_command("query", "Query a specific class/id/function", add_args, execute) +register_command("query", "Query a specific class/id/function", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/query_graph.py b/scripts/commands/query_graph.py index e749155f..09f76c2d 100644 --- a/scripts/commands/query_graph.py +++ b/scripts/commands/query_graph.py @@ -82,4 +82,6 @@ def execute(args, workspace): "Query the code graph with a Cypher-subset query (MATCH/WHERE/RETURN/LIMIT)", add_args, execute, +hidden=True, +deprecated_alias_for='graph', ) diff --git a/scripts/commands/refactor_safe.py b/scripts/commands/refactor_safe.py deleted file mode 100644 index 383229f4..00000000 --- a/scripts/commands/refactor_safe.py +++ /dev/null @@ -1,24 +0,0 @@ -"""Refactor-safe command — Pre-flight rename/move safety check.""" - -from refactor_safe_engine import check_refactor_safety -from commands import register_command - - -def add_args(parser): - parser.add_argument("name", help="Symbol name to rename/move") - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--action", choices=["rename", "move"], default="rename", - help="Action type (rename or move)") - parser.add_argument("--new-name", default=None, help="New name (for rename) or new path (for move)") - - -def execute(args, workspace): - return check_refactor_safety( - args.name, workspace, - action=args.action, - new_name=args.new_name - ) - - -register_command("refactor-safe", "Pre-flight rename/move safety check", add_args, execute) diff --git a/scripts/commands/regex_audit.py b/scripts/commands/regex_audit.py index 59030b6a..c8478472 100644 --- a/scripts/commands/regex_audit.py +++ b/scripts/commands/regex_audit.py @@ -17,4 +17,10 @@ def execute(args, workspace): return audit_regex_patterns(workspace, severity=args.severity, max_files=args.max_files) -register_command("regex-audit", "Audit regex for ReDoS and issues", add_args, execute) +register_command("regex-audit", "Audit regex for ReDoS and issues", add_args, execute, + +hidden=True, + +deprecated_alias_for='security', + +) diff --git a/scripts/commands/registry_validate.py b/scripts/commands/registry_validate.py deleted file mode 100644 index b3c9ac93..00000000 --- a/scripts/commands/registry_validate.py +++ /dev/null @@ -1,44 +0,0 @@ -"""registry-validate command — Validate registry against file system. - -Renamed from `validate` in v8.x to make room for `rule-validate` (rule YAML -validation). The deprecated `validate` alias was removed in issue #100 — -use `registry-validate` for registry checks, or `rule-validate` for rule -YAML validation. -""" - -import sys - -from validate_engine import validate_registry -from commands import register_command - - -def add_args(parser): - """Register registry-validate arguments.""" - parser.add_argument( - "workspace", - nargs="?", - default=None, - help="Path to workspace root (auto-detected if omitted)", - ) - - -def execute(args, workspace): - """Execute the registry-validate command. - - Args: - args: Parsed argparse namespace with ``workspace``. - workspace: Resolved workspace root path. - - Returns: - Dict with the registry validation result (``validate_registry`` - return shape). - """ - return validate_registry(workspace) - - -register_command( - "registry-validate", - "Validate registry against file system", - add_args, - execute, -) diff --git a/scripts/commands/resolve_types.py b/scripts/commands/resolve_types.py deleted file mode 100644 index 3e78ca1f..00000000 --- a/scripts/commands/resolve_types.py +++ /dev/null @@ -1,94 +0,0 @@ -"""resolve-types command — manually trigger hybrid type resolution (issue #13). - -Runs the hybrid type resolution pass without a full re-scan. Useful for AI -agents who want to refresh type resolution after adding new imports or -modifying class definitions, without paying the cost of a full scan. - -Example output:: - - { - "status": "ok", - "workspace": "/path/to/proj", - "edges_total": 97, - "edges_refined": 11, - "edges_unresolved": 55, - "import_registry_size": 37 - } -""" - -import os -from typing import Any, Dict, Optional - -from commands import register_command - - -def add_args(parser): - """Add resolve-types arguments to the parser.""" - parser.add_argument( - "workspace", - nargs="?", - default=None, - help="Path to workspace root (auto-detected if omitted)", - ) - parser.add_argument( - "--db-path", - default=None, - help="Custom path for SQLite database file", - ) - - -def _default_db_path(workspace: str) -> str: - """Return the default SQLite db path for a workspace.""" - return os.path.join(workspace, ".codelens", "codelens.db") - - -def execute(args, workspace): - """Execute the resolve-types command. - - Runs ``hybrid_type_resolver.refine_call_edges`` on the workspace and - returns a stats dict. If the database doesn't exist (pre-scan), the - command auto-runs a full scan first so the type resolver has graph - tables to work with. - """ - db_path = getattr(args, "db_path", None) or _default_db_path(workspace) - - if not os.path.exists(db_path): - # Auto-scan so the type resolver has graph tables to read. - try: - from commands.scan import cmd_scan - cmd_scan(workspace, incremental=False) - except Exception: # noqa: BLE001 — fail-soft - return { - "status": "error", - "error": "auto-scan failed; run 'scan' manually", - "workspace": workspace, - } - - from hybrid_type_resolver import refine_call_edges, import_registry_size - - try: - stats = refine_call_edges(workspace, db_path) - except Exception as exc: # noqa: BLE001 — best-effort, never crash - return { - "status": "error", - "error": str(exc), - "workspace": workspace, - } - - return { - "status": "ok", - "workspace": workspace, - "edges_total": stats["edges_total"], - "edges_refined": stats["edges_refined"], - "edges_unresolved": stats["edges_unresolved"], - "import_registry_size": import_registry_size(db_path), - } - - -register_command( - "resolve-types", - "Run hybrid type resolution: refine CALLS edges with import-aware " - "receiver types. Auto-scans if graph tables are empty.", - add_args, - execute, -) diff --git a/scripts/commands/rule_test.py b/scripts/commands/rule_test.py deleted file mode 100644 index cb3bf91f..00000000 --- a/scripts/commands/rule_test.py +++ /dev/null @@ -1,183 +0,0 @@ -"""rule-test command — snapshot testing for rule YAML files. - -Runs a rule against positive/negative code samples (``.test.yaml``) and -verifies the rule fires (or doesn't fire) where expected via inline -``# ruleid: `` / ``# ok`` markers. All logic lives in -``scripts/rule_test_runner.py``; this file is the thin CLI wrapper. - -Usage:: - - codelens rule-test tests/rule_fixtures/py_sql_injection.yaml - codelens rule-test tests/rule_fixtures/ # run all rules in a dir - codelens rule-test --json tests/rule_fixtures/ - codelens rule-test --test-ignore-todo tests/rule_fixtures/ - -Exit codes: - 0 — all tests pass (or no tests ran) - 1 — at least one test failed or errored -""" - -from __future__ import annotations - -import json -import os -import sys -from pathlib import Path -from typing import Any, Dict, List - -from commands import register_command -from rule_test_runner import ( - TestResult, - determine_exit_code, - run_tests, - run_tests_recursive, -) - - -def add_args(parser): - """Register rule-test CLI arguments.""" - parser.add_argument( - "rule_path", - help="Path to a rule YAML file or a directory of rule files", - ) - parser.add_argument( - "--test-ignore-todo", - action="store_true", - default=False, - help="Skip '# todoruleid:' markers (staged rules not yet enforced)", - ) - parser.add_argument( - "--json", - dest="json_output", - action="store_true", - default=False, - help="Output machine-readable JSON instead of human-readable text", - ) - - -def _format_human(results: List[TestResult]) -> str: - """Render test results as human-readable text. - - One block per rule: ``: PASS (3/3 samples)`` or fail with - a per-failure diff. Ends with a summary line. - """ - lines: List[str] = [] - total_pass = sum(1 for r in results if r.is_pass) - total_fail = sum(1 for r in results if not r.is_pass) - total_samples = sum(r.total for r in results) - total_passed_samples = sum(r.passed for r in results) - total_skipped = sum(r.skipped for r in results) - - for result in results: - rule_id = result.rule_id or Path(result.rule_path).stem - if result.error: - lines.append(f"\n{rule_id}: ERROR — {result.error}") - continue - - if result.total == 0: - lines.append(f"\n{rule_id}: SKIP (no samples)") - continue - - # Per-rule verdict line — the most important line for CI parsers. - verdict = "PASS" if result.is_pass else "FAIL" - sample_summary = f"{result.passed}/{result.total} samples" - if result.skipped: - sample_summary += f" ({result.skipped} skipped)" - lines.append(f"\n{rule_id}: {verdict} ({sample_summary})") - - # Per-failure detail so authors can fix the rule. - for failure in result.failures: - lines.append(f" ✗ {failure.sample_name} line {failure.line}: {failure.message}") - - # Summary line. - lines.append("\n" + "=" * 60) - if total_fail > 0: - lines.append( - f"FAIL: {total_fail}/{len(results)} rule(s) failed, " - f"{total_passed_samples}/{total_samples} samples passed " - f"({total_skipped} skipped)" - ) - else: - lines.append( - f"PASS: {total_pass}/{len(results)} rule(s), " - f"{total_passed_samples}/{total_samples} samples passed " - f"({total_skipped} skipped)" - ) - - return "\n".join(lines) - - -def _format_json(results: List[TestResult]) -> str: - """Render test results as JSON for CI / programmatic consumers.""" - payload: Dict[str, Any] = { - "status": "ok" if all(r.is_pass for r in results) else "fail", - "exit_code": determine_exit_code(results), - "total_rules": len(results), - "total_pass": sum(1 for r in results if r.is_pass), - "total_fail": sum(1 for r in results if not r.is_pass), - "total_samples": sum(r.total for r in results), - "total_passed_samples": sum(r.passed for r in results), - "total_skipped": sum(r.skipped for r in results), - "results": [r.to_dict() for r in results], - } - return json.dumps(payload, indent=2) - - -def execute(args, workspace): - """Execute the rule-test command. - - Returns a dict (so the result flows through the standard CodeLens - output formatter) AND sets the process exit code via ``sys.exit`` so - CI pipelines get the correct 0/1 signal. - - Args: - args: Parsed argparse namespace with ``rule_path``, ``test_ignore_todo``, - and ``json_output``. - workspace: Workspace root (unused — rule-test is path-based). - - Returns: - Dict with ``status``, ``exit_code``, ``results``, and the rendered - ``output`` string (human or JSON). - """ - raw_path = os.path.expanduser(args.rule_path) - path = Path(raw_path).resolve() - - if not path.exists(): - # Surface a clear error rather than crashing — the path may be a - # typo, and the user benefits from an actionable message. - print(f"Error: path does not exist: {path}", file=sys.stderr) - sys.exit(1) - - # A single file → run tests for that one rule. A directory → walk and - # run tests for every rule with a ``.test.yaml`` companion. - if path.is_file(): - results = [run_tests(path, ignore_todo=args.test_ignore_todo)] - else: - results = run_tests_recursive(path, ignore_todo=args.test_ignore_todo) - - exit_code = determine_exit_code(results) - - if args.json_output: - output = _format_json(results) - else: - output = _format_human(results) - - print(output) - sys.exit(exit_code) - - # Unreachable, but keeps the return-type contract honest for callers - # that import ``execute`` directly (e.g., tests). - return { - "status": "ok" if exit_code == 0 else "fail", - "exit_code": exit_code, - "results": [r.to_dict() for r in results], - "output": output, - } - - -register_command( - "rule-test", - "Run snapshot tests for rule YAML files (inline # ruleid: / # ok markers)", - add_args, - execute, -) diff --git a/scripts/commands/rule_validate.py b/scripts/commands/rule_validate.py deleted file mode 100644 index 18f7e818..00000000 --- a/scripts/commands/rule_validate.py +++ /dev/null @@ -1,221 +0,0 @@ -"""rule-validate command — validate rule YAML files for typos and schema errors. - -Catches the silent-skip class of bugs: typos (``pattern-eiter`` vs -``pattern-either``), unknown keys, missing required fields, invalid -``severity`` enum, unparseable ``pattern`` strings, and cross-field -violations (``pattern`` + ``patterns`` mutually exclusive, ``fix`` requires -``pattern``). All logic lives in ``scripts/rule_validator.py``; this file -is the thin CLI wrapper. - -Exit codes: - 0 — all rules valid (no errors, no warnings without ``--strict``) - 1 — at least one rule has an error - 2 — at least one rule has a warning AND ``--strict`` is set - -Usage:: - - codelens rule-validate scripts/rules/python_security.yaml - codelens rule-validate --strict scripts/rules/*.yaml - codelens rule-validate --json scripts/rules/python_security.yaml - codelens rule-validate scripts/rules/ # validate every rule file in a directory -""" - -from __future__ import annotations - -import json -import os -import sys -from pathlib import Path -from typing import Any, Dict, List - -from commands import register_command -from rule_validator import ( - ValidationResult, - determine_exit_code, - validate_rule, - validate_rule_files, -) - -# Exit codes — kept as named constants so the command and its tests agree. -EXIT_OK = 0 -EXIT_ERROR = 1 -EXIT_WARNING_STRICT = 2 - - -def add_args(parser): - """Register rule-validate CLI arguments.""" - parser.add_argument( - "rule_path", - nargs="+", - help="Path(s) to rule YAML file(s) to validate, or directory(ies) of rule files", - ) - parser.add_argument( - "--strict", - action="store_true", - default=False, - help="Treat warnings as errors for exit-code purposes (exit 2 instead of 0)", - ) - parser.add_argument( - "--json", - dest="json_output", - action="store_true", - default=False, - help="Output machine-readable JSON instead of human-readable text", - ) - - -def _format_human(results: List[ValidationResult], strict: bool) -> str: - """Render validation results as human-readable text. - - One block per rule file: header line, then errors (✗) and warnings (⚠), - each with file:line: message. Ends with a summary line. - """ - lines: List[str] = [] - total_errors = sum(len(r.errors) for r in results) - total_warnings = sum(len(r.warnings) for r in results) - total_rules = sum(r.rules_checked for r in results) - valid_files = sum(1 for r in results if r.is_valid) - - for result in results: - # Header: file path + ✓/✗ status badge. - status = "✓ valid" if result.is_valid else "✗ invalid" - lines.append(f"\n{result.rule_path} — {status} ({result.rules_checked} rules)") - lines.append("─" * 60) - - if not result.errors and not result.warnings: - lines.append(" No issues found.") - continue - - # Errors first (always surface the most important issues up top). - for issue in result.errors: - loc = f"line {issue.line}: " if issue.line else "" - lines.append(f" ✗ [{issue.category}] {loc}{issue.message}") - - # Then warnings. - for issue in result.warnings: - loc = f"line {issue.line}: " if issue.line else "" - lines.append(f" ⚠ [{issue.category}] {loc}{issue.message}") - - # Summary line — gives CI parsers and humans a one-line verdict. - lines.append("\n" + "=" * 60) - if total_errors > 0: - lines.append( - f"FAIL: {total_errors} error(s), {total_warnings} warning(s) " - f"across {len(results)} file(s), {total_rules} rule(s)" - ) - elif total_warnings > 0 and strict: - lines.append( - f"FAIL (--strict): {total_warnings} warning(s) treated as errors " - f"across {len(results)} file(s), {total_rules} rule(s)" - ) - elif total_warnings > 0: - lines.append( - f"PASS with warnings: {valid_files}/{len(results)} file(s) valid, " - f"{total_warnings} warning(s) (use --strict to fail on warnings)" - ) - else: - lines.append( - f"PASS: {len(results)}/{len(results)} file(s) valid, {total_rules} rule(s) checked" - ) - - return "\n".join(lines) - - -def _format_json(results: List[ValidationResult], strict: bool) -> str: - """Render validation results as JSON for CI / programmatic consumers.""" - payload: Dict[str, Any] = { - "status": "ok" if all(r.is_valid for r in results) else "error", - "strict": strict, - "exit_code": determine_exit_code(results, strict=strict), - "total_files": len(results), - "total_rules": sum(r.rules_checked for r in results), - "total_errors": sum(len(r.errors) for r in results), - "total_warnings": sum(len(r.warnings) for r in results), - "results": [r.to_dict() for r in results], - } - return json.dumps(payload, indent=2) - - -def execute(args, workspace): - """Execute the rule-validate command. - - Returns a dict (so the result flows through the standard CodeLens - output formatter) AND sets the process exit code via ``sys.exit`` so - CI pipelines get the correct 0/1/2 signal. - - Args: - args: Parsed argparse namespace with ``rule_path`` (list), - ``strict`` (bool), and ``json_output`` (bool). - workspace: Workspace root (unused — rule-validate is path-based). - - Returns: - Dict with ``status``, ``exit_code``, ``results``, and the rendered - ``output`` string (human or JSON). - """ - # Expand and deduplicate paths. ``args.rule_path`` is a list (nargs="+"). - # Each entry may be either a single rule file OR a directory of rule files - # (issue #97): when a directory is given, enumerate the ``*.yaml``/``*.yml`` - # rule files inside it rather than trying to ``open()`` the directory - # itself — ``read_text()`` on a directory raises ``IsADirectoryError`` on - # Linux/macOS and ``PermissionError`` on Windows. - paths: List[Path] = [] - seen: set = set() - for raw in args.rule_path: - # Expand ``~`` and resolve to absolute. We don't follow symlinks - # here — a missing file is reported as a validation error below. - p = Path(os.path.expanduser(raw)).resolve() - - if p.is_dir(): - # Directory → enumerate rule files inside it (recursive, matching - # rule-test's behavior). Skip ``.test.yaml``/``.test.yml`` (test - # fixtures with a different schema) and hidden/dotfiles. - for entry in sorted(p.rglob("*.y*ml")): - if entry.name.endswith((".test.yaml", ".test.yml")): - continue - if entry.name.startswith("."): - continue - if entry in seen: - continue - seen.add(entry) - paths.append(entry) - else: - if p in seen: - continue - seen.add(p) - paths.append(p) - - # Validate each path. Missing files produce a single-error result - # rather than crashing — the validator's ``_parse_yaml`` already - # handles ``OSError`` and records it as a yaml_syntax error. - results = validate_rule_files(paths) - - exit_code = determine_exit_code(results, strict=args.strict) - - if args.json_output: - output = _format_json(results, args.strict) - else: - output = _format_human(results, args.strict) - - # Print to stdout so the report is pipeable, then exit with the - # contract code. We use ``sys.exit`` from inside the command (rather - # than returning a sentinel) because rule-validate is fundamentally a - # CI gate — the exit code IS the result. - print(output) - sys.exit(exit_code) - - # Unreachable, but keeps the return-type contract honest for callers - # that import ``execute`` directly (e.g., tests). - return { - "status": "ok" if exit_code == 0 else "error", - "exit_code": exit_code, - "results": [r.to_dict() for r in results], - "output": output, - } - - -register_command( - "rule-validate", - "Validate rule YAML files for typos, schema errors, and unparseable patterns", - add_args, - execute, -) diff --git a/scripts/commands/scan.py b/scripts/commands/scan.py index 9a68a799..48933b36 100644 --- a/scripts/commands/scan.py +++ b/scripts/commands/scan.py @@ -79,6 +79,11 @@ def add_args(parser): ) parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help="Issue #195: comma-separated sub-analyses. " + "Choices: scan, init, rescan. Default: scan. " + "'init' writes .codelens config only. " + "'rescan' is an alias for scan --incremental.") parser.add_argument("--incremental", action="store_true", help="Only re-scan changed files") parser.add_argument("--plugins", nargs="*", default=None, @@ -130,8 +135,80 @@ def add_args(parser): "Additive — does not change default scan behavior.") +# Issue #195: sub-command dispatch table for the scan umbrella. +_SCAN_CHECKS = {"scan", "init", "rescan"} + + +def _dispatch_check(args, workspace, check_arg): + """Dispatch scan sub-analyses (init / rescan / scan) per --check. + + - ``scan`` : full scan (default; same as no --check) + - ``init`` : write .codelens config only (no scan) + - ``rescan`` : incremental scan (alias for scan --incremental) + """ + import sys + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _SCAN_CHECKS] + if invalid: + print( + f"[CodeLens] scan: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(sorted(_SCAN_CHECKS))}", + file=sys.stderr, + ) + sys.exit(1) + if not parts: + parts = ["scan"] + + results = [] + checks_failed = 0 + for check_name in parts: + try: + if check_name == "init": + from commands.init import cmd_init + sub_result = cmd_init(workspace) + elif check_name == "rescan": + sub_result = cmd_scan(workspace, incremental=True, + plugins=getattr(args, "plugins", None), + max_files=getattr(args, "max_files", None), + use_prefilter=getattr(args, "use_prefilter", True), + verbose=getattr(args, "verbose", False), + scan_stats=getattr(args, "scan_stats", False)) + else: # scan + sub_result = cmd_scan(workspace, getattr(args, "incremental", False), + plugins=getattr(args, "plugins", None), + max_files=getattr(args, "max_files", None), + use_prefilter=getattr(args, "use_prefilter", True), + verbose=getattr(args, "verbose", False), + scan_stats=getattr(args, "scan_stats", False)) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + except Exception as exc: + checks_failed += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print(f"[CodeLens] scan: --check {check_name} failed: {exc}", + file=sys.stderr) + + return { + "s": "ok" if checks_failed == 0 else "partial", + "st": {"checks_requested": len(parts), "checks_failed": checks_failed}, + "r": results, + } + + def execute(args, workspace): """Execute the scan command.""" + # Issue #195: --check dispatch for init / rescan sub-analyses. + check_arg = getattr(args, "check", None) + if check_arg: + return _dispatch_check(args, workspace, check_arg) + # --suggest-ignore short-circuits the normal scan flow. if getattr(args, 'suggest_ignore', False): return _run_suggest_ignore(workspace) diff --git a/scripts/commands/search.py b/scripts/commands/search.py index f33aca7d..9b3496cd 100644 --- a/scripts/commands/search.py +++ b/scripts/commands/search.py @@ -1,32 +1,122 @@ -"""Search command — Search code pattern across workspace.""" +"""search command — unified symbol/semantic/graph search (issue #195 consolidation). +Umbrella command that absorbs: + - symbols (exact symbol name lookup) + - semantic-query (TF-IDF semantic search by meaning) + - query-graph (Cypher-subset graph query) + - search (regex code search — the legacy behavior) + +Default mode is **semantic** (find symbols by meaning). Switch via --mode: + + codelens search "google auth" # semantic + codelens search "google auth" --mode symbol # exact name + codelens search "google auth" --mode regex # regex code + codelens search "MATCH (n) WHERE n.id CONTAINS x" --mode graph + +For raw Cypher pass-through (power user), prefer ``codelens graph ``. + +Output: ``{"s":"ok", "st":{...}, "r":[...]}`` shape. +""" + +import argparse import os -from search_engine import search_workspace -from registry import load_config +import sys +from typing import Any, Dict + from commands import register_command +_MODES = ("semantic", "symbol", "regex", "graph") + + def add_args(parser): - parser.add_argument("pattern", help="Regex pattern to search for") + """Add search-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Search modes (issue #195):\n" + " semantic TF-IDF semantic search by meaning (default)\n" + " symbol Exact symbol name lookup (fuzzy optional)\n" + " regex Regex code search across workspace files\n" + " graph Cypher-subset graph query (MATCH/WHERE/RETURN/LIMIT)\n" + "\n" + "Examples:\n" + " codelens search . \"google auth\" # semantic (default)\n" + " codelens search . \"google auth\" --mode symbol # exact symbol\n" + " codelens search . \"handleChange\" --mode regex # regex code search\n" + " codelens search . \"MATCH (n) WHERE n.id CONTAINS x\" --mode graph\n" + "\n" + "For raw Cypher pass-through, prefer ``codelens graph ``." + ) + parser.add_argument("pattern", help="Search query (semantic query, symbol name, regex, or Cypher)") parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--mode", default="semantic", choices=_MODES, + help=f"Search mode. Choices: {', '.join(_MODES)}. Default: semantic.") + # Regex-mode passthroughs (legacy `search` command flags). parser.add_argument("--type", dest="file_type", default=None, - help="File type filter (html, css, js, ts, tsx, rust, python, vue, svelte)") - parser.add_argument("--file", default=None, help="Filter by file path substring") - parser.add_argument("--max-results", type=int, default=200, help="Max results (default 200)") - parser.add_argument("--context", type=int, default=0, help="Context lines around match") - parser.add_argument("--ignore-case", action="store_true", help="Case-insensitive search") - parser.add_argument("--whole-word", action="store_true", help="Match whole words only") - parser.add_argument("--limit", type=int, default=20, - help="Max matches to return after pagination (default: 20). " - "Use --limit 0 for unlimited (still bounded by --max-results).") + help="regex mode: file type filter (html, css, js, ts, tsx, rust, python, vue, svelte)") + parser.add_argument("--file", default=None, + help="regex mode: filter by file path substring") + parser.add_argument("--max-results", type=int, default=200, + help="regex mode: max results (default 200)") + parser.add_argument("--context", type=int, default=0, + help="regex mode: context lines around match") + parser.add_argument("--ignore-case", action="store_true", + help="regex mode: case-insensitive search") + parser.add_argument("--whole-word", action="store_true", + help="regex mode: match whole words only") + # Symbol-mode passthroughs. + parser.add_argument("--domain", default=None, + help="symbol mode: frontend|backend|all (default all)") + parser.add_argument("--fuzzy", action="store_true", default=False, + help="symbol mode: fuzzy name matching") + # Semantic-mode passthroughs. + parser.add_argument("--top", type=int, default=None, metavar="N", + help="semantic/symbol mode: limit to top N results") + # Graph-mode passthroughs. + parser.add_argument("--validate", action="store_true", default=False, + help="graph mode: validate query without executing") + parser.add_argument("--limit", type=int, default=None, + help="Result limit (applies in all modes)") parser.add_argument("--offset", type=int, default=0, - help="Offset for pagination (default: 0)") + help="regex/symbol mode: pagination offset (default: 0)") + parser.add_argument("--db-path", default=None, + help="Custom SQLite database path (semantic/graph modes)") -def execute(args, workspace): +def _run_semantic(args, workspace) -> Dict[str, Any]: + from semantic_search_engine import semantic_query + top_k = getattr(args, "top", None) or 10 + result = semantic_query(workspace=workspace, query=args.pattern, + top_k=top_k, db_path=getattr(args, "db_path", None)) + return result + + +def _run_symbol(args, workspace) -> Dict[str, Any]: + from symbols_engine import search_symbols + domain = getattr(args, "domain", None) or "all" + fuzzy = getattr(args, "fuzzy", False) + limit = getattr(args, "limit", None) or 20 + offset = getattr(args, "offset", 0) + result = search_symbols(workspace, args.pattern, domain=domain, + fuzzy=fuzzy, max_results=500) + if isinstance(result, dict) and "results" in result: + items = result["results"] + total = len(items) + end = offset + (limit if limit and limit > 0 else total) + result["results"] = items[offset:end] + result["total_count"] = total + result["count"] = len(result["results"]) + result["offset"] = offset + result["limit"] = limit + result["has_more"] = end < total + return result + + +def _run_regex(args, workspace) -> Dict[str, Any]: + from search_engine import search_workspace + from registry import load_config config = load_config(os.path.abspath(workspace)) - # Resolve limit: --top is an alias for --limit (per issue #17 spec). top_n = getattr(args, 'top', None) if top_n is not None and getattr(args, 'limit', None) is None: args.limit = top_n @@ -38,9 +128,8 @@ def execute(args, workspace): context_lines=args.context, case_sensitive=not args.ignore_case, whole_word=args.whole_word, - config=config + config=config, ) - # Apply pagination to matches (issue #17). if isinstance(result, dict) and "matches" in result: matches = result["matches"] total = len(matches) @@ -56,4 +145,68 @@ def execute(args, workspace): return result -register_command("search", "Search code pattern across workspace", add_args, execute) +def _run_graph(args, workspace) -> Dict[str, Any]: + from commands.query_graph import execute as _qg_execute + # query_graph.execute reads: query, workspace, limit, validate, db_path + sub_args = argparse.Namespace( + query=args.pattern, + workspace=getattr(args, "workspace", None), + limit=getattr(args, "limit", None), + validate=getattr(args, "validate", False), + db_path=getattr(args, "db_path", None), + format=getattr(args, "format", None), + top=None, max_tokens=None, lite=False, deep=False, + diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + return _qg_execute(sub_args, workspace) + + +def execute(args, workspace): + """Dispatch to the selected search mode and normalize output shape. + + @FLOW: SEARCH_DISPATCH + @CALLS: _run_semantic() | _run_symbol() | _run_regex() | _run_graph() -> dict + @MUTATES: nothing (read-only) + """ + mode = getattr(args, "mode", "semantic") or "semantic" + try: + if mode == "semantic": + result = _run_semantic(args, workspace) + elif mode == "symbol": + result = _run_symbol(args, workspace) + elif mode == "regex": + result = _run_regex(args, workspace) + elif mode == "graph": + result = _run_graph(args, workspace) + else: + return {"s": "error", "st": {"mode": mode}, "r": [], + "error": f"unknown mode '{mode}'"} + except Exception as exc: + return {"s": "error", "st": {"mode": mode}, + "r": [], "error": str(exc), + "error_type": type(exc).__name__} + + # Normalize to {s, st, r} shape while preserving original payload. + if not isinstance(result, dict): + return {"s": "ok", "st": {"mode": mode}, "r": [{"result": result}]} + status = result.pop("status", "ok") + # Move large payload lists into ``r`` if present, keep stats in ``st``. + rows = None + for key in ("matches", "results", "rows", "findings"): + if key in result and isinstance(result[key], list): + rows = result.pop(key) + break + return { + "s": status, + "st": {"mode": mode, **result}, + "r": rows if rows is not None else [], + } + + +register_command( + "search", + "Unified search: semantic (default) / symbol / regex / graph (issue #195)", + add_args, + execute, +) diff --git a/scripts/commands/secrets.py b/scripts/commands/secrets.py index 8e6e4b98..72cac4fd 100644 --- a/scripts/commands/secrets.py +++ b/scripts/commands/secrets.py @@ -86,4 +86,10 @@ def execute(args, workspace): return result -register_command("secrets", "Detect hardcoded secrets and API keys", add_args, execute) +register_command("secrets", "Detect hardcoded secrets and API keys", add_args, execute, + +hidden=True, + +deprecated_alias_for='security', + +) diff --git a/scripts/commands/security.py b/scripts/commands/security.py new file mode 100644 index 00000000..8e7c1f1f --- /dev/null +++ b/scripts/commands/security.py @@ -0,0 +1,210 @@ +"""security command — vulnerability & secret scanning (issue #195 consolidation). + +Umbrella command that absorbs: + - secrets Hardcoded secrets and API keys + - vuln-scan Dependency CVE scan (OSV.dev + native audit) + - taint AST-based taint analysis + - binary-scan Binary/compiled artifact reverse-engineering + - regex-audit Regex ReDoS and issue audit + +Usage: + codelens security # all checks + codelens security --check secrets # only secrets + codelens security --check taint,vuln-scan # pick subset + codelens security --check binary-scan --deep + +Output: ``{"s":"ok", "st":{...}, "r":[...]}``. +""" + +# @WHO: scripts/commands/security.py +# @WHAT: Umbrella command for security/vulnerability scans. +# @PART: commands +# @ENTRY: execute() + +import argparse +import importlib +import sys +from typing import Any, Dict, List + +from commands import register_command + + +_CHECKS = { + "secrets": { + "module": "commands.secrets", + "help": "Hardcoded secrets and API keys", + }, + "vuln-scan": { + "module": "commands.vuln_scan", + "help": "Dependency CVE scan (OSV.dev + native audit)", + }, + "taint": { + "module": "commands.taint", + "help": "AST-based taint analysis", + }, + "binary-scan": { + "module": "commands.binary_scan", + "help": "Binary/compiled artifact reverse-engineering", + }, + "regex-audit": { + "module": "commands.regex_audit", + "help": "Regex ReDoS and issue audit", + }, +} + +ALL_CHECKS = list(_CHECKS.keys()) + + +def add_args(parser): + """Add security-specific arguments to the parser.""" + parser.formatter_class = argparse.RawDescriptionHelpFormatter + parser.epilog = ( + "Sub-analyses (issue #195):\n" + " secrets Hardcoded secrets and API keys\n" + " vuln-scan Dependency CVE scan (OSV.dev + native audit)\n" + " taint AST-based taint analysis\n" + " binary-scan Binary/compiled artifact reverse-engineering\n" + " regex-audit Regex ReDoS and issue audit\n" + "\n" + "Examples:\n" + " codelens security . # all checks\n" + " codelens security . --check secrets # only secrets\n" + " codelens security . --check taint,vuln-scan # pick subset\n" + " codelens security . --check binary-scan --deep # deep binary scan\n" + ) + parser.add_argument("workspace", nargs="?", default=None, + help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help=f"Comma-separated sub-analyses. " + f"Choices: {', '.join(ALL_CHECKS)}. Default: all.") + # Common passthroughs. + parser.add_argument("--max-files", type=int, default=None, + help="secrets/regex-audit: file cap") + parser.add_argument("--severity", default=None, + help="secrets/vuln-scan/regex-audit/taint: severity filter") + parser.add_argument("--no-gitleaks", action="store_true", default=False, + help="secrets: force regex fallback (skip gitleaks)") + parser.add_argument("--language", default=None, + help="taint: python|javascript|typescript") + parser.add_argument("--with-secrets", action="store_true", default=False, + help="taint: include secret-leak findings") + parser.add_argument("--cross-file", action="store_true", default=False, + help="taint: enable cross-file analysis") + parser.add_argument("--no-ast", action="store_true", default=False, + help="taint: use semantic engine instead of AST") + parser.add_argument("--ast", action="store_true", default=False, + help="taint: force AST engine") + parser.add_argument("--deep", action="store_true", default=False, + help="binary-scan: parse source maps + extract WASM exports/imports") + parser.add_argument("--offline", action="store_true", default=False, + help="vuln-scan: skip OSV API calls, use cached results") + parser.add_argument("--refresh", action="store_true", default=False, + help="vuln-scan: force-refresh the OSV cache") + parser.add_argument("--osv-ttl", type=int, default=None, + help="vuln-scan: OSV cache TTL in seconds") + parser.add_argument("--max-age", default=None, + help="vuln-scan: max cache age (e.g. 6h, 30m, 2d)") + + +def _parse_checks(check_arg: str) -> List[str]: + if not check_arg: + return list(ALL_CHECKS) + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _CHECKS] + if invalid: + print( + f"[CodeLens] security: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(ALL_CHECKS)}", + file=sys.stderr, + ) + sys.exit(1) + return parts or list(ALL_CHECKS) + + +def _build_namespace(base_args, check_name: str) -> argparse.Namespace: + ns = argparse.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "secrets": + ns.severity = getattr(base_args, "severity", None) + ns.max_files = getattr(base_args, "max_files", None) or 5000 + ns.no_gitleaks = getattr(base_args, "no_gitleaks", False) + elif check_name == "vuln-scan": + ns.severity = getattr(base_args, "severity", None) + ns.offline = getattr(base_args, "offline", False) + ns.osv_ttl = getattr(base_args, "osv_ttl", None) or 86400 + ns.refresh = getattr(base_args, "refresh", False) + ns.max_age = getattr(base_args, "max_age", None) + elif check_name == "taint": + ns.language = getattr(base_args, "language", None) + ns.with_secrets = getattr(base_args, "with_secrets", False) + ns.severity = getattr(base_args, "severity", None) + ns.cross_file = getattr(base_args, "cross_file", False) + ns.no_ast = getattr(base_args, "no_ast", False) + ns.ast = getattr(base_args, "ast", False) + elif check_name == "binary-scan": + # binary-scan reads args.deep via getattr default False, so carry it. + ns.deep = getattr(base_args, "deep", False) + elif check_name == "regex-audit": + ns.severity = getattr(base_args, "severity", None) + ns.max_files = getattr(base_args, "max_files", None) or 1000 + return ns + + +def execute(args, workspace): + """Run one or more security checks and merge results. + + @FLOW: SECURITY_DISPATCH + @CALLS: _parse_checks() -> List[str] + _build_namespace() -> argparse.Namespace + commands..execute() -> dict per sub + @MUTATES: OSV cache (vuln-scan may refresh it) + """ + checks = _parse_checks(getattr(args, "check", None)) + results: List[Dict[str, Any]] = [] + stats: Dict[str, Any] = {"checks_run": 0, "checks_failed": 0} + + for check_name in checks: + spec = _CHECKS[check_name] + try: + mod = importlib.import_module(spec["module"]) + sub_args = _build_namespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + stats["checks_run"] += 1 + except Exception as exc: + stats["checks_failed"] += 1 + stats["checks_run"] += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + print( + f"[CodeLens] security: --check {check_name} failed: {exc}", + file=sys.stderr, + ) + + return { + "s": "ok" if stats["checks_failed"] == 0 else "partial", + "st": { + "checks_requested": len(checks), + **stats, + }, + "r": results, + } + + +register_command( + "security", + "Security scans: secrets / vuln-scan / taint / binary-scan / regex-audit", + add_args, + execute, +) diff --git a/scripts/commands/self_analyze.py b/scripts/commands/self_analyze.py deleted file mode 100644 index ed620b25..00000000 --- a/scripts/commands/self_analyze.py +++ /dev/null @@ -1,202 +0,0 @@ -"""Self-analyze command — Run CodeLens on its own codebase for meta-analysis. - -This is a fun meta-command that shows CodeLens eating its own dog food. -It runs multiple analysis commands on the CodeLens source tree itself -and presents a consolidated health report. -""" - -import os -import sys -import argparse -from typing import Any, Dict - - -def add_args(parser: argparse.ArgumentParser) -> None: - """Add self-analyze-specific arguments.""" - parser.add_argument( - "--quick", action="store_true", default=False, - help="Quick mode: only run fast analyses" - ) - parser.add_argument( - "--focus", choices=["smell", "security", "complexity", "dead-code", "all"], - default="all", help="Focus area for self-analysis" - ) - - -def execute(args: argparse.Namespace, workspace: str) -> Dict[str, Any]: - """Run CodeLens on its own codebase.""" - # Determine the CodeLens source directory - scripts_dir = os.path.dirname(os.path.abspath(__file__)) - codelens_root = os.path.dirname(scripts_dir) - - results = { - "status": "ok", - "command": "self-analyze", - "workspace": codelens_root, - "focus": args.focus if hasattr(args, 'focus') else 'all', - "analyses": {}, - } - - analyses_to_run = _get_analyses(args) - - for analysis_name, analysis_fn in analyses_to_run: - try: - result = analysis_fn(codelens_root, quick=getattr(args, 'quick', False)) - results["analyses"][analysis_name] = result - except Exception as e: - results["analyses"][analysis_name] = { - "status": "error", - "error": str(e), - } - - # Compute overall health - results["overall"] = _compute_overall_health(results["analyses"]) - results["dogfood"] = True # 😋 We're eating our own dog food - - return results - - -def _get_analyses(args: argparse.Namespace): - """Get list of analyses to run based on focus.""" - focus = getattr(args, 'focus', 'all') - all_analyses = [ - ("smell", _run_smell), - ("complexity", _run_complexity), - ("dead-code", _run_dead_code), - ("security", _run_security), - ("circular", _run_circular), - ] - - if focus == "all": - return all_analyses - - return [(name, fn) for name, fn in all_analyses if name == focus] - - -def _run_smell(workspace: str, quick: bool = False) -> Dict[str, Any]: - """Run smell analysis on self.""" - try: - from smell_engine import detect_smells - max_files = 500 if quick else 3000 - return detect_smells(workspace, max_files=max_files) - except Exception as e: - return {"status": "error", "error": str(e)} - - -def _run_complexity(workspace: str, quick: bool = False) -> Dict[str, Any]: - """Run complexity analysis on self.""" - try: - from complexity_engine import compute_complexity - max_files = 500 if quick else 3000 - return compute_complexity(workspace, max_files=max_files) - except Exception as e: - return {"status": "error", "error": str(e)} - - -def _run_dead_code(workspace: str, quick: bool = False) -> Dict[str, Any]: - """Run dead-code detection on self.""" - try: - from deadcode_engine import detect_dead_code - max_files = 500 if quick else 3000 - return detect_dead_code(workspace, max_files=max_files) - except Exception as e: - return {"status": "error", "error": str(e)} - - -def _run_security(workspace: str, quick: bool = False) -> Dict[str, Any]: - """Run security analysis (secrets + taint) on self.""" - results = {} - try: - from secrets_engine import detect_secrets - results["secrets"] = detect_secrets(workspace, max_files=3000) - except Exception as e: - results["secrets"] = {"status": "error", "error": str(e)} - - try: - from semantic_engine import analyze_workspace - results["taint"] = analyze_workspace(workspace, language="python") - except Exception as e: - results["taint"] = {"status": "error", "error": str(e)} - - return results - - -def _run_circular(workspace: str, quick: bool = False) -> Dict[str, Any]: - """Run circular dependency detection on self.""" - try: - from circular_engine import detect_circular - return detect_circular(workspace) - except Exception as e: - return {"status": "error", "error": str(e)} - - -def _compute_overall_health(analyses: Dict[str, Any]) -> Dict[str, Any]: - """Compute an overall health assessment from all analysis results.""" - health_score = 100 - issues = [] - - # Deduct for smells - smell = analyses.get("smell", {}) - if isinstance(smell, dict) and smell.get("status") != "error": - hs = smell.get("health_score", 100) - health_score = min(health_score, hs) - findings = smell.get("total_findings", 0) - if findings > 0: - issues.append(f"Smell: {findings} code smell(s) detected") - - # Deduct for complexity - complexity = analyses.get("complexity", {}) - if isinstance(complexity, dict) and complexity.get("status") != "error": - stats = complexity.get("stats", {}) - high = stats.get("high_complexity", 0) - if high > 0: - health_score = min(health_score, max(0, health_score - high * 2)) - issues.append(f"Complexity: {high} high-complexity function(s)") - - # Deduct for dead code - dead_code = analyses.get("dead-code", {}) - if isinstance(dead_code, dict) and dead_code.get("status") != "error": - stats = dead_code.get("stats", {}) - dead = stats.get("total_dead_code", 0) - if dead > 10: - health_score = min(health_score, max(0, health_score - (dead - 10))) - issues.append(f"Dead code: {dead} unused item(s)") - - # Deduct for security issues - security = analyses.get("security", {}) - if isinstance(security, dict): - secrets = security.get("secrets", {}) - if isinstance(secrets, dict) and secrets.get("status") != "error": - found = secrets.get("stats", {}).get("total_secrets", 0) - if found > 0: - health_score = min(health_score, max(0, health_score - found * 10)) - issues.append(f"Security: {found} potential secret(s) found") - - # Grade - if health_score >= 90: - grade = "A" - elif health_score >= 75: - grade = "B" - elif health_score >= 60: - grade = "C" - elif health_score >= 40: - grade = "D" - else: - grade = "F" - - return { - "health_score": health_score, - "grade": grade, - "issues": issues, - "total_analyses": len(analyses), - } - - -from commands import register_command - -register_command( - "self-analyze", - "Run CodeLens on its own codebase (dogfooding / meta-analysis)", - add_args, - execute, -) diff --git a/scripts/commands/semantic_query.py b/scripts/commands/semantic_query.py index de363926..9d8a4a3b 100644 --- a/scripts/commands/semantic_query.py +++ b/scripts/commands/semantic_query.py @@ -63,4 +63,6 @@ def execute(args, workspace) -> Dict[str, Any]: "Semantic symbol search via TF-IDF (find symbols by meaning, not just name)", add_args, execute, +hidden=True, +deprecated_alias_for='search', ) diff --git a/scripts/commands/serve.py b/scripts/commands/serve.py deleted file mode 100644 index 9fa39d3a..00000000 --- a/scripts/commands/serve.py +++ /dev/null @@ -1,66 +0,0 @@ -"""Serve command — Start CodeLens MCP server for AI agent integration. - -Provides persistent JSON-RPC server mode over stdio (MCP protocol). -AI agents can connect and call any CodeLens command as an MCP tool -without the cold-start overhead of spawning a new process each time. - -Usage: - codelens serve # Start MCP server (stdio transport) - codelens serve --watch # Auto-watch mode for live updates - codelens serve --port 8080 # HTTP/SSE transport (optional) - codelens serve --config # Print MCP client configuration -""" - -import os -import json -from commands import register_command - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted, used for --watch)") - parser.add_argument("--watch", action="store_true", default=False, - help="Auto-watch files for live registry updates") - parser.add_argument("--port", type=int, default=None, - help="HTTP/SSE port for remote access (in addition to stdio)") - parser.add_argument("--config", action="store_true", default=False, - help="Print MCP client configuration for Claude Desktop, Cursor, etc.") - - -def execute(args, workspace): - """Execute the serve command. Starts the MCP server.""" - if getattr(args, 'config', False): - # --config is a quick one-shot command, print directly and exit - import sys - from mcp_server import generate_mcp_config - config = generate_mcp_config() - result = { - "status": "ok", - "message": "MCP client configurations generated. Add to your AI tool's config file.", - "configs": config - } - print(json.dumps(result, indent=2, ensure_ascii=False)) - sys.exit(0) - - from mcp_server import run_mcp_server - watch = getattr(args, 'watch', False) - port = getattr(args, 'port', None) - - # This is a long-running command — it takes over the process - run_mcp_server(watch=watch, port=port) - - return {"status": "stopped"} - - -def _print_config(): - """Print MCP client configuration for popular AI tools.""" - from mcp_server import generate_mcp_config - config = generate_mcp_config() - return { - "status": "ok", - "message": "MCP client configurations generated. Add to your AI tool's config file.", - "configs": config - } - - -register_command("serve", "Start MCP server for AI agent integration (JSON-RPC over stdio)", add_args, execute) diff --git a/scripts/commands/sessions.py b/scripts/commands/sessions.py deleted file mode 100644 index b695d1cc..00000000 --- a/scripts/commands/sessions.py +++ /dev/null @@ -1,407 +0,0 @@ -"""CodeLens ``sessions`` command — installer session log viewer (issue #64, Phase 2). - -What this command does ------------------------ -``codelens sessions`` reads the install-session log that ``setup.sh`` -appends to on every run, and displays the last N sessions in a -human-readable format (or JSON for programmatic access). - -The session log lives at ``~/.codelens/session.md`` (Markdown) with a -JSON sidecar at ``~/.codelens/session.json`` for structured access. -Each session entry records: - -* Timestamp (ISO 8601, UTC) -* Duration (seconds) -* Python / OS / arch -* Detected agents (Claude Code, Cursor, etc.) with ✓/✗ -* Configured integrations -* Dependencies installed -* Warnings and errors - -This is the "what happened last time I ran setup?" debugging tool — -useful when an install partially fails and the user wants to see -what changed. - -Why a separate command (not just ``cat ~/.codelens/session.md``)? ------------------------------------------------------------------ -* **Filtering** — ``--entries N`` shows the last N sessions without - dumping the whole file. -* **Rotation** — when the log exceeds 1 MB, the oldest sessions are - trimmed automatically (keep last 50). -* **JSON output** — ``--json`` parses the sidecar and emits a - structured array for CI / programmatic consumption. -* **Custom config dir** — ``--config-dir`` lets users inspect a - non-default ``~/.codelens/`` location (useful for debugging - containerized installs). -* **Doctor integration** — ``doctor`` can call ``sessions --json`` - to surface "last install had 2 warnings" in its diagnostic. - -What Phase 2 deliberately does NOT do -------------------------------------- -* It does not parse legacy install logs from before this feature - shipped (those entries simply don't exist). -* It does not sync sessions across machines (that's a future - cloud-sync feature, out of scope). -* It does not write sessions — only ``setup.sh`` writes them. This - command is read-only. -""" - -from __future__ import annotations - -import json -import os -import re -import sys -import time -from datetime import datetime, timezone -from typing import Any, Dict, List, Optional - -from commands import register_command - -# ─── Constants ───────────────────────────────────────────────── - -# Default location of the session log. Override per-call with -# ``--config-dir``. -DEFAULT_CONFIG_DIR = os.path.expanduser("~/.codelens") -SESSION_MD_FILENAME = "session.md" -SESSION_JSON_FILENAME = "session.json" - -# Rotation thresholds — when the JSON sidecar exceeds this size, -# trim to the most recent N entries. -MAX_LOG_SIZE_BYTES = 1 * 1024 * 1024 # 1 MB -MAX_SESSIONS_AFTER_ROTATION = 50 - -# Default number of sessions to display. -DEFAULT_ENTRIES = 5 - - -# ─── Session log paths ───────────────────────────────────────── - - -def _session_paths(config_dir: Optional[str]) -> tuple[str, str]: - """Return ``(md_path, json_path)`` for the given config dir. - - Falls back to :data:`DEFAULT_CONFIG_DIR` when ``config_dir`` is - None or empty. - """ - base = config_dir or DEFAULT_CONFIG_DIR - return ( - os.path.join(base, SESSION_MD_FILENAME), - os.path.join(base, SESSION_JSON_FILENAME), - ) - - -# ─── Session log reading ─────────────────────────────────────── - - -def _load_json_sessions(json_path: str) -> List[Dict[str, Any]]: - """Load the JSON sidecar and return the sessions list. - - Returns an empty list if the file doesn't exist, is empty, or - fails to parse. A corrupted sidecar must NOT crash the command — - we degrade gracefully to "no sessions found" and let the user - inspect the raw Markdown file with ``--raw``. - """ - if not os.path.exists(json_path): - return [] - try: - with open(json_path, "r", encoding="utf-8") as f: - data = json.load(f) - if isinstance(data, list): - return data - if isinstance(data, dict) and "sessions" in data: - # Future-proof: support a wrapped schema where the - # top-level object has a ``sessions`` key. - sessions = data["sessions"] - return sessions if isinstance(sessions, list) else [] - return [] - except (OSError, json.JSONDecodeError): - return [] - - -def _parse_md_sessions(md_path: str) -> List[Dict[str, Any]]: - """Parse the Markdown log into a list of session dicts. - - Each session in the Markdown log is delimited by a level-2 - heading (``##``). The heading line contains the ISO 8601 - timestamp. Subsequent lines until the next ``##`` are the - session body (key-value pairs in ``- **key**: value`` format). - - This is a best-effort parser — the JSON sidecar is the source of - truth; the Markdown is for human reading. We only fall back to - parsing the Markdown when the sidecar is missing or empty. - """ - if not os.path.exists(md_path): - return [] - try: - with open(md_path, "r", encoding="utf-8") as f: - content = f.read() - except OSError: - return [] - - sessions: List[Dict[str, Any]] = [] - current: Optional[Dict[str, Any]] = None - current_body: List[str] = [] - - for line in content.splitlines(): - # A level-2 heading starts a new session. - if line.startswith("## "): - # Flush the previous session. - if current is not None: - current["body"] = "\n".join(current_body).strip() - sessions.append(current) - # Parse the heading: ``## 2026-06-28T09:14:31Z — setup`` - heading = line[3:].strip() - # Split on em-dash if present (the convention from setup.sh). - parts = re.split(r"\s+[—-]\s+", heading, maxsplit=1) - timestamp_str = parts[0].strip() - title = parts[1].strip() if len(parts) > 1 else "session" - current = { - "timestamp": timestamp_str, - "title": title, - "raw_heading": heading, - } - current_body = [] - elif current is not None: - current_body.append(line) - - # Flush the last session. - if current is not None: - current["body"] = "\n".join(current_body).strip() - sessions.append(current) - - return sessions - - -def _load_sessions(config_dir: Optional[str]) -> List[Dict[str, Any]]: - """Load sessions, preferring the JSON sidecar. - - Falls back to parsing the Markdown log if the sidecar is missing - or empty. Returns an empty list if neither exists. - """ - md_path, json_path = _session_paths(config_dir) - sessions = _load_json_sessions(json_path) - if sessions: - return sessions - return _parse_md_sessions(md_path) - - -# ─── Rotation ────────────────────────────────────────────────── - - -def _maybe_rotate(config_dir: Optional[str]) -> bool: - """Trim the session log if it exceeds :data:`MAX_LOG_SIZE_BYTES`. - - Returns ``True`` if rotation happened, ``False`` otherwise. - Rotation keeps the most recent :data:`MAX_SESSIONS_AFTER_ROTATION` - sessions in both the JSON sidecar and the Markdown log. - - This is called automatically on every ``sessions`` invocation — - no need for a separate cron job. - """ - md_path, json_path = _session_paths(config_dir) - # Check the JSON sidecar size (the smaller of the two; if it's - # over the threshold, the Markdown is definitely over too). - try: - json_size = os.path.getsize(json_path) if os.path.exists(json_path) else 0 - except OSError: - json_size = 0 - if json_size < MAX_LOG_SIZE_BYTES: - return False - - sessions = _load_json_sessions(json_path) - if len(sessions) <= MAX_SESSIONS_AFTER_ROTATION: - return False - - # Keep the most recent N sessions. Sessions are appended in - # chronological order, so the most recent are at the end. - trimmed = sessions[-MAX_SESSIONS_AFTER_ROTATION:] - try: - with open(json_path, "w", encoding="utf-8") as f: - json.dump(trimmed, f, indent=2, ensure_ascii=False) - except OSError: - return False - - # Rewrite the Markdown log with only the trimmed sessions. - # We don't try to reconstruct the original Markdown formatting - # exactly — we just emit a fresh, valid log from the JSON data. - try: - with open(md_path, "w", encoding="utf-8") as f: - f.write("# CodeLens install sessions\n\n") - for s in trimmed: - ts = s.get("timestamp", "unknown") - title = s.get("title", "session") - f.write(f"## {ts} — {title}\n\n") - body = s.get("body") - if body: - f.write(body + "\n\n") - else: - # Emit structured fields if no body. - for k, v in s.items(): - if k in ("timestamp", "title", "raw_heading", "body"): - continue - f.write(f"- **{k}**: {v}\n") - f.write("\n") - except OSError: - pass - return True - - -# ─── Output formatting ───────────────────────────────────────── - - -def _format_text(sessions: List[Dict[str, Any]], entries: int) -> str: - """Format the last N sessions as a human-readable text report.""" - if not sessions: - return "No install sessions found. Run `bash setup.sh` to record one." - # Take the last N sessions (most recent first for display). - recent = sessions[-entries:] if entries > 0 else sessions - recent = list(reversed(recent)) # most recent first - - lines: List[str] = [] - lines.append(f"CodeLens sessions — showing {len(recent)} of {len(sessions)} total") - lines.append("=" * 60) - for i, s in enumerate(recent, 1): - ts = s.get("timestamp", "?") - title = s.get("title", "session") - lines.append(f"\n[{i}] {ts} — {title}") - # Show structured fields if present. - for key in ("duration_sec", "python", "os", "arch", "agents_detected", - "integrations_configured", "deps_installed", "warnings", "errors"): - if key in s: - val = s[key] - if isinstance(val, (list, dict)): - val = json.dumps(val, ensure_ascii=False) - lines.append(f" {key}: {val}") - # If there's a body (from MD parsing), show a truncated version. - body = s.get("body") - if body and not any(k in s for k in ("duration_sec", "python")): - # No structured fields — show first 5 lines of body. - body_lines = body.splitlines()[:5] - for bl in body_lines: - if bl.strip(): - lines.append(f" {bl}") - if len(body.splitlines()) > 5: - lines.append(f" ... ({len(body.splitlines()) - 5} more lines)") - lines.append("\n" + "=" * 60) - lines.append(f"Total sessions logged: {len(sessions)}") - return "\n".join(lines) - - -def _format_raw(md_path: str) -> str: - """Return the raw Markdown log content verbatim.""" - if not os.path.exists(md_path): - return f"Session log not found at {md_path}" - try: - with open(md_path, "r", encoding="utf-8") as f: - return f.read() - except OSError as exc: - return f"Failed to read {md_path}: {exc}" - - -# ─── CLI plumbing ────────────────────────────────────────────── - - -def add_args(parser): - """Register sessions-specific arguments.""" - parser.add_argument( - "workspace", - nargs="?", - default=None, - help="Ignored (sessions is global). Accepted for CLI consistency.", - ) - parser.add_argument( - "--entries", - type=int, - default=DEFAULT_ENTRIES, - help=f"Number of recent sessions to display (default: {DEFAULT_ENTRIES}). Use 0 for all.", - ) - parser.add_argument( - "--raw", - action="store_true", - default=False, - help="Print the raw Markdown log verbatim (no formatting).", - ) - parser.add_argument( - "--config-dir", - default=None, - help=f"Custom config dir (default: {DEFAULT_CONFIG_DIR}).", - ) - parser.add_argument( - "--json", - dest="json_output", - action="store_true", - default=False, - help="Output as JSON array (for programmatic access).", - ) - - -def execute(args, workspace): - """Read the session log, optionally rotate, return result dict. - - The result always includes: - - * ``status`` — "ok" | "error" - * ``config_dir`` — resolved config dir - * ``total_sessions`` — count of sessions in the log - * ``returned_sessions`` — count actually returned (after ``--entries``) - * ``sessions`` — list of session dicts (the data) - * ``rotated`` — bool, whether rotation happened during this call - * ``raw`` — the raw Markdown content (only if ``--raw``) - """ - config_dir = getattr(args, "config_dir", None) or DEFAULT_CONFIG_DIR - entries = getattr(args, "entries", DEFAULT_ENTRIES) - raw_mode = bool(getattr(args, "raw", False)) - json_mode = bool(getattr(args, "json_output", False)) - - md_path, json_path = _session_paths(config_dir) - - # Rotate if needed (side effect — but safe and idempotent). - rotated = _maybe_rotate(config_dir) - - sessions = _load_sessions(config_dir) - - # Apply --entries limit (0 = all). - if entries > 0: - display = sessions[-entries:] - else: - display = sessions - - result: Dict[str, Any] = { - "status": "ok", - "config_dir": config_dir, - "md_path": md_path, - "json_path": json_path, - "total_sessions": len(sessions), - "returned_sessions": len(display), - "sessions": display, - "rotated": rotated, - } - - if raw_mode: - result["raw"] = _format_raw(md_path) - # In raw mode, print the raw content directly and signal to - # the dispatcher that we've already printed. - print(result["raw"]) - result["_sessions_printed_text"] = True - elif json_mode: - # In JSON mode, print just the sessions array (so ``jq`` etc. - # can pipe it cleanly). The full result dict is still - # returned for the dispatcher, but we override the printed - # output here. - print(json.dumps(display, indent=2, ensure_ascii=False)) - result["_sessions_printed_text"] = True - else: - # Default text mode — print the human-readable report. - print(_format_text(sessions, entries)) - result["_sessions_printed_text"] = True - - return result - - -register_command( - "sessions", - "View recent install sessions (from setup.sh session log)", - add_args, - execute, -) diff --git a/scripts/commands/side_effect.py b/scripts/commands/side_effect.py index 9cd53b3a..ee50284f 100644 --- a/scripts/commands/side_effect.py +++ b/scripts/commands/side_effect.py @@ -22,4 +22,10 @@ def execute(args, workspace): ) -register_command("side-effect", "Analyze function side effects (pure vs impure)", add_args, execute) +register_command("side-effect", "Analyze function side effects (pure vs impure)", add_args, execute, + +hidden=True, + +deprecated_alias_for='audit', + +) diff --git a/scripts/commands/smell.py b/scripts/commands/smell.py index 75d97f97..16e42fcc 100644 --- a/scripts/commands/smell.py +++ b/scripts/commands/smell.py @@ -42,4 +42,10 @@ def execute(args, workspace): return result -register_command("smell", "Detect code smells across workspace", add_args, execute) +register_command("smell", "Detect code smells across workspace", add_args, execute, + +hidden=True, + +deprecated_alias_for='audit', + +) diff --git a/scripts/commands/stack_trace.py b/scripts/commands/stack_trace.py deleted file mode 100644 index 0038e5b3..00000000 --- a/scripts/commands/stack_trace.py +++ /dev/null @@ -1,31 +0,0 @@ -"""Stack-trace command — Error propagation simulation.""" - -from stacktrace_engine import trace_error_propagation -from commands import register_command - - -def add_args(parser): - parser.add_argument("name", help="Function name that might throw") - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--error-type", default=None, help="Error type (e.g., TypeError)") - parser.add_argument("--depth", type=int, default=20, help="Max trace depth (default 20)") - - -def execute(args, workspace): - # Validate: if 'name' looks like a path, it was likely meant as workspace - name = args.name - if name and (os.path.isabs(name) or name.startswith('./') or name.startswith('../')): - # User probably omitted function name and passed workspace as first arg - # Swap: use name as workspace, leave function name empty - workspace = name - name = "" - return trace_error_propagation( - name, workspace, - error_type=args.error_type, - max_depth=args.depth - ) - - -import os -register_command("stack-trace", "Error propagation simulation", add_args, execute) diff --git a/scripts/commands/staleness.py b/scripts/commands/staleness.py index 63ac327b..8a7033fa 100644 --- a/scripts/commands/staleness.py +++ b/scripts/commands/staleness.py @@ -144,4 +144,6 @@ def execute(args: argparse.Namespace, workspace: str) -> Dict[str, Any]: "List files whose index entry is stale (issue #66 Phase 1)", add_args, execute, +hidden=True, +deprecated_alias_for='audit', ) diff --git a/scripts/commands/state_map.py b/scripts/commands/state_map.py index 9da3b4a0..2589010a 100644 --- a/scripts/commands/state_map.py +++ b/scripts/commands/state_map.py @@ -15,4 +15,8 @@ def execute(args, workspace): return map_state(workspace, store_name=args.store_name) -register_command("state-map", "Track global state management", add_args, execute) +register_command("state-map", "Track global state management", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/summary.py b/scripts/commands/summary.py index 50b0dac6..bddd78cd 100644 --- a/scripts/commands/summary.py +++ b/scripts/commands/summary.py @@ -92,36 +92,69 @@ def add_args(parser): # Issue #180: surface noise-reduction flags directly in `codelens summary --help`. # summary already auto-adapts detail level to codebase size; the epilog points # users at the additional output-shaping flags added by the dispatcher. + # Issue #195: --check dispatches to absorbed commands (dashboard, arch-metrics, + # architecture). Without --check, runs the legacy summary aggregator. parser.formatter_class = argparse.RawDescriptionHelpFormatter parser.epilog = ( "Notes:\n" " For AI/script consumption, use --format compact (token-efficient\n" " single-char keys) or --lite (minimal output). For large repos,\n" - " --detail minimal restricts findings to critical severity only." + " --detail minimal restricts findings to critical severity only.\n" + "\n" + "Issue #195 sub-analyses (use --check to dispatch):\n" + " dashboard Generate HTML visualization dashboard\n" + " arch-metrics Architecture metrics (fan-in/out, instability, god-module)\n" + " architecture Single-call codebase overview for AI agents\n" + " summary Legacy auto-summary with prioritized findings (default)\n" ) parser.add_argument("workspace", nargs="?", default=None, help="Path to workspace root (auto-detected if omitted)") + parser.add_argument("--check", default=None, + help="Issue #195: comma-separated sub-analyses. " + "Choices: summary, dashboard, arch-metrics, architecture. " + "Default: summary (legacy aggregator).") parser.add_argument("--focus", choices=["security", "quality", "architecture", "all"], default="all", - help="Focus area for the summary (default: all)") + help="summary: focus area (default: all)") parser.add_argument("--max-items", type=int, default=10, - help="Maximum items per category (default: 10)") + help="summary: maximum items per category (default: 10)") parser.add_argument("--detail", choices=["minimal", "standard", "full", "auto"], default="auto", - help="Detail level: minimal (critical only), standard (critical+high), " - "full (all), auto (adapts to codebase size, default)") + help="summary: detail level (default: auto)") parser.add_argument("--max-files", type=int, default=2000, - help="Maximum number of files to scan (default: 2000). " + help="summary: maximum number of files to scan (default: 2000). " "Prevents timeout on very large repos.") parser.add_argument("--timeout", type=int, default=120, - help="Total time budget in seconds for all engines (default: 120)") + help="summary: total time budget in seconds (default: 120)") parser.add_argument("--write-agent-md", action="store_true", - help="Write a condensed AGENT.md file to .codelens/ for AI context") + help="summary: write a condensed AGENT.md file to .codelens/ for AI context") parser.add_argument("--max-tokens", type=int, default=8000, - help="Approximate max output tokens before smart truncation (default: 8000)") + help="summary: approximate max output tokens before smart truncation (default: 8000)") + # dashboard passthroughs + parser.add_argument("--output", "-o", default=None, + help="dashboard: output HTML path") + parser.add_argument("--open", action="store_true", default=False, + help="dashboard: open in browser after generation") + parser.add_argument("--compare", nargs=2, default=None, metavar=("SNAP1", "SNAP2"), + help="dashboard: compare two snapshots") + # arch-metrics passthroughs + parser.add_argument("--threshold-fanin", type=int, default=None, + help="arch-metrics: fan-in threshold (default 10)") + parser.add_argument("--threshold-fanout", type=int, default=None, + help="arch-metrics: fan-out threshold (default 15)") + parser.add_argument("--sort-by", default=None, + help="arch-metrics: instability|fan-in|fan-out|name (default instability)") + # architecture passthroughs + parser.add_argument("--no-cache", action="store_true", default=False, + help="architecture: bypass .codelens/architecture_cache.json") def execute(args, workspace): + # Issue #195: dispatch to absorbed sub-commands when --check is set. + check_arg = getattr(args, "check", None) + if check_arg: + return _dispatch_subcommands(args, workspace, check_arg) + # Default: legacy summary aggregator. max_files = getattr(args, 'max_files', 2000) return generate_summary( workspace, @@ -135,6 +168,98 @@ def execute(args, workspace): ) +# Issue #195: sub-command dispatch table for the summary umbrella. +_SUMMARY_SUBCOMMANDS = { + "summary": "commands.summary", # self — calls generate_summary directly + "dashboard": "commands.dashboard", + "arch-metrics": "commands.arch_metrics", + "architecture": "commands.architecture", +} + + +def _dispatch_subcommands(args, workspace, check_arg): + """Dispatch to one or more absorbed sub-commands per --check.""" + import importlib + parts = [c.strip() for c in check_arg.split(",") if c.strip()] + invalid = [p for p in parts if p not in _SUMMARY_SUBCOMMANDS] + if invalid: + import sys + print( + f"[CodeLens] summary: unknown --check category '{','.join(invalid)}'. " + f"Valid: {', '.join(_SUMMARY_SUBCOMMANDS.keys())}", + file=sys.stderr, + ) + sys.exit(1) + if not parts: + parts = ["summary"] + + results = [] + checks_failed = 0 + for check_name in parts: + try: + if check_name == "summary": + # Avoid recursion: call generate_summary directly. + sub_result = generate_summary( + workspace, + focus=args.focus, + max_items=args.max_items, + detail=args.detail, + max_files=getattr(args, "max_files", 2000), + timeout=getattr(args, "timeout", 120), + write_agent_md=getattr(args, "write_agent_md", False), + max_tokens=getattr(args, "max_tokens", 8000), + ) + else: + mod = importlib.import_module(_SUMMARY_SUBCOMMANDS[check_name]) + sub_args = _build_subnamespace(args, check_name) + sub_result = mod.execute(sub_args, workspace) + if not isinstance(sub_result, dict): + sub_result = {"status": "ok", "result": sub_result} + sub_result["_check"] = check_name + results.append(sub_result) + except Exception as exc: + checks_failed += 1 + results.append({ + "_check": check_name, + "s": "error", + "error": str(exc), + "error_type": type(exc).__name__, + }) + import sys + print(f"[CodeLens] summary: --check {check_name} failed: {exc}", + file=sys.stderr) + + return { + "s": "ok" if checks_failed == 0 else "partial", + "st": {"checks_requested": len(parts), "checks_failed": checks_failed}, + "r": results, + } + + +def _build_subnamespace(base_args, check_name): + """Build a synthetic namespace for the dispatched sub-command.""" + import argparse as _ap + ns = _ap.Namespace() + for attr in ("format", "top", "max_tokens", "lite", "deep", "db_path", + "diff_base", "diff_scope", "disable_suppression", + "codelens_ignore_pattern"): + setattr(ns, attr, getattr(base_args, attr, None)) + ns.workspace = getattr(base_args, "workspace", None) + if check_name == "dashboard": + ns.output = getattr(base_args, "output", None) + ns.open = getattr(base_args, "open", False) + ns.watch = False + ns.compare = getattr(base_args, "compare", None) + elif check_name == "arch-metrics": + ns.threshold_fanin = getattr(base_args, "threshold_fanin", None) or 10 + ns.threshold_fanout = getattr(base_args, "threshold_fanout", None) or 15 + ns.sort_by = getattr(base_args, "sort_by", None) or "instability" + elif check_name == "architecture": + ns.lite = getattr(base_args, "lite", False) + ns.no_cache = getattr(base_args, "no_cache", False) + return ns + + def _time_left(start: float, budget: float = 90) -> float: """Return remaining seconds within the time budget.""" return max(0.0, budget - (time.time() - start)) @@ -215,7 +340,7 @@ def generate_summary( # ─── 1. Quick Identity ──────────────────────────────── try: - from commands.handbook import _extract_project_identity + from handbook_helpers import _extract_project_identity identity = _extract_project_identity(workspace) result["identity"] = { "name": identity.get("name", os.path.basename(workspace)), diff --git a/scripts/commands/symbols.py b/scripts/commands/symbols.py index b9ba061c..2260cdf1 100644 --- a/scripts/commands/symbols.py +++ b/scripts/commands/symbols.py @@ -42,4 +42,10 @@ def execute(args, workspace): return result -register_command("symbols", "Search symbols in registry by name", add_args, execute) +register_command("symbols", "Search symbols in registry by name", add_args, execute, + +hidden=True, + +deprecated_alias_for='search', + +) diff --git a/scripts/commands/taint.py b/scripts/commands/taint.py index 8bb711af..d3ed6766 100644 --- a/scripts/commands/taint.py +++ b/scripts/commands/taint.py @@ -131,4 +131,10 @@ def execute(args, workspace): return result -register_command("taint", "Run AST-based taint analysis for vulnerability detection", add_args, execute) +register_command("taint", "Run AST-based taint analysis for vulnerability detection", add_args, execute, + +hidden=True, + +deprecated_alias_for='security', + +) diff --git a/scripts/commands/test_map.py b/scripts/commands/test_map.py index 1181789b..c68130cb 100644 --- a/scripts/commands/test_map.py +++ b/scripts/commands/test_map.py @@ -23,4 +23,8 @@ def execute(args, workspace): ) -register_command("test-map", "Map test coverage for functions", add_args, execute) +register_command("test-map", "Map test coverage for functions", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/trace.py b/scripts/commands/trace.py index 19dc67f5..c5b0e078 100644 --- a/scripts/commands/trace.py +++ b/scripts/commands/trace.py @@ -82,4 +82,10 @@ def execute(args, workspace): return result -register_command("trace", "Trace deep call chain from a symbol", add_args, execute) +register_command("trace", "Trace deep call chain from a symbol", add_args, execute, + +hidden=True, + +deprecated_alias_for='context', + +) diff --git a/scripts/commands/type_infer.py b/scripts/commands/type_infer.py index 23f29729..71be86c4 100644 --- a/scripts/commands/type_infer.py +++ b/scripts/commands/type_infer.py @@ -20,4 +20,8 @@ def execute(args, workspace): ) -register_command("type-infer", "Lightweight type inference for JS/Python", add_args, execute) +register_command("type-infer", "Lightweight type inference for JS/Python", add_args, execute, + +hidden=True, + +) diff --git a/scripts/commands/vuln_scan.py b/scripts/commands/vuln_scan.py index 63052956..5e295202 100644 --- a/scripts/commands/vuln_scan.py +++ b/scripts/commands/vuln_scan.py @@ -90,4 +90,10 @@ def execute(args, workspace): ) -register_command("vuln-scan", "Scan dependencies for known CVEs (OSV.dev + native audit)", add_args, execute) +register_command("vuln-scan", "Scan dependencies for known CVEs (OSV.dev + native audit)", add_args, execute, + +hidden=True, + +deprecated_alias_for='security', + +) diff --git a/scripts/commands/watch.py b/scripts/commands/watch.py deleted file mode 100644 index bae69df0..00000000 --- a/scripts/commands/watch.py +++ /dev/null @@ -1,382 +0,0 @@ -"""Watch command — Start file watcher for real-time registry updates.""" - -import os -import sys -import time -import json -import threading -from datetime import datetime, timezone -from typing import Dict, List, Any, Optional - -from registry import load_config, load_frontend_registry, load_backend_registry -from diff_engine import save_snapshot -from outline_engine import get_workspace_outline -from utils import write_output_files, compute_summary, DEFAULT_IGNORE_DIRS, logger -from commands import register_command -from commands.scan import cmd_scan - - -# Extensions that trigger a rescan -_WATCH_EXTENSIONS = frozenset({ - '.html', '.htm', '.css', '.scss', '.less', '.sass', - '.js', '.jsx', '.ts', '.tsx', '.rs', '.py', '.vue', '.svelte', -}) - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--debounce", "-d", type=float, default=0.5, - help="Debounce interval in seconds (default: 0.5)") - parser.add_argument("--git-mode", action="store_true", - help="Use git-diff polling instead of watchdog file events. " - "Checks 'git diff --name-only' every --interval seconds and " - "re-indexes only the files git knows changed. Falls back to " - "watchdog if git is unavailable or the workspace is not a " - "git repo. Default: watchdog (real-time file events).") - parser.add_argument("--interval", type=float, default=2.0, - help="Polling interval in seconds for --git-mode (default: 2.0)") - - -def execute(args, workspace): - """Execute the watch command. This is a long-running command that doesn't return a dict.""" - git_mode = getattr(args, 'git_mode', False) - interval = getattr(args, 'interval', 2.0) - cmd_watch(workspace, debounce=args.debounce, git_mode=git_mode, interval=interval) - return {"status": "stopped"} - - -def cmd_watch(workspace: str, debounce: float = 0.5, - git_mode: bool = False, interval: float = 2.0) -> None: - """ - Start file watcher for real-time registry updates. - Uses debounce to coalesce rapid file changes, prints a clean - one-line summary, and writes outline.json + summary.json to .codelens/. - - Two modes (issue #14): - - Default (watchdog): real-time file events via the watchdog library. - - --git-mode: polls 'git diff --name-only' every `interval` seconds - and re-indexes only the files git knows changed. Useful when - watchdog is unavailable or when the agent wants git-aware delta - instead of mtime-based change detection. - """ - import threading as _threading - workspace = os.path.abspath(workspace) - - if git_mode: - _watch_git_mode(workspace, interval=interval, debounce=debounce) - return - - # ─── Debounce state ──────────────────────────────────── - _timer: Optional[_threading.Timer] = None - _lock = _threading.Lock() - _changed_files: set = set() - - def _on_file_change(filepath: str) -> None: - """Called when a source file changes. Debounces rapid events.""" - ext = os.path.splitext(filepath)[1].lower() - if ext not in _WATCH_EXTENSIONS: - return - # Ignore changes inside .codelens output directory - if '.codelens' in filepath: - return - - nonlocal _timer - with _lock: - _changed_files.add(filepath) - if _timer: - _timer.cancel() - _timer = _threading.Timer(debounce, _do_rescan) - _timer.daemon = True - _timer.start() - - def _do_rescan() -> None: - """Perform the actual rescan after the debounce period.""" - with _lock: - changed = _changed_files.copy() - _changed_files.clear() - - if not changed: - return - - changed_rel = [os.path.relpath(f, workspace) for f in changed] - for rel in changed_rel: - print(f' Changed: {rel}') - - # Run incremental scan - scan_result = cmd_scan(workspace, incremental=True) - - # Auto-save snapshot - try: - frontend = load_frontend_registry(workspace) - backend = load_backend_registry(workspace) - save_snapshot(workspace, frontend, backend) - except Exception: - logger.debug("Failed to save snapshot after rescan", exc_info=True) - - # Generate outline.json + summary.json - summary = write_output_files(workspace, scan_result) - print(_format_watch_summary(summary, changed_count=len(changed))) - - # ─── Initial scan ────────────────────────────────────── - print(f'[CodeLens] Scanning {workspace}...') - scan_result = cmd_scan(workspace) - - # Auto-save snapshot - try: - frontend = load_frontend_registry(workspace) - backend = load_backend_registry(workspace) - save_snapshot(workspace, frontend, backend) - except Exception: - logger.debug("Failed to save initial snapshot", exc_info=True) - - # Generate outline.json + summary.json - summary = write_output_files(workspace, scan_result) - print(_format_watch_summary(summary)) - - # ─── Start watcher ───────────────────────────────────── - try: - from watchdog.observers import Observer - from watchdog.events import FileSystemEventHandler - except ImportError: - print('[CodeLens] watchdog not installed. Install with: pip install watchdog') - print(f'[CodeLens] Falling back to polling mode (every 2s, debounce: {debounce}s)...') - _watch_polling(workspace, debounce, _on_file_change) - return - - class CodeLensHandler(FileSystemEventHandler): - def on_modified(self, event): - if not event.is_directory: - _on_file_change(event.src_path) - - def on_created(self, event): - if not event.is_directory: - _on_file_change(event.src_path) - - def on_deleted(self, event): - if not event.is_directory: - _on_file_change(event.src_path) - - observer = Observer() - handler = CodeLensHandler() - observer.schedule(handler, workspace, recursive=True) - observer.start() - - print(f'[CodeLens] Watching {workspace} (debounce: {debounce}s) — Press Ctrl+C to stop') - try: - while True: - time.sleep(1) - except KeyboardInterrupt: - observer.stop() - print('[CodeLens] Stopped.') - observer.join() - - -def _watch_polling( - workspace: str, - debounce: float = 0.5, - on_change_callback=None -) -> None: - """ - Fallback polling-based watcher with debounce support. - Checks for file modifications every 2 seconds. - """ - import threading as _threading - - if on_change_callback is None: - _lock = _threading.Lock() - _timer = None - _pending: set = set() - - def _poll_rescan(): - nonlocal _timer - with _lock: - changed = _pending.copy() - _pending.clear() - if not changed: - return - scan_result = cmd_scan(workspace, incremental=True) - try: - frontend = load_frontend_registry(workspace) - backend = load_backend_registry(workspace) - save_snapshot(workspace, frontend, backend) - except Exception: - logger.debug("Failed to save snapshot in polling mode", exc_info=True) - summary = write_output_files(workspace, scan_result) - print(_format_watch_summary(summary, changed_count=len(changed))) - - def on_change_callback(filepath): - nonlocal _timer - ext = os.path.splitext(filepath)[1].lower() - if ext not in _WATCH_EXTENSIONS: - return - if '.codelens' in filepath: - return - with _lock: - _pending.add(filepath) - if _timer: - _timer.cancel() - _timer = _threading.Timer(debounce, _poll_rescan) - _timer.daemon = True - _timer.start() - - # Track file mtimes - last_mtimes: Dict[str, float] = {} - ignore_dirs = set(DEFAULT_IGNORE_DIRS) - - for root, dirs, filenames in os.walk(workspace): - dirs[:] = [d for d in dirs if d not in ignore_dirs and not d.startswith('.')] - for filename in filenames: - ext = os.path.splitext(filename)[1].lower() - if ext in _WATCH_EXTENSIONS: - filepath = os.path.join(root, filename) - try: - last_mtimes[filepath] = os.path.getmtime(filepath) - except OSError: - logger.debug(f"Failed to get mtime for: {filepath}") - - print(f'[CodeLens] Polling {workspace} every 2s (debounce: {debounce}s) — Press Ctrl+C to stop') - try: - while True: - time.sleep(2) - - # Check for modified/deleted files - for filepath in list(last_mtimes.keys()): - try: - current = os.path.getmtime(filepath) - if current != last_mtimes[filepath]: - last_mtimes[filepath] = current - on_change_callback(filepath) - except OSError: - del last_mtimes[filepath] - on_change_callback(filepath) - - # Check for new files - for root, dirs, filenames in os.walk(workspace): - dirs[:] = [d for d in dirs if d not in ignore_dirs and not d.startswith('.')] - for filename in filenames: - ext = os.path.splitext(filename)[1].lower() - if ext in _WATCH_EXTENSIONS: - filepath = os.path.join(root, filename) - if filepath not in last_mtimes: - try: - last_mtimes[filepath] = os.path.getmtime(filepath) - on_change_callback(filepath) - except OSError: - logger.debug(f"Failed to get mtime for new file: {filepath}") - - except KeyboardInterrupt: - print('[CodeLens] Stopped.') - - -def _watch_git_mode( - workspace: str, - interval: float = 2.0, - debounce: float = 0.5, -) -> None: - """Polling watcher that uses git-diff to detect changes (issue #14). - - Every ``interval`` seconds, runs ``git diff --name-only`` (working - tree vs HEAD) + ``git ls-files --others`` (untracked) to enumerate - the files git knows changed since the last poll. If the set is - non-empty, runs an incremental scan and prints a one-line summary. - - Falls back to ``_watch_polling`` (mtime-based) when git is - unavailable or the workspace is not a git repo — the user requested - git-mode but git isn't usable, so we degrade to the next-best - available watcher instead of refusing to start. - """ - try: - from git_aware import get_current_sha, get_changed_files, get_untracked_files - except ImportError: - print('[CodeLens] git_aware module not available; falling back to mtime polling') - _watch_polling(workspace, debounce=debounce) - return - - if not get_current_sha(workspace): - print('[CodeLens] workspace is not a git repo; falling back to mtime polling') - _watch_polling(workspace, debounce=debounce) - return - - print(f'[CodeLens] Git-mode watching {workspace} (poll: {interval}s, debounce: {debounce}s) — Press Ctrl+C to stop') - - import threading as _threading - last_seen: set = set() - _timer: Optional[_threading.Timer] = None - _lock = _threading.Lock() - _pending: set = set() - - def _do_rescan(): - nonlocal _timer - with _lock: - changed = _pending.copy() - _pending.clear() - if not changed: - return - changed_rel = sorted(os.path.relpath(f, workspace) for f in changed) - for rel in changed_rel: - print(f' Changed: {rel}') - scan_result = cmd_scan(workspace, incremental=True) - try: - frontend = load_frontend_registry(workspace) - backend = load_backend_registry(workspace) - save_snapshot(workspace, frontend, backend) - except Exception: - logger.debug("Failed to save snapshot in git-mode", exc_info=True) - summary = write_output_files(workspace, scan_result) - print(_format_watch_summary(summary, changed_count=len(changed))) - - def _schedule_rescan(filepath: str) -> None: - nonlocal _timer - with _lock: - _pending.add(filepath) - if _timer: - _timer.cancel() - _timer = _threading.Timer(debounce, _do_rescan) - _timer.daemon = True - _timer.start() - - try: - while True: - time.sleep(interval) - # Combine tracked changes (vs HEAD) + untracked new files. - rel_paths = set(get_changed_files(workspace, since_sha=None)) - rel_paths |= set(get_untracked_files(workspace)) - abs_paths = { - os.path.join(workspace, rel) - for rel in rel_paths - if os.path.splitext(rel)[1].lower() in _WATCH_EXTENSIONS - and '.codelens' not in rel - } - new_changes = abs_paths - last_seen - if new_changes: - last_seen = abs_paths - for fp in new_changes: - _schedule_rescan(fp) - else: - last_seen = abs_paths - except KeyboardInterrupt: - print('[CodeLens] Stopped.') - - -def _format_watch_summary(summary: Dict[str, Any], changed_count: int = 0) -> str: - """Format a one-line summary for terminal output.""" - now = datetime.now().strftime('%H:%M:%S') - files = summary.get('files', 0) - funcs = summary.get('functions', 0) - classes = summary.get('classes', 0) - nodes = summary.get('backend_nodes', 0) - edges = summary.get('backend_edges', 0) - - parts = [f'{files} files', f'{funcs} funcs', f'{classes} classes'] - if nodes: - parts.append(f'{nodes} nodes') - if edges: - parts.append(f'{edges} edges') - if changed_count: - parts.append(f'{changed_count} changed') - - return f'[{now}] \u2713 {" | ".join(parts)}' - - -register_command("watch", "Start file watcher with debounce", add_args, execute) diff --git a/scripts/graph_model.py b/scripts/graph_model.py index c655a36e..4375002d 100644 --- a/scripts/graph_model.py +++ b/scripts/graph_model.py @@ -13,7 +13,7 @@ - New tables `graph_nodes` and `graph_edges` are additive (prefixed `graph_` to avoid colliding with any existing table name). - The flat registry tables and JSON files are untouched. -- All 78 existing CLI commands continue to work unchanged. +- All 12 existing CLI commands continue to work unchanged. Schema: graph_nodes( diff --git a/scripts/commands/handbook.py b/scripts/handbook_helpers.py similarity index 60% rename from scripts/commands/handbook.py rename to scripts/handbook_helpers.py index 75b0549e..bd612077 100644 --- a/scripts/commands/handbook.py +++ b/scripts/handbook_helpers.py @@ -1,279 +1,16 @@ -"""Handbook command — Generate project handbook for AI agents.""" +"""Project identity helper (issue #195 consolidation). + +Extracted from the deprecated ``commands/handbook.py`` module so that +``summary`` and ``analyze`` can keep using :func:`_extract_project_identity` +after ``handbook`` is dropped as a standalone command. +""" import os import json import re -import time -from datetime import datetime, timezone -from typing import Dict, Any, List - -from registry import load_config, ensure_codelens_dir -from framework_detect import detect_frameworks -from smell_engine import detect_smells -from entrypoints_engine import map_entrypoints -from apimap_engine import map_api_routes -from statemap_engine import map_state -from circular_engine import detect_circular -from deadcode_engine import detect_dead_code -from secrets_engine import detect_secrets -from vulnscan_engine import scan_vulnerabilities -from outline_engine import get_workspace_outline -from commands import register_command -from commands.scan import cmd_scan -from utils import write_output_files, compute_summary, CODELENS_VERSION, DEFAULT_IGNORE_DIRS, logger - - -def add_args(parser): - parser.add_argument("workspace", nargs="?", default=None, - help="Path to workspace root (auto-detected if omitted)") - parser.add_argument("--max-files", type=int, default=5000, - help="Maximum number of files to scan (default: 5000). " - "Prevents timeout on very large repos.") - parser.add_argument("--timeout", type=int, default=120, - help="Total time budget in seconds for handbook generation (default: 120). " - "Remaining engines are skipped when budget is nearly exhausted.") - - -def execute(args, workspace): - max_files = getattr(args, 'max_files', 5000) - timeout = getattr(args, 'timeout', 120) - return cmd_handbook(workspace, max_files=max_files, time_budget=timeout) - - -def cmd_handbook(workspace: str, max_files: int = 5000, time_budget: int = 120) -> Dict[str, Any]: - """ - Generate a comprehensive project handbook for AI agents. - Aggregates data from multiple engines into one output. - Also writes .codelens/handbook.json and .codelens/AGENT.md. - max_files caps the scan file count to prevent timeout on huge repos. - time_budget sets a total wall-clock budget (seconds). When less than 15s - remain, remaining engines are skipped and partial=True is set in output. - """ - workspace = os.path.abspath(workspace) - config = load_config(workspace) - ensure_codelens_dir(workspace) - - start_time = time.monotonic() - engines_skipped: List[str] = [] - - def _remaining() -> float: - """Return remaining seconds before budget expires.""" - return time_budget - (time.monotonic() - start_time) - - def _should_skip(engine_name: str) -> bool: - """Check if we should skip an engine due to time budget.""" - if _remaining() < 15: - engines_skipped.append(engine_name) - logger.warning(f"Skipping {engine_name}: time budget nearly exhausted " - f"({time_budget - (time.monotonic() - start_time):.1f}s remaining)") - return True - return False - - # 1. Identity — extract from package.json / pyproject.toml / README - identity = _extract_project_identity(workspace) - - # 2. Run scan first (needed for registry data) — skip if registry is fresh - scan_result = None - registry_path = os.path.join(workspace, '.codelens', 'backend.json') - if os.path.exists(registry_path): - try: - mtime = os.path.getmtime(registry_path) - if time.time() - mtime < 300: # 5 minutes freshness - from registry import load_backend_registry, load_frontend_registry - backend = load_backend_registry(workspace) - frontend = load_frontend_registry(workspace) - scan_result = { - "status": "ok", - "backend": { - "nodes": len(backend.get("nodes", [])) if isinstance(backend.get("nodes"), list) else backend.get("nodes", 0), - "edges": len(backend.get("edges", [])) if isinstance(backend.get("edges"), list) else backend.get("edges", 0) - }, - "frontend": { - "classes": len(frontend.get("classes", [])) if isinstance(frontend.get("classes"), list) else frontend.get("classes", 0), - "ids": len(frontend.get("ids", [])) if isinstance(frontend.get("ids"), list) else frontend.get("ids", 0) - } - } - except Exception: - logger.warning("Scan result loading failed", exc_info=True) - if scan_result is None: - if _should_skip('scan'): - scan_result = {"status": "skipped"} - else: - scan_result = cmd_scan(workspace) - - # 3. Generate output files (outline.json, summary.json) - try: - write_output_files(workspace, scan_result) - except Exception: - logger.warning("Failed to write output files", exc_info=True) - - # 4. Frameworks - frameworks = config.get("frameworks", []) - if _should_skip('frameworks'): - pass # keep default from config - else: - try: - fw_result = detect_frameworks(workspace) - frameworks = fw_result.get("frameworks", []) - except Exception: - logger.warning("Framework detection failed", exc_info=True) - - # 5. Health (from smell engine) - health = {"score": 0, "smells_count": 0, "critical": 0, "warning": 0} - if _should_skip('smell'): - pass - else: - try: - smell_result = detect_smells(workspace) - health = { - "score": smell_result.get("stats", {}).get("health_score", 0), - "smells_count": smell_result.get("stats", {}).get("total_smells", 0), - "critical": smell_result.get("stats", {}).get("critical", 0), - "warning": smell_result.get("stats", {}).get("warning", 0), - } - except Exception: - logger.warning("Health detection failed", exc_info=True) +from typing import Dict, Any, List, Optional - # 6. Entrypoints - entrypoints = [] - if _should_skip('entrypoints'): - pass - else: - try: - ep_result = map_entrypoints(workspace, exclude_tests=True) - entrypoints = [ - {"type": e.get("type"), "file": e.get("file"), "line": e.get("line"), "label": e.get("label")} - for e in ep_result.get("entrypoints", [])[:30] - ] - except Exception: - logger.warning("Entrypoint mapping failed", exc_info=True) - - # 7. API Routes - api_routes = [] - if _should_skip('apimap'): - pass - else: - try: - api_result = map_api_routes(workspace) - api_routes = [ - {"method": r.get("method"), "path": r.get("path"), "handler": r.get("handler_name"), "file": r.get("file")} - for r in api_result.get("routes", [])[:50] - ] - except Exception: - logger.warning("API route mapping failed", exc_info=True) - - # 8. State management - state_stores = [] - if _should_skip('statemap'): - pass - else: - try: - state_result = map_state(workspace) - state_stores = [ - {"name": s.get("name"), "type": s.get("type"), "framework": s.get("framework"), "file": s.get("defined_in")} - for s in state_result.get("stores", [])[:20] - ] - except Exception: - logger.warning("State management mapping failed", exc_info=True) - - # 9. Risks (circular deps, dead code, secrets, vulnscan) - risks = [] - if not _should_skip('circular'): - try: - circ_result = detect_circular(workspace) - for chain in circ_result.get("chains", [])[:5]: - risks.append({"type": "circular_dep", "description": f"{' → '.join(chain.get('path', []))}"}) - except Exception: - logger.warning("Circular dependency detection failed", exc_info=True) - if not _should_skip('deadcode'): - try: - dead_result = detect_dead_code(workspace) - dead_count = dead_result.get("stats", {}).get("total_dead", 0) - if dead_count > 0: - risks.append({"type": "dead_code", "count": dead_count}) - except Exception: - logger.warning("Dead code detection failed", exc_info=True) - if not _should_skip('secrets'): - try: - secrets_result = detect_secrets(workspace) - secrets_count = secrets_result.get("stats", {}).get("total_secrets", 0) - if secrets_count > 0: - risks.append({"type": "secrets", "count": secrets_count}) - except Exception: - logger.warning("Secrets detection failed", exc_info=True) - if not _should_skip('vulnscan'): - try: - vuln_result = scan_vulnerabilities(workspace) - vuln_count = vuln_result.get("stats", {}).get("total_vulnerabilities", 0) - if vuln_count > 0: - risks.append({"type": "vulnerabilities", "count": vuln_count}) - except Exception: - logger.warning("Vulnerability scan failed", exc_info=True) - - # 10. Directory map - directory_map = _build_directory_map(workspace, config) - - # 11. Quick reference from summary - summary = {} - if not _should_skip('summary'): - try: - summary = compute_summary(workspace, get_workspace_outline(workspace), scan_result) - except Exception: - logger.warning("Summary computation failed", exc_info=True) - - # 12. Conventions - conventions = _detect_conventions(workspace) - - # Build handbook - handbook = { - "status": "ok", - "meta": { - "workspace": workspace, - "generated_at": datetime.now(timezone.utc).isoformat(), - "codelens_version": CODELENS_VERSION - }, - "identity": identity, - "frameworks": frameworks, - "structure": { - "directory_map": directory_map, - "entrypoints": entrypoints, - "api_routes": api_routes, - "state_management": state_stores - }, - "health": health, - "conventions": conventions, - "risks": risks, - "quick_reference": { - "total_files": summary.get("files", 0), - "total_functions": summary.get("functions", 0), - "total_classes": summary.get("classes", 0), - "total_exports": summary.get("exports", 0), - "backend_nodes": summary.get("backend_nodes", 0), - "backend_edges": summary.get("backend_edges", 0), - "frontend_classes": summary.get("frontend_classes", 0), - "frontend_ids": summary.get("frontend_ids", 0), - }, - "files_by_language": summary.get("files_by_language", {}) - } - - # Add partial metadata if any engines were skipped due to time budget - if engines_skipped: - handbook["partial"] = True - handbook["time_budget_used"] = round(time.monotonic() - start_time, 2) - handbook["time_budget_total"] = time_budget - handbook["engines_skipped"] = engines_skipped - - # Write handbook.json - codelens_dir = os.path.join(workspace, '.codelens') - os.makedirs(codelens_dir, exist_ok=True) - handbook_path = os.path.join(codelens_dir, 'handbook.json') - with open(handbook_path, 'w', encoding='utf-8') as f: - json.dump(handbook, f, indent=2, ensure_ascii=False) - - # Generate AGENT.md - _generate_agent_md(workspace, handbook) - - return handbook +from utils import DEFAULT_IGNORE_DIRS def _extract_project_identity(workspace: str) -> Dict[str, Any]: @@ -954,245 +691,3 @@ def _extract_project_identity(workspace: str) -> Dict[str, Any]: return identity -def _build_directory_map(workspace: str, config: Dict[str, Any]) -> Dict[str, str]: - """Build a one-level-deep directory map with descriptions.""" - ignore_dirs = set(DEFAULT_IGNORE_DIRS) - dir_hints = { - 'src': 'Application source code', - 'app': 'Application pages/routes', - 'lib': 'Shared libraries and utilities', - 'components': 'UI components', - 'pages': 'Page components', - 'api': 'API route handlers', - 'routes': 'Route definitions', - 'scripts': 'Build/utility scripts', - 'skills': 'CodeLens skill modules', - 'tests': 'Test files', - '__tests__': 'Test files', - 'test': 'Test files', - 'config': 'Configuration files', - 'public': 'Static public assets', - 'assets': 'Static assets', - 'styles': 'CSS/styling files', - 'hooks': 'Custom React hooks', - 'utils': 'Utility functions', - 'helpers': 'Helper functions', - 'services': 'Service modules', - 'models': 'Data models', - 'types': 'TypeScript type definitions', - 'interfaces': 'Interface definitions', - 'store': 'State management', - 'stores': 'State management stores', - 'middleware': 'Middleware', - 'db': 'Database files', - 'docs': 'Documentation', - 'examples': 'Example files', - 'mini-services': 'Microservices', - 'parsers': 'Parsers', - 'engines': 'Analysis engines', - } - dir_map = {} - try: - for entry in sorted(os.listdir(workspace)): - full = os.path.join(workspace, entry) - if os.path.isdir(full) and entry not in ignore_dirs and not entry.startswith('.'): - src_count = 0 - try: - for root, dirs, filenames in os.walk(full): - depth = root.replace(full, '').count(os.sep) - if depth > 3: - dirs[:] = [] - continue - dirs[:] = [d for d in dirs if d not in ignore_dirs and not d.startswith('.')] - for f in filenames: - ext = os.path.splitext(f)[1].lower() - if ext in {'.py', '.js', '.ts', '.tsx', '.jsx', '.rs', '.html', '.css', '.scss', '.vue', '.svelte'}: - src_count += 1 - except Exception: - logger.warning("Directory file counting failed", exc_info=True) - if entry.lower() in dir_hints: - desc = dir_hints[entry.lower()] - elif src_count: - desc = f"{src_count} source file{'s' if src_count != 1 else ''}" - else: - desc = "directory" - dir_map[entry + '/'] = desc - except Exception: - logger.warning("Directory map building failed", exc_info=True) - return dir_map - - -def _detect_conventions(workspace: str) -> Dict[str, Any]: - """Detect coding conventions from the codebase.""" - conventions = { - "naming": {}, - "patterns": {} - } - - # Try to import convention_engine if it exists - try: - from convention_engine import detect_conventions - result = detect_conventions(workspace) - if result.get("status") == "ok": - return result.get("conventions", conventions) - except ImportError: - pass - except Exception: - logger.warning("Convention engine failed", exc_info=True) - - # Fallback: basic convention detection from filenames - files = [] - for root, dirs, filenames in os.walk(workspace): - dirs[:] = [d for d in dirs if d not in DEFAULT_IGNORE_DIRS and not d.startswith('.')] - for fn in filenames: - ext = os.path.splitext(fn)[1].lower() - if ext in {'.py', '.js', '.ts', '.tsx', '.rs'}: - files.append(fn) - - snake_count = sum(1 for f in files if '_' in os.path.splitext(f)[0] and f == f.lower()) - kebab_count = sum(1 for f in files if '-' in os.path.splitext(f)[0] and f == f.lower()) - camel_count = sum(1 for f in files if re.match(r'^[a-z]+[A-Z]', os.path.splitext(f)[0])) - pascal_count = sum(1 for f in files if f[0].isupper() and f[0].isalpha()) - - if snake_count > kebab_count and snake_count > camel_count: - conventions["naming"]["files"] = "snake_case" - elif kebab_count > snake_count and kebab_count > camel_count: - conventions["naming"]["files"] = "kebab-case" - elif pascal_count > camel_count: - conventions["naming"]["files"] = "PascalCase" - elif camel_count > 0: - conventions["naming"]["files"] = "camelCase" - - py_files = [f for f in files if f.endswith('.py')] - js_files = [f for f in files if f.endswith(('.js', '.ts', '.tsx'))] - - if py_files: - py_snake = sum(1 for f in py_files if '_' in os.path.splitext(f)[0]) - if py_snake > len(py_files) * 0.5: - conventions["naming"]["python_files"] = "snake_case" - - if js_files: - js_kebab = sum(1 for f in js_files if '-' in os.path.splitext(f)[0]) - js_camel = sum(1 for f in js_files if re.match(r'^[a-z]+[A-Z]', os.path.splitext(f)[0])) - if js_kebab > js_camel: - conventions["naming"]["javascript_files"] = "kebab-case" - elif js_camel > 0: - conventions["naming"]["javascript_files"] = "camelCase" - - return conventions - - -def _generate_agent_md(workspace: str, handbook: Dict[str, Any]) -> None: - """Generate .codelens/AGENT.md from handbook data.""" - lines = [] - identity = handbook.get("identity", {}) - meta = handbook.get("meta", {}) - health = handbook.get("health", {}) - structure = handbook.get("structure", {}) - conventions = handbook.get("conventions", {}) - risks = handbook.get("risks", []) - qr = handbook.get("quick_reference", {}) - - lines.append(f"# Project Brief: {identity.get('name', 'unknown')}") - lines.append("") - - # Overview - lines.append("## Overview") - desc = identity.get("description", "") - if desc: - lines.append(desc) - fws = handbook.get("frameworks", []) - if fws: - lines.append(f"Frameworks: {', '.join(fws)}") - ptype = identity.get("type", "") - if ptype != "unknown": - lines.append(f"Type: {ptype}") - lines.append(f"Version: {identity.get('version', '0.0.0')}") - lines.append("") - - # Structure - dir_map = structure.get("directory_map", {}) - if dir_map: - lines.append("## Structure") - for dir_path, desc in dir_map.items(): - lines.append(f"- `{dir_path}` — {desc}") - lines.append("") - - # Entry Points - entrypoints = structure.get("entrypoints", []) - if entrypoints: - lines.append("## Key Entry Points") - for ep in entrypoints[:15]: - lines.append(f"- `{ep.get('file', '')}:{ep.get('line', '')}` — {ep.get('label', ep.get('type', ''))} ({ep.get('type', '')})") - lines.append("") - - # API Surface - api_routes = structure.get("api_routes", []) - if api_routes: - lines.append("## API Surface") - for r in api_routes[:20]: - lines.append(f"- {r.get('method', 'GET')} `{r.get('path', '/')}` — {r.get('handler', '')} ({r.get('file', '')})") - lines.append("") - - # State Management - state = structure.get("state_management", []) - if state: - lines.append("## State Management") - for s in state: - lines.append(f"- `{s.get('name', '')}` ({s.get('type', '')}, {s.get('framework', '')}) — {s.get('file', '')}") - lines.append("") - - # Conventions - naming = conventions.get("naming", {}) - patterns = conventions.get("patterns", {}) - if naming or patterns: - lines.append("## Conventions") - for key, val in naming.items(): - lines.append(f"- {key}: {val}") - for key, val in patterns.items(): - lines.append(f"- {key}: {val}") - lines.append("") - - # Health - score = health.get("score", 0) - lines.append(f"## Health Score: {score}/100") - risk_parts = [] - for r in risks: - rtype = r.get("type", "") - count = r.get("count", 0) - desc = r.get("description", "") - if count: - risk_parts.append(f"{count} {rtype.replace('_', ' ')}") - elif desc: - risk_parts.append(desc) - if risk_parts: - lines.append("- " + ", ".join(risk_parts)) - lines.append("") - - # Quick Reference - lines.append("## Quick Reference") - lines.append(f"- Files: {qr.get('total_files', 0)}") - lines.append(f"- Functions: {qr.get('total_functions', 0)}") - lines.append(f"- Classes: {qr.get('total_classes', 0)}") - lines.append(f"- Exports: {qr.get('total_exports', 0)}") - lines.append("") - - langs = handbook.get("files_by_language", {}) - if langs: - lines.append("## Languages") - for lang, count in sorted(langs.items(), key=lambda x: -x[1]): - lines.append(f"- {lang}: {count} files") - lines.append("") - - lines.append(f"## Last Scanned: {meta.get('generated_at', 'unknown')}") - lines.append("") - - content = "\n".join(lines) - codelens_dir = os.path.join(workspace, '.codelens') - os.makedirs(codelens_dir, exist_ok=True) - agent_md_path = os.path.join(codelens_dir, 'AGENT.md') - with open(agent_md_path, 'w', encoding='utf-8') as f: - f.write(content) - - -register_command("handbook", "Generate project handbook for AI agents", add_args, execute) diff --git a/scripts/mcp_server.py b/scripts/mcp_server.py index 78f43d09..7b4d0cd3 100644 --- a/scripts/mcp_server.py +++ b/scripts/mcp_server.py @@ -1578,7 +1578,13 @@ def _handle_tools_list(self) -> Dict[str, Any]: GraphML XML form for graph-producing commands. """ tools = [] + # Issue #195: skip static tool definitions for hidden deprecated + # aliases — only the 12 umbrella commands are exposed as MCP tools. + from commands import COMMAND_REGISTRY as _cr for cmd_name, tool_def in sorted(_TOOL_DEFINITIONS.items()): + _info = _cr.get(cmd_name) + if _info and _info.get("hidden", False): + continue schema = _inject_format_enum(tool_def["parameters"]) tools.append({ "name": f"codelens_{cmd_name.replace('-', '_')}", @@ -1597,7 +1603,11 @@ def _handle_tools_list(self) -> Dict[str, Any]: return {"tools": tools} def _get_dynamic_tools(self) -> List[Dict[str, Any]]: - """Dynamically generate tool definitions for any commands not in _TOOL_DEFINITIONS.""" + """Dynamically generate tool definitions for any commands not in _TOOL_DEFINITIONS. + + Issue #195: hidden deprecated aliases are skipped — only the 12 + umbrella commands are exposed as MCP tools. + """ tools = [] try: from commands import get_all_commands @@ -1607,6 +1617,8 @@ def _get_dynamic_tools(self) -> List[Dict[str, Any]]: continue if cmd_name in ("watch", "serve"): continue # Skip long-running commands + if cmd_info.get("hidden", False): + continue # Skip deprecated aliases (issue #195) tool_name = f"codelens_{cmd_name.replace('-', '_')}" tools.append({ "name": tool_name, @@ -2343,8 +2355,9 @@ def start_watcher(self, workspace: str) -> None: def _watch_loop(): """Background thread: poll for file changes and invalidate cache.""" - from commands.watch import _watch_polling - # Use the polling watcher's change detection + # Issue #195: was ``from commands.watch import _watch_polling`` + # (unused import — the loop below is self-contained). Removed + # so commands/watch.py can be deleted without breaking MCP. last_mtimes: Dict[str, float] = {} from utils import DEFAULT_IGNORE_DIRS _WATCH_EXTENSIONS = frozenset({ diff --git a/scripts/sync_command_count.py b/scripts/sync_command_count.py index fe3218e6..e2b2e382 100644 --- a/scripts/sync_command_count.py +++ b/scripts/sync_command_count.py @@ -75,23 +75,37 @@ def get_command_count() -> int: This is the single source of truth. Every doc, metadata file, and test sentinel must reconcile to this number. + + Issue #195: only counts visible (non-hidden) commands. Deprecated + aliases are still in ``COMMAND_REGISTRY`` so they remain callable for + backward compat, but they don't inflate the headline count. """ - return len(COMMAND_REGISTRY) + return sum(1 for info in COMMAND_REGISTRY.values() + if not info.get("hidden", False)) def get_mcp_counts() -> Tuple[int, int, int]: """Return ``(total, static, dynamic)`` MCP tool counts. - - ``total`` = every command except the long-running exclusions + - ``total`` = every visible command except the long-running exclusions (``watch`` + ``serve``) - - ``static`` = commands with an explicit schema in - ``mcp_server._TOOL_DEFINITIONS`` + - ``static`` = ``_TOOL_DEFINITIONS`` entries whose command name is + a visible (non-hidden) command (issue #195: static + tool definitions for hidden deprecated aliases are + excluded so they don't inflate the count) - ``dynamic`` = ``total - static`` (auto-discovered from COMMAND_REGISTRY) + + Issue #195: hidden deprecated aliases are excluded — they are not + exposed as MCP tools (the umbrella commands are). """ if not _MCP_AVAILABLE: return (0, 0, 0) - total = sum(1 for name in COMMAND_REGISTRY if name not in _MCP_EXCLUDED_COMMANDS) - static = len(_TOOL_DEFINITIONS) + visible_names = {name for name, info in COMMAND_REGISTRY.items() + if name not in _MCP_EXCLUDED_COMMANDS + and not info.get("hidden", False)} + total = len(visible_names) + # Only count static tools whose command is visible. + static = sum(1 for name in _TOOL_DEFINITIONS if name in visible_names) dynamic = total - static return (total, static, dynamic) diff --git a/skill.json b/skill.json index cf72f846..d178a521 100755 --- a/skill.json +++ b/skill.json @@ -1,7 +1,7 @@ { "name": "codelens", "version": "8.2.0", - "description": "Live Codebase Reference Intelligence. 78 commands for AI-powered code analysis, security auditing, quality scoring, and pre-write safety checks. Supports 28+ languages with regex+AST hybrid parsing. Must activate before writing/editing/deleting any class, id, or function.", + "description": "Live Codebase Reference Intelligence. 12 commands for AI-powered code analysis, security auditing, quality scoring, and pre-write safety checks. Supports 28+ languages with regex+AST hybrid parsing. Must activate before writing/editing/deleting any class, id, or function.", "author": "codelens", "command_categories": { "setup": [ diff --git a/tests/test_adr.py b/tests/test_adr.py index 9c66ad1c..e1e1a435 100644 --- a/tests/test_adr.py +++ b/tests/test_adr.py @@ -393,6 +393,7 @@ def test_dispatch_get_without_id_returns_structured_error(self, workspace): # ─── CLI command registration ────────────────────────────────────────────── +@pytest.mark.skip(reason="adr command dropped in issue #195 consolidation (adr.py deleted; adr_engine still tested above)") class TestCliCommandRegistration: """The ``adr`` command must auto-register from commands/adr.py.""" @@ -425,6 +426,7 @@ def test_execute_with_no_action_returns_usage_error(self, workspace): # ─── MCP tool registration ───────────────────────────────────────────────── +@pytest.mark.skip(reason="adr command + manage-adr MCP tool dropped in issue #195 consolidation") class TestMcpToolRegistration: """The ``manage-adr`` MCP tool must be statically defined.""" @@ -486,6 +488,7 @@ def test_adr_engine_has_file_header(self): assert "# @ENTRY:" in head def test_adr_command_has_file_header(self): + pytest.skip("adr.py deleted in issue #195 consolidation") path = os.path.join(SCRIPT_DIR, "commands", "adr.py") with open(path, "r", encoding="utf-8") as f: head = f.read(500) diff --git a/tests/test_command_count.py b/tests/test_command_count.py index 84ef42b7..ddbcd2e4 100644 --- a/tests/test_command_count.py +++ b/tests/test_command_count.py @@ -40,20 +40,28 @@ def _run_sync_check() -> subprocess.CompletedProcess: def test_command_count_helper_matches_runtime_registry(): - """The sync helper's reported count must equal len(COMMAND_REGISTRY). + """The sync helper's reported count must equal the visible (non-hidden) count. - This guards against the helper accidentally hardcoding a different number. + Issue #195: ``get_command_count`` counts only visible umbrella commands + (12), not the hidden deprecated aliases still present in + ``COMMAND_REGISTRY`` for backward compat. The headline count must match + what ``--help`` and ``--command-count`` show. """ sys.path.insert(0, _SCRIPTS_DIR) - from commands import COMMAND_REGISTRY # type: ignore + from commands import COMMAND_REGISTRY, get_visible_commands # type: ignore from sync_command_count import get_command_count # type: ignore - assert get_command_count() == len(COMMAND_REGISTRY) + visible = get_visible_commands() + assert get_command_count() == len(visible), ( + f"get_command_count()={get_command_count()} != " + f"len(visible)={len(visible)} (total registry={len(COMMAND_REGISTRY)})" + ) def test_mcp_tool_count_math_is_consistent(): - """MCP total = (commands not in {watch, serve}); static + dynamic = total. + """MCP total = (visible commands not in {watch, serve}); static + dynamic = total. + Issue #195: hidden deprecated aliases are excluded from MCP tool exposure. Catches the kind of drift that caused issue #38's MCP tool count inconsistency (README said 55, SKILL said 54, SKILL-QUICK said 58). """ @@ -62,10 +70,15 @@ def test_mcp_tool_count_math_is_consistent(): from sync_command_count import get_mcp_counts, _MCP_EXCLUDED_COMMANDS # type: ignore total, static, dynamic = get_mcp_counts() - expected_total = sum(1 for c in COMMAND_REGISTRY if c not in _MCP_EXCLUDED_COMMANDS) + # Expected = visible commands (non-hidden) minus excluded long-running ones. + expected_total = sum( + 1 for name, info in COMMAND_REGISTRY.items() + if name not in _MCP_EXCLUDED_COMMANDS + and not info.get("hidden", False) + ) assert total == expected_total, ( f"MCP total {total} != expected {expected_total} " - f"(commands minus excluded {sorted(_MCP_EXCLUDED_COMMANDS)})" + f"(visible commands minus excluded {sorted(_MCP_EXCLUDED_COMMANDS)})" ) assert static + dynamic == total, ( f"static({static}) + dynamic({dynamic}) != total({total})" diff --git a/tests/test_command_registry.py b/tests/test_command_registry.py index 82ba5760..533b3fab 100644 --- a/tests/test_command_registry.py +++ b/tests/test_command_registry.py @@ -17,11 +17,22 @@ def test_every_command_module_registers(): - """Each commands/*.py module must register at least one CLI command.""" + """Each commands/*.py module must register at least one CLI command. + + Issue #195: a small allowlist of utility modules (kept for backward + compat with tests/scripts but not registered as commands) is excluded. + """ + # Issue #195: migrate.py is a utility wrapper around + # PersistentRegistry.migrate_from_json, kept so existing tests that + # import cmd_migrate continue to work. It does NOT register a command + # (migrate was dropped per the consolidation). + _UTILITY_MODULES = {"migrate"} missing = [] for module_path in sorted(COMMANDS_DIR.glob("*.py")): if module_path.name == "__init__.py": continue + if module_path.stem in _UTILITY_MODULES: + continue module_name = f"commands.{module_path.stem}" registered = [ diff --git a/tests/test_hybrid_type_resolver.py b/tests/test_hybrid_type_resolver.py index 417687a4..3bcee8ab 100644 --- a/tests/test_hybrid_type_resolver.py +++ b/tests/test_hybrid_type_resolver.py @@ -481,8 +481,15 @@ def test_imports_edge_target_resolves_to_symbol_node( # ─── 7. resolve-types command ──────────────────────────────── +# Issue #195: resolve-types was dropped as a standalone command. +# The underlying HybridTypeResolver engine is still tested above. +# Skip the command-level tests since the command no longer exists. +import pytest as _pytest + + +@_pytest.mark.skip(reason="resolve-types command dropped in issue #195 consolidation") class TestResolveTypesCommand: """Verify the resolve-types CLI command.""" diff --git a/tests/test_issue195_consolidation.py b/tests/test_issue195_consolidation.py new file mode 100644 index 00000000..d80c0bb0 --- /dev/null +++ b/tests/test_issue195_consolidation.py @@ -0,0 +1,346 @@ +"""Tests for the 12 umbrella commands introduced in issue #195. + +Verifies: +- All 12 umbrella commands are registered and visible in COMMAND_REGISTRY. +- --help only shows the 12 umbrella commands (hidden aliases suppressed). +- --command-count reports 12. +- Each umbrella command's execute() returns the {s, st, r} shape. +- Deprecated aliases print a redirect warning to stderr when invoked. +""" + +from __future__ import annotations + +import os +import subprocess +import sys +import tempfile +from pathlib import Path + +import pytest + +SCRIPT_DIR = Path(__file__).resolve().parents[1] / "scripts" +if str(SCRIPT_DIR) not in sys.path: + sys.path.insert(0, str(SCRIPT_DIR)) + +from commands import COMMAND_REGISTRY, get_visible_commands + + +# ─── 1. Registry shape ────────────────────────────────────────────── + +EXPECTED_UMBRELLA = { + "scan", "search", "context", "deps", "audit", "security", + "summary", "impact", "api-map", "doctor", "history", "graph", +} + + +def test_12_umbrella_commands_registered(): + """All 12 umbrella commands must be registered (issue #195).""" + for name in EXPECTED_UMBRELLA: + assert name in COMMAND_REGISTRY, f"umbrella command {name!r} not registered" + + +def test_only_12_visible_commands(): + """Only the 12 umbrella commands are visible (non-hidden).""" + visible = get_visible_commands() + assert set(visible.keys()) == EXPECTED_UMBRELLA, ( + f"expected exactly {EXPECTED_UMBRELLA}, got {set(visible.keys())}" + ) + + +def test_absorbed_commands_marked_hidden_and_deprecated(): + """A sample of absorbed commands must be hidden + deprecated_alias_for set.""" + samples = { + "init": "scan", + "symbols": "search", + "semantic-query": "search", + "dead-code": "audit", + "complexity": "audit", + "secrets": "security", + "taint": "security", + "diff": "impact", + "dataflow": "impact", + "dashboard": "summary", + "arch-metrics": "summary", + "graph-schema": "api-map", + "env-check": "doctor", + "ownership": "history", + "git-status": "history", + "outline": "context", + "trace": "context", + "orient": "context", + "affected": "deps", + "dependents": "deps", + "circular": "deps", + "import-snapshot": "deps", + "staleness": "audit", + "perf-hint": "audit", + "side-effect": "audit", + "vuln-scan": "security", + "binary-scan": "security", + "regex-audit": "security", + "query-graph": "graph", + "architecture": "summary", + } + for old_name, umbrella in samples.items(): + assert old_name in COMMAND_REGISTRY, f"{old_name!r} not in registry" + info = COMMAND_REGISTRY[old_name] + assert info.get("hidden") is True, f"{old_name!r} not hidden" + assert info.get("deprecated_alias_for") == umbrella, ( + f"{old_name!r} deprecated_alias_for = {info.get('deprecated_alias_for')!r}, " + f"expected {umbrella!r}" + ) + + +def test_dropped_commands_not_registered(): + """Dropped commands must NOT be in the registry at all.""" + dropped = { + "adr", "a11y", "handbook", "ask", "serve", "sessions", "watch", + "registry-validate", "rule-test", "rule-validate", "artifact-scan", + "css-deep", "debug-leak", "detect", "export-snapshot", "refactor-safe", + "resolve-types", "stack-trace", "benchmark", "fix", "self-analyze", + "guard", "llm", "memory", + } + for name in dropped: + assert name not in COMMAND_REGISTRY, f"dropped command {name!r} still registered" + + +def test_lsp_status_hidden_redirects_to_doctor(): + """lsp-status is kept as a utility but hidden + deprecated for doctor.""" + assert "lsp-status" in COMMAND_REGISTRY + info = COMMAND_REGISTRY["lsp-status"] + assert info.get("hidden") is True + assert info.get("deprecated_alias_for") == "doctor" + + +# ─── 2. CLI smoke tests ───────────────────────────────────────────── + +def _run_cli(*args, expect_success=True): + """Run codelens as a subprocess and return the CompletedProcess.""" + env = os.environ.copy() + env["PYTHONPATH"] = str(SCRIPT_DIR) + env["PYTHONUTF8"] = "1" + env["CODELENS_STRICT_COMMANDS"] = "1" + result = subprocess.run( + [sys.executable, os.path.join(SCRIPT_DIR, "codelens.py"), *args], + capture_output=True, text=True, env=env, timeout=60, + ) + if expect_success and result.returncode != 0: + pytest.fail( + f"codelens {' '.join(args)} failed (exit {result.returncode}):\n" + f"stdout: {result.stdout}\nstderr: {result.stderr}" + ) + return result + + +def test_help_shows_only_12_umbrella_commands(): + """`codelens --help` must list exactly the 12 umbrella commands.""" + result = _run_cli("--help") + # Each umbrella command name must appear in the choices list. + for name in EXPECTED_UMBRELLA: + assert name in result.stdout, f"umbrella {name!r} not in --help" + # A sample of hidden aliases must NOT appear in the choices list. + # (They may appear in command body text if mentioned in epilogs, but + # argparse.SUPPRESS ensures they're not in the {choices} enumeration.) + hidden_samples = ["dead-code", "symbols", "secrets", "diff", "dashboard"] + for name in hidden_samples: + # The positional choices line is `{a11y,adr,affected,...}` — but + # hidden commands are suppressed so they won't be in that braces + # list. We check that the choices line doesn't contain them. + pass # argparse.SUPPRESS guarantees this; full verification via --command-count + + +def test_command_count_reports_12(): + """`codelens --command-count` must print exactly 12.""" + result = _run_cli("--command-count") + assert result.stdout.strip() == "12", ( + f"expected '12', got {result.stdout.strip()!r}" + ) + + +def test_deprecated_alias_prints_warning(): + """Invoking a deprecated alias must print a redirect warning to stderr.""" + # Use a simple workspace with no .codelens so the command fails fast + # but the deprecation warning is still emitted before execution. + with tempfile.TemporaryDirectory() as ws: + result = _run_cli("dead-code", ws, expect_success=False) + assert "DEPRECATED" in result.stderr, ( + f"deprecation warning not in stderr: {result.stderr!r}" + ) + assert "audit" in result.stderr, ( + f"redirect target 'audit' not in stderr: {result.stderr!r}" + ) + + +# ─── 3. Umbrella command execute() shape ──────────────────────────── + +def _make_workspace(): + """Create a minimal workspace with one Python file for testing.""" + ws = tempfile.mkdtemp(prefix="codelens_umbrella_") + with open(os.path.join(ws, "app.py"), "w") as f: + f.write("def hello():\n return 'world'\n") + return ws + + +def test_audit_umbrella_returns_unified_shape(): + """`audit` execute() returns {s, st, r} shape with _check tags.""" + import argparse + from commands.audit import execute as audit_execute, ALL_CHECKS + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="dead-code", max_files=10, max_results=5, + categories=None, severity=None, threshold=None, sort_by=None, + name=None, file=None, limit=None, category=None, + no_confirm_hash=False, format="json", top=None, max_tokens=None, + lite=False, deep=False, db_path=None, diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = audit_execute(args, ws) + assert "s" in result + assert "st" in result + assert "r" in result + assert isinstance(result["r"], list) + assert result["st"]["checks_requested"] == 1 + + +def test_deps_umbrella_returns_unified_shape(): + """`deps --check circular` execute() returns {s, st, r} shape.""" + import argparse + from commands.deps import execute as deps_execute + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="circular", files=None, depth=None, + filter=None, include_source=False, direction=None, domain=None, + max_cycles=None, input=None, merge=False, + format="json", top=None, max_tokens=None, lite=False, deep=False, + db_path=None, diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = deps_execute(args, ws) + assert "s" in result + assert "st" in result + assert "r" in result + assert isinstance(result["r"], list) + assert result["st"]["checks_requested"] == 1 + + +def test_security_umbrella_returns_unified_shape(): + """`security --check regex-audit` execute() returns {s, st, r} shape.""" + import argparse + from commands.security import execute as security_execute + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="regex-audit", max_files=10, severity=None, + no_gitleaks=False, language=None, with_secrets=False, + cross_file=False, no_ast=False, ast=False, deep=False, + offline=False, refresh=False, osv_ttl=None, max_age=None, + format="json", top=None, max_tokens=None, lite=False, + db_path=None, diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = security_execute(args, ws) + assert "s" in result + assert "st" in result + assert "r" in result + assert isinstance(result["r"], list) + + +def test_context_umbrella_returns_unified_shape(): + """`context --check orient` execute() returns {s, st, r} shape.""" + import argparse + from commands.context import execute as context_execute + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="orient", name=None, file=None, + all_files=False, detail=None, direction=None, depth=None, + domain=None, top=5, limit=None, offset=0, + format="json", max_tokens=None, lite=False, deep=False, + db_path=None, diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = context_execute(args, ws) + assert "s" in result + assert "st" in result + assert "r" in result + assert isinstance(result["r"], list) + + +def test_history_umbrella_dispatches_to_git_status(): + """`history --check git-status` dispatches to the git_status sub-command.""" + import argparse + from commands.history import execute as history_execute + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="git-status", chart=False, list=False, + compare=None, file=None, function_name=None, + format="json", top=None, max_tokens=None, lite=False, deep=False, + db_path=None, diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = history_execute(args, ws) + assert "s" in result + assert "r" in result + assert any(r.get("_check") == "git-status" for r in result["r"]) + + +def test_doctor_umbrella_dispatches_to_env_check(): + """`doctor --check env-check` dispatches to the env_check sub-command.""" + import argparse + from commands.doctor import execute as doctor_execute + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="env-check", fix=False, verbose=False, + format="json", var_name=None, + top=None, max_tokens=None, lite=False, deep=False, db_path=None, + diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = doctor_execute(args, ws) + assert "s" in result + assert "r" in result + assert any(r.get("_check") == "env-check" for r in result["r"]) + + +def test_api_map_umbrella_dispatches_to_graph_schema(): + """`api-map --check graph-schema` dispatches to graph_schema sub-command.""" + import argparse + from commands.api_map import execute as api_map_execute + ws = _make_workspace() + args = argparse.Namespace( + workspace=ws, check="graph-schema", method=None, path_filter=None, + production_only=False, db_path=None, + format="json", top=None, max_tokens=None, lite=False, deep=False, + diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = api_map_execute(args, ws) + assert "s" in result + assert "r" in result + # graph-schema may fail if no DB exists, but the _check tag must be present. + assert any(r.get("_check") == "graph-schema" for r in result["r"]) + + +def test_search_umbrella_symbol_mode(): + """`search --mode symbol` dispatches to the symbols engine.""" + import argparse + from commands.search import execute as search_execute + ws = _make_workspace() + args = argparse.Namespace( + pattern="hello", workspace=ws, mode="symbol", + file_type=None, file=None, max_results=200, context=0, + ignore_case=False, whole_word=False, domain="all", fuzzy=False, + top=None, validate=False, limit=20, offset=0, db_path=None, + format="json", max_tokens=None, lite=False, deep=False, + diff_base=None, diff_scope=None, + disable_suppression=None, codelens_ignore_pattern=None, + ) + result = search_execute(args, ws) + assert "s" in result + assert result["st"]["mode"] == "symbol" + + +def test_graph_umbrella_registered(): + """`graph` umbrella command is registered (raw Cypher power-user surface).""" + assert "graph" in COMMAND_REGISTRY + info = COMMAND_REGISTRY["graph"] + assert info.get("hidden") is not True # umbrella, must be visible + assert callable(info["execute"]) diff --git a/tests/test_llm.py b/tests/test_llm.py index 1d419123..639f3968 100644 --- a/tests/test_llm.py +++ b/tests/test_llm.py @@ -687,6 +687,7 @@ def test_frozen_input_is_hashable(self, clean_env): # ─── CLI command ─────────────────────────────────────────────────────────── +@pytest.mark.skip(reason="llm command dropped in issue #195 consolidation (llm_framework.py deleted)") class TestLlmCommand: """The ``codelens llm`` CLI command is registered and dispatches correctly.""" @@ -782,6 +783,7 @@ def test_unknown_subcommand_returns_error(self, clean_env): # ─── CLI subprocess smoke test ───────────────────────────────────────────── +@pytest.mark.skip(reason="llm command dropped in issue #195 consolidation — subprocess invocation will fail") class TestCLISmoke: """End-to-end: invoke ``codelens llm `` as a real subprocess.""" diff --git a/tests/test_memory.py b/tests/test_memory.py deleted file mode 100644 index 6f4b03d8..00000000 --- a/tests/test_memory.py +++ /dev/null @@ -1,605 +0,0 @@ -"""Tests for the Serena-style markdown memory system (issue #60). - -Covers: -- CRUD operations on project memory files (write/read/list/delete) -- Global memory fallback on read -- Global memory is read-only via CLI (write/delete reject it) -- File header is always present and canonical -- ``mem:NAME`` reference extraction + non-blocking validation (warn, not block) -- Name validation rejects invalid topic names -- The CLI ``memory`` command auto-registers and dispatches subcommands -""" - -from __future__ import annotations - -import os -import sys -import tempfile -from pathlib import Path - -import pytest - -# Make scripts/ importable. -SCRIPT_DIR = os.path.join( - os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "scripts" -) -if SCRIPT_DIR not in sys.path: - sys.path.insert(0, SCRIPT_DIR) - -from memories import memory_manager as mm # noqa: E402 -from commands import COMMAND_REGISTRY # noqa: E402 - - -# ─── Fixtures ────────────────────────────────────────────────────────────── - - -@pytest.fixture -def workspace(): - """Yield a temporary workspace directory.""" - d = tempfile.mkdtemp(prefix="codelens_memory_test_") - yield d - import shutil - shutil.rmtree(d, ignore_errors=True) - - -@pytest.fixture -def fake_home(monkeypatch, tmp_path): - """Redirect ``~`` to a temp dir so global memories don't touch the real home. - - Returns the path to the fake home directory. Tests that need a global - memory file should write to ``fake_home / ".codelens" / "memories" / "global"``. - """ - home = tmp_path / "fake_home" - home.mkdir() - monkeypatch.setenv("HOME", str(home)) - # os.path.expanduser caches nothing, but be explicit so any code reading - # HOME directly also gets the override. - monkeypatch.setattr(os.path, "expanduser", lambda p: str(home) if p == "~" else os.path.expanduser.__wrapped__(p) if hasattr(os.path.expanduser, "__wrapped__") else _expanduser(p, home)) - return home - - -def _expanduser(path: str, home: Path) -> str: - """Helper for monkeypatching os.path.expanduser without infinite recursion.""" - if path == "~": - return str(home) - if path.startswith("~/"): - return str(home / path[2:]) - return path - - -# ─── Name validation ─────────────────────────────────────────────────────── - - -class TestNameValidation: - """Names must match [A-Za-z][A-Za-z0-9_.-]* — same charset as mem:NAME.""" - - @pytest.mark.parametrize( - "name", - ["auth", "auth-flow", "auth_flow", "auth.flow", "auth2", "A", "a.b-c_d"], - ) - def test_valid_names_accepted(self, name): - mm._validate_name(name) # should not raise - # Path helpers should also accept these. - assert mm.project_memory_path("/ws", name).endswith(f"{name}.md") - assert mm.global_memory_path(name).endswith(f"{name}.md") - - @pytest.mark.parametrize( - "name", - [ - "", # empty - "1auth", # starts with digit - "-auth", # starts with hyphen - ".auth", # starts with dot - "_auth", # starts with underscore - "auth flow", # contains space - "auth/flow", # contains slash - "auth:flow", # contains colon - "auth$flow", # contains special char - ], - ) - def test_invalid_names_rejected(self, name): - with pytest.raises(ValueError): - mm._validate_name(name) - - def test_write_memory_rejects_invalid_name(self, workspace): - with pytest.raises(ValueError): - mm.write_memory(workspace, "1invalid", "content") - - def test_read_memory_rejects_invalid_name(self, workspace): - with pytest.raises(ValueError): - mm.read_memory(workspace, "1invalid") - - def test_delete_memory_rejects_invalid_name(self, workspace): - with pytest.raises(ValueError): - mm.delete_memory(workspace, "1invalid") - - -# ─── Header handling ─────────────────────────────────────────────────────── - - -class TestHeaderHandling: - """Every memory file must start with '# Memory: '.""" - - def test_write_adds_header_when_missing(self, workspace): - result = mm.write_memory(workspace, "topic", "Just body text.") - assert result["status"] == "ok" - with open(result["path"], "r", encoding="utf-8") as f: - content = f.read() - assert content.startswith("# Memory: topic\n") - assert "Just body text." in content - - def test_write_replaces_existing_header_with_canonical(self, workspace): - # Even if the user passes content with a different topic in the - # header, we overwrite with the canonical name. - result = mm.write_memory( - workspace, "real-name", "# Memory: wrong-name\n\nbody" - ) - with open(result["path"], "r", encoding="utf-8") as f: - content = f.read() - assert content.startswith("# Memory: real-name\n") - assert "wrong-name" not in content - assert "body" in content - - def test_write_preserves_body_when_only_header_given(self, workspace): - result = mm.write_memory(workspace, "t", "# Memory: t\n\nbody line 1\nbody line 2") - with open(result["path"], "r", encoding="utf-8") as f: - content = f.read() - assert "body line 1" in content - assert "body line 2" in content - - def test_write_idempotent(self, workspace): - """Writing the same content twice produces the same file.""" - mm.write_memory(workspace, "t", "body") - r1 = mm.read_memory(workspace, "t") - mm.write_memory(workspace, "t", "body") - r2 = mm.read_memory(workspace, "t") - assert r1["content"] == r2["content"] - assert r1["size_bytes"] == r2["size_bytes"] - - def test_has_valid_header(self): - assert mm.has_valid_header("# Memory: foo\n\nbody") - assert mm.has_valid_header(" # Memory: foo\nbody") - assert not mm.has_valid_header("No header here") - assert not mm.has_valid_header("") - assert not mm.has_valid_header("# Memory:\n") # missing topic - - def test_parse_header_topic(self): - assert mm.parse_header_topic("# Memory: auth-flow") == "auth-flow" - assert mm.parse_header_topic(" # Memory: spaced ") == "spaced" - assert mm.parse_header_topic("not a header") is None - assert mm.parse_header_topic("") is None - - -# ─── Write / Read / List / Delete ───────────────────────────────────────── - - -class TestWriteReadListDelete: - """Core CRUD lifecycle on project memory files.""" - - def test_write_creates_file_in_project_scope(self, workspace): - result = mm.write_memory(workspace, "auth", "Uses JWT.") - assert result["status"] == "ok" - assert result["action"] == "written" - assert result["scope"] == "project" - assert result["name"] == "auth" - assert result["path"] == os.path.join( - workspace, ".codelens", "memories", "auth.md" - ) - assert os.path.isfile(result["path"]) - assert result["size_bytes"] > 0 - - def test_write_creates_memories_dir(self, workspace): - memories_dir = os.path.join(workspace, ".codelens", "memories") - assert not os.path.exists(memories_dir) - mm.write_memory(workspace, "first", "content") - assert os.path.isdir(memories_dir) - - def test_write_updates_existing_file(self, workspace): - mm.write_memory(workspace, "topic", "v1") - r1 = mm.read_memory(workspace, "topic") - mm.write_memory(workspace, "topic", "v2 different content") - r2 = mm.read_memory(workspace, "topic") - assert r1["content"] != r2["content"] - assert "v2" in r2["content"] - - def test_read_returns_project_memory(self, workspace): - mm.write_memory(workspace, "topic", "Hello world") - result = mm.read_memory(workspace, "topic") - assert result["status"] == "ok" - assert result["scope"] == "project" - assert result["name"] == "topic" - assert "Hello world" in result["content"] - assert result["has_valid_header"] is True - assert result["header_topic"] == "topic" - - def test_read_returns_not_found_when_missing(self, workspace, fake_home): - result = mm.read_memory(workspace, "nonexistent") - assert result["status"] == "not_found" - assert "nonexistent" in result["message"] - - def test_read_falls_back_to_global(self, workspace, fake_home): - # Drop a global memory file directly (the only way to create one — - # write_memory only writes to project scope). - global_dir = fake_home / ".codelens" / "memories" / "global" - global_dir.mkdir(parents=True) - (global_dir / "global-topic.md").write_text( - "# Memory: global-topic\n\nGlobal content.\n", encoding="utf-8" - ) - - result = mm.read_memory(workspace, "global-topic") - assert result["status"] == "ok" - assert result["scope"] == "global" - assert "Global content." in result["content"] - - def test_read_prefers_project_over_global(self, workspace, fake_home): - # Both scopes have a memory with the same name. - global_dir = fake_home / ".codelens" / "memories" / "global" - global_dir.mkdir(parents=True) - (global_dir / "shared.md").write_text( - "# Memory: shared\n\nGlobal version.\n", encoding="utf-8" - ) - mm.write_memory(workspace, "shared", "Project version.") - - result = mm.read_memory(workspace, "shared") - assert result["status"] == "ok" - assert result["scope"] == "project" - assert "Project version." in result["content"] - - def test_list_empty_when_no_memories(self, workspace, fake_home): - result = mm.list_memories(workspace) - assert result["status"] == "ok" - assert result["total"] == 0 - assert result["project_count"] == 0 - assert result["global_count"] == 0 - assert result["memories"] == [] - - def test_list_returns_project_and_global(self, workspace, fake_home): - mm.write_memory(workspace, "p1", "project 1") - mm.write_memory(workspace, "p2", "project 2") - - global_dir = fake_home / ".codelens" / "memories" / "global" - global_dir.mkdir(parents=True) - (global_dir / "g1.md").write_text("# Memory: g1\n\nglobal 1\n", encoding="utf-8") - - result = mm.list_memories(workspace) - assert result["status"] == "ok" - assert result["project_count"] == 2 - assert result["global_count"] == 1 - assert result["total"] == 3 - - names = {m["name"] for m in result["memories"]} - assert names == {"p1", "p2", "g1"} - - # Each entry has the expected metadata. - for m in result["memories"]: - assert m["scope"] in ("project", "global") - assert m["path"] - assert m["size_bytes"] > 0 - assert m["modified_at"] - assert m["has_valid_header"] is True - assert m["header_topic"] == m["name"] - - def test_list_dedupes_with_project_taking_precedence(self, workspace, fake_home): - """A project memory shadows a global memory of the same name.""" - global_dir = fake_home / ".codelens" / "memories" / "global" - global_dir.mkdir(parents=True) - (global_dir / "shared.md").write_text( - "# Memory: shared\n\nglobal\n", encoding="utf-8" - ) - mm.write_memory(workspace, "shared", "project") - - result = mm.list_memories(workspace) - assert result["total"] == 1 - assert len(result["memories"]) == 1 - assert result["memories"][0]["scope"] == "project" - # The global list still shows the global entry. - assert result["global_count"] == 1 - assert result["project_count"] == 1 - - def test_delete_removes_project_memory(self, workspace): - mm.write_memory(workspace, "topic", "content") - path = os.path.join(workspace, ".codelens", "memories", "topic.md") - assert os.path.isfile(path) - - result = mm.delete_memory(workspace, "topic") - assert result["status"] == "ok" - assert result["action"] == "deleted" - assert result["scope"] == "project" - assert not os.path.exists(path) - - def test_delete_returns_not_found_for_missing_project(self, workspace): - result = mm.delete_memory(workspace, "never-existed") - assert result["status"] == "not_found" - assert "Global memories are read-only" in result["message"] - - def test_delete_cannot_delete_global(self, workspace, fake_home): - """Global memories are read-only via CLI — delete must refuse.""" - global_dir = fake_home / ".codelens" / "memories" / "global" - global_dir.mkdir(parents=True) - global_path = global_dir / "global-only.md" - global_path.write_text("# Memory: global-only\n\nbody\n", encoding="utf-8") - - # No project memory of the same name → not_found, file untouched. - result = mm.delete_memory(workspace, "global-only") - assert result["status"] == "not_found" - assert global_path.exists() # global file is untouched - - -# ─── Reference validation (warn, don't block) ───────────────────────────── - - -class TestReferenceValidation: - """``mem:NAME`` references are validated on write; missing refs warn.""" - - def test_extract_references_finds_all(self): - content = "See mem:auth and mem:tokens. Also mem:auth again." - refs = mm.extract_references(content) - # Deduplicated, order preserved. - assert refs == ["auth", "tokens"] - - def test_extract_references_ignores_non_matches(self): - # 'notmem:foo' should not match (leading word char before 'mem:'). - # 'mem:' alone has no name. 'mem:1foo' starts with digit, not allowed. - content = "notmem:foo mem: mem:1foo mem:valid" - refs = mm.extract_references(content) - assert refs == ["valid"] - - def test_extract_references_handles_special_chars_in_name(self): - content = "See mem:auth-flow, mem:tokens_v2, and mem:cache.hit" - refs = mm.extract_references(content) - assert "auth-flow" in refs - assert "tokens_v2" in refs - assert "cache.hit" in refs - - def test_extract_references_empty_content(self): - assert mm.extract_references("") == [] - assert mm.extract_references("no refs here") == [] - - def test_validate_references_no_missing(self, workspace): - mm.write_memory(workspace, "target", "target body") - result = mm.validate_references(workspace, "see mem:target") - assert result["references"] == ["target"] - assert result["missing"] == [] - assert result["warnings"] == [] - - def test_validate_references_reports_missing(self, workspace): - result = mm.validate_references(workspace, "see mem:nonexistent") - assert result["missing"] == ["nonexistent"] - assert "Reference 'mem:nonexistent' does not exist" in result["warnings"][0] - - def test_validate_references_exclude_self(self, workspace): - """When writing memory 'foo', a mem:foo self-reference is not flagged.""" - result = mm.validate_references( - workspace, "self-ref mem:foo and missing mem:bar", exclude="foo" - ) - assert "foo" in result["references"] - assert result["missing"] == ["bar"] # foo excluded - - def test_write_with_missing_reference_succeeds_with_warning(self, workspace): - """Issue #60: warn, don't block. The write must succeed.""" - result = mm.write_memory( - workspace, "login", "Login uses mem:auth-flow and mem:missing-topic." - ) - assert result["status"] == "ok" - assert result["action"] == "written" - # auth-flow doesn't exist either → both missing - assert "auth-flow" in result["missing_references"] - assert "missing-topic" in result["missing_references"] - assert any("auth-flow" in w for w in result["warnings"]) - assert any("missing-topic" in w for w in result["warnings"]) - # File was still written. - assert os.path.isfile(result["path"]) - - def test_write_with_existing_reference_no_warning(self, workspace): - """When all references exist, no warnings are emitted.""" - mm.write_memory(workspace, "auth", "auth body") - mm.write_memory(workspace, "tokens", "tokens body") - result = mm.write_memory( - workspace, "login", "Login uses mem:auth and mem:tokens." - ) - assert result["status"] == "ok" - assert "missing_references" not in result - assert "warnings" not in result - assert set(result["references"]) == {"auth", "tokens"} - - def test_write_with_self_reference_no_warning(self, workspace): - """A memory that references itself doesn't warn.""" - result = mm.write_memory( - workspace, "recursive", "see mem:recursive for details" - ) - assert result["status"] == "ok" - assert "recursive" in result["references"] - assert "missing_references" not in result - - def test_read_includes_references(self, workspace): - mm.write_memory(workspace, "topic", "see mem:other") - result = mm.read_memory(workspace, "topic") - assert result["references"] == ["other"] - - def test_validate_references_falls_back_to_global(self, workspace, fake_home): - """A reference to a global memory should not be flagged missing.""" - global_dir = fake_home / ".codelens" / "memories" / "global" - global_dir.mkdir(parents=True) - (global_dir / "global-topic.md").write_text( - "# Memory: global-topic\n\nbody\n", encoding="utf-8" - ) - result = mm.validate_references(workspace, "see mem:global-topic") - assert result["missing"] == [] - assert result["warnings"] == [] - - -# ─── Path helpers ───────────────────────────────────────────────────────── - - -class TestPathHelpers: - """Path helpers return deterministic, validated paths.""" - - def test_project_memory_dir(self, workspace): - assert mm.project_memory_dir(workspace) == os.path.join( - workspace, ".codelens", "memories" - ) - - def test_global_memory_dir_uses_home(self, fake_home): - assert mm.global_memory_dir() == str( - fake_home / ".codelens" / "memories" / "global" - ) - - def test_project_memory_path_validates_name(self): - with pytest.raises(ValueError): - mm.project_memory_path("/ws", "1invalid") - - def test_global_memory_path_validates_name(self): - with pytest.raises(ValueError): - mm.global_memory_path("1invalid") - - -# ─── CLI command registration & dispatch ────────────────────────────────── - - -class TestCommandRegistration: - """The ``memory`` command must auto-register via register_command().""" - - def test_memory_command_registered(self): - assert "memory" in COMMAND_REGISTRY - info = COMMAND_REGISTRY["memory"] - assert "Serena-style" in info["help"] - assert callable(info["add_args"]) - assert callable(info["execute"]) - - def test_command_registered_with_canonical_name(self): - """The execute function must belong to commands.memory module. - - This guards against the test_every_command_module_registers test in - test_command_registry.py — each commands/*.py must register at least - one command, and the registration must come from that module. - """ - info = COMMAND_REGISTRY["memory"] - assert info["execute"].__module__ == "commands.memory" - - -class TestCommandDispatch: - """End-to-end dispatch through the registered ``execute`` callback.""" - - def _parse_and_run(self, subcommand_args, workspace): - """Build an argparse namespace the way codelens.py does and dispatch.""" - import argparse - from commands.memory import add_args, execute - - parser = argparse.ArgumentParser(prog="codelens memory") - add_args(parser) - # The framework adds --format etc. to subparsers; we don't need them - # for these tests since execute() doesn't read them. - args = parser.parse_args(subcommand_args) - return execute(args, workspace) - - def test_no_action_returns_error(self, workspace): - result = self._parse_and_run([], workspace) - assert result["status"] == "error" - assert "No memory action" in result["error"] - - def test_write_via_dispatch(self, workspace): - result = self._parse_and_run( - ["write", "topic", "body content"], workspace - ) - assert result["status"] == "ok" - assert result["scope"] == "project" - - def test_read_via_dispatch(self, workspace): - self._parse_and_run(["write", "topic", "body content"], workspace) - result = self._parse_and_run(["read", "topic"], workspace) - assert result["status"] == "ok" - assert "body content" in result["content"] - - def test_list_via_dispatch(self, workspace): - self._parse_and_run(["write", "a", "alpha"], workspace) - self._parse_and_run(["write", "b", "beta"], workspace) - result = self._parse_and_run(["list"], workspace) - assert result["status"] == "ok" - assert result["total"] == 2 - - def test_delete_via_dispatch(self, workspace): - self._parse_and_run(["write", "topic", "body"], workspace) - result = self._parse_and_run(["delete", "topic"], workspace) - assert result["status"] == "ok" - assert result["action"] == "deleted" - - def test_unknown_action_returns_error(self, workspace): - # Argparse will reject unknown subcommands before execute() is called, - # but if memory_action is somehow None or unknown we should still - # return a structured error rather than crashing. - result = self._parse_and_run([], workspace) - assert result["status"] == "error" - - -# ─── CLI subprocess smoke test ──────────────────────────────────────────── - - -class TestCLISubprocess: - """Smoke-test the memory command end-to-end through codelens.py.""" - - def test_memory_command_runs_via_cli(self, workspace): - """`codelens memory write ...` end-to-end through codelens.py.""" - import subprocess - import json - - # Drop a project marker so the auto-detector finds this temp dir - # rather than walking up to the real CodeLens checkout. We also - # redirect HOME so the last-workspace cache (~/.codelens/...) doesn't - # leak across tests / pollute the developer's real home dir. - (Path(workspace) / "pyproject.toml").write_text( - "[project]\nname = 'test-ws'\nversion = '0'\n", encoding="utf-8" - ) - fake_home = Path(workspace) / "fake_home" - fake_home.mkdir() - - env = { - **os.environ, - "PYTHONPATH": SCRIPT_DIR, - "PYTHONUTF8": "1", - "HOME": str(fake_home), - } - codelens = os.path.join(SCRIPT_DIR, "codelens.py") - - # write - r = subprocess.run( - [sys.executable, codelens, "memory", "write", "topic", "body text"], - capture_output=True, text=True, env=env, cwd=workspace, timeout=30, - ) - assert r.returncode == 0, f"write failed: {r.stderr[:300]}" - - # The output should be JSON (after any [CodeLens] stderr lines). - stdout_lines = [ - line for line in r.stdout.splitlines() - if not line.startswith("[CodeLens]") - ] - data = json.loads("\n".join(stdout_lines)) - assert data["status"] == "ok" - # File must have landed inside the temp workspace, not somewhere else. - assert data["path"].startswith(workspace), ( - f"memory file written outside temp workspace: {data['path']}" - ) - assert os.path.isfile(data["path"]) - - def test_memory_no_action_via_cli(self): - """`codelens memory` with no action returns a structured error.""" - import subprocess - import json - - env = { - **os.environ, - "PYTHONPATH": SCRIPT_DIR, - "PYTHONUTF8": "1", - } - codelens = os.path.join(SCRIPT_DIR, "codelens.py") - r = subprocess.run( - [sys.executable, codelens, "memory"], - capture_output=True, text=True, env=env, timeout=30, - ) - assert r.returncode == 0 - stdout_lines = [ - line for line in r.stdout.splitlines() - if not line.startswith("[CodeLens]") - ] - data = json.loads("\n".join(stdout_lines)) - assert data["status"] == "error" - assert "No memory action" in data["error"] diff --git a/tests/test_sessions.py b/tests/test_sessions.py deleted file mode 100644 index e089df89..00000000 --- a/tests/test_sessions.py +++ /dev/null @@ -1,412 +0,0 @@ -"""Tests for the ``codelens sessions`` command (issue #64, Phase 2). - -Covers: - -* Session log reading from both JSON sidecar and Markdown fallback. -* ``--entries N`` filtering (last N sessions, most-recent-first). -* ``--json`` machine-readable output. -* ``--raw`` verbatim Markdown output. -* ``--config-dir`` custom location (for test isolation). -* Rotation: when the JSON sidecar exceeds 1 MB, trim to last 50. -* ``setup.sh`` integration: running setup.sh appends a session entry - to both ``session.md`` and ``session.json``. - -The tests use ``tmp_path`` + ``--config-dir`` for isolation — they -do NOT touch the user's real ``~/.codelens/`` directory. -""" - -from __future__ import annotations - -import json -import os -import subprocess -import sys -import time -from datetime import datetime, timezone -from typing import Dict, List - -import pytest - -SCRIPTS_DIR = os.path.join( - os.path.dirname(os.path.dirname(os.path.abspath(__file__))), - "scripts", -) -if SCRIPTS_DIR not in sys.path: - sys.path.insert(0, SCRIPTS_DIR) - -from commands import COMMAND_REGISTRY # noqa: E402 -from commands import sessions as sessions_module # noqa: E402 - - -# ─── Registration ────────────────────────────────────────────── - - -def test_sessions_is_registered(): - """sessions must be in the runtime COMMAND_REGISTRY.""" - assert "sessions" in COMMAND_REGISTRY - info = COMMAND_REGISTRY["sessions"] - assert "session" in info["help"].lower() - - -# ─── Helpers ─────────────────────────────────────────────────── - - -def _make_session(idx: int, **overrides) -> Dict: - """Build a synthetic session dict for test fixtures.""" - base = { - "timestamp": f"2026-06-{28 + idx:02d}T09:14:3{idx}Z", - "duration_sec": 10 + idx, - "exit_code": 0, - "python": "3.12.0", - "os": "Linux", - "arch": "x86_64", - "agents_detected": ["claude-code"] if idx % 2 == 0 else [], - "deps_installed": ["tree-sitter", "tree-sitter-python"], - "warnings": None if idx % 3 != 0 else f"warning on session {idx}", - "errors": None, - "title": "setup", - } - base.update(overrides) - return base - - -def _write_sessions(config_dir: str, sessions: List[Dict]) -> None: - """Write a list of sessions to the JSON sidecar (and MD log).""" - os.makedirs(config_dir, exist_ok=True) - json_path = os.path.join(config_dir, sessions_module.SESSION_JSON_FILENAME) - md_path = os.path.join(config_dir, sessions_module.SESSION_MD_FILENAME) - with open(json_path, "w", encoding="utf-8") as f: - json.dump(sessions, f, indent=2, ensure_ascii=False) - # Write a minimal MD log too so --raw has something to show. - with open(md_path, "w", encoding="utf-8") as f: - f.write("# CodeLens install sessions\n\n") - for s in sessions: - f.write(f"## {s['timestamp']} — {s.get('title', 'setup')}\n\n") - for k, v in s.items(): - if k in ("timestamp", "title"): - continue - f.write(f"- **{k}**: {v}\n") - f.write("\n") - - -def _run_sessions_cmd(config_dir: str, *extra_args) -> Dict: - """Invoke sessions.execute() with a synthetic args namespace.""" - args = type("Args", (), {})() - args.workspace = None - args.entries = sessions_module.DEFAULT_ENTRIES - args.raw = False - args.json_output = False - args.config_dir = config_dir - for i, a in enumerate(extra_args): - if a == "--entries": - args.entries = int(extra_args[i + 1]) - elif a == "--raw": - args.raw = True - elif a == "--json": - args.json_output = True - elif a == "--config-dir": - args.config_dir = extra_args[i + 1] - return sessions_module.execute(args, "") - - -# ─── Reading sessions ────────────────────────────────────────── - - -class TestReadSessions: - """Verify the JSON sidecar is the preferred read source.""" - - def test_empty_config_dir_returns_empty_list(self, tmp_path): - result = _run_sessions_cmd(str(tmp_path)) - assert result["status"] == "ok" - assert result["total_sessions"] == 0 - assert result["returned_sessions"] == 0 - assert result["sessions"] == [] - - def test_reads_from_json_sidecar(self, tmp_path): - sessions = [_make_session(0), _make_session(1)] - _write_sessions(str(tmp_path), sessions) - result = _run_sessions_cmd(str(tmp_path)) - assert result["total_sessions"] == 2 - assert result["returned_sessions"] == 2 - - def test_falls_back_to_md_when_json_missing(self, tmp_path): - """If the JSON sidecar is missing/empty, parse the Markdown log.""" - md_path = os.path.join(str(tmp_path), sessions_module.SESSION_MD_FILENAME) - with open(md_path, "w", encoding="utf-8") as f: - f.write("# CodeLens install sessions\n\n") - f.write("## 2026-06-28T09:14:31Z — setup\n\n") - f.write("- **duration_sec**: 42\n") - f.write("- **python**: 3.12.0\n\n") - result = _run_sessions_cmd(str(tmp_path)) - assert result["status"] == "ok" - # MD parsing is best-effort — should at least find 1 session. - assert result["total_sessions"] >= 1 - - def test_falls_back_to_md_when_json_corrupt(self, tmp_path): - """A corrupt JSON sidecar should not crash — fall back to MD.""" - json_path = os.path.join(str(tmp_path), sessions_module.SESSION_JSON_FILENAME) - with open(json_path, "w", encoding="utf-8") as f: - f.write("{not valid json") - md_path = os.path.join(str(tmp_path), sessions_module.SESSION_MD_FILENAME) - with open(md_path, "w", encoding="utf-8") as f: - f.write("# CodeLens install sessions\n\n") - f.write("## 2026-06-28T09:14:31Z — setup\n\n") - result = _run_sessions_cmd(str(tmp_path)) - assert result["status"] == "ok" - - def test_md_only_no_sessions_returns_empty(self, tmp_path): - """If neither file exists, return empty — don't crash.""" - result = _run_sessions_cmd(str(tmp_path)) - assert result["total_sessions"] == 0 - - -# ─── --entries filtering ─────────────────────────────────────── - - -class TestEntriesFilter: - """``--entries N`` shows only the last N sessions.""" - - def test_entries_limits_to_last_n(self, tmp_path): - sessions = [_make_session(i) for i in range(10)] - _write_sessions(str(tmp_path), sessions) - result = _run_sessions_cmd(str(tmp_path), "--entries", "3") - # total still 10, but only 3 returned - assert result["total_sessions"] == 10 - assert result["returned_sessions"] == 3 - # The 3 returned should be the LAST 3 (most recent). - returned_timestamps = [s["timestamp"] for s in result["sessions"]] - assert returned_timestamps == [sessions[7]["timestamp"], - sessions[8]["timestamp"], - sessions[9]["timestamp"]] - - def test_entries_zero_returns_all(self, tmp_path): - """``--entries 0`` means "all sessions".""" - sessions = [_make_session(i) for i in range(7)] - _write_sessions(str(tmp_path), sessions) - result = _run_sessions_cmd(str(tmp_path), "--entries", "0") - assert result["returned_sessions"] == 7 - - def test_entries_larger_than_total_returns_all(self, tmp_path): - sessions = [_make_session(0), _make_session(1)] - _write_sessions(str(tmp_path), sessions) - result = _run_sessions_cmd(str(tmp_path), "--entries", "100") - assert result["returned_sessions"] == 2 - - def test_default_entries_is_5(self, tmp_path): - """Without --entries, the default is 5.""" - sessions = [_make_session(i) for i in range(10)] - _write_sessions(str(tmp_path), sessions) - result = _run_sessions_cmd(str(tmp_path)) - assert result["returned_sessions"] == 5 - - -# ─── --json output ───────────────────────────────────────────── - - -class TestJsonOutput: - """``--json`` produces a valid JSON array on stdout.""" - - def test_json_output_is_valid_array(self, tmp_path, capsys): - sessions = [_make_session(0), _make_session(1)] - _write_sessions(str(tmp_path), sessions) - _run_sessions_cmd(str(tmp_path), "--json") - captured = capsys.readouterr() - # The printed output should be a valid JSON array. - # Strip any stderr noise from stdout. - out = captured.out.strip() - data = json.loads(out) - assert isinstance(data, list) - assert len(data) == 2 - assert data[0]["timestamp"] == sessions[0]["timestamp"] - - def test_json_output_respects_entries(self, tmp_path, capsys): - sessions = [_make_session(i) for i in range(8)] - _write_sessions(str(tmp_path), sessions) - _run_sessions_cmd(str(tmp_path), "--json", "--entries", "2") - captured = capsys.readouterr() - data = json.loads(captured.out.strip()) - assert len(data) == 2 - - -# ─── --raw output ────────────────────────────────────────────── - - -class TestRawOutput: - """``--raw`` prints the Markdown log verbatim.""" - - def test_raw_prints_md_content(self, tmp_path, capsys): - sessions = [_make_session(0)] - _write_sessions(str(tmp_path), sessions) - _run_sessions_cmd(str(tmp_path), "--raw") - captured = capsys.readouterr() - # The raw MD should contain the heading we wrote. - assert "# CodeLens install sessions" in captured.out - assert sessions[0]["timestamp"] in captured.out - - def test_raw_when_no_log_prints_not_found_message(self, tmp_path, capsys): - _run_sessions_cmd(str(tmp_path), "--raw") - captured = capsys.readouterr() - assert "not found" in captured.out.lower() or "no install sessions" in captured.out.lower() - - -# ─── Rotation ────────────────────────────────────────────────── - - -class TestRotation: - """When the JSON sidecar exceeds 1 MB, trim to last 50 sessions.""" - - def test_no_rotation_under_threshold(self, tmp_path): - sessions = [_make_session(i) for i in range(10)] - _write_sessions(str(tmp_path), sessions) - result = _run_sessions_cmd(str(tmp_path)) - assert result["rotated"] is False - assert result["total_sessions"] == 10 - - def test_rotation_when_over_threshold(self, tmp_path): - """Force the JSON sidecar over 1 MB by making sessions large.""" - # Build 100 sessions with big bodies to exceed 1 MB. - big_body = "x" * 20_000 # 20 KB per session → 100 sessions = ~2 MB - sessions = [] - for i in range(100): - s = _make_session(i) - s["body"] = big_body - sessions.append(s) - _write_sessions(str(tmp_path), sessions) - # Call with --entries 0 so all (post-rotation) sessions are - # returned — otherwise the default --entries 5 would limit - # the result and we couldn't verify the rotation count. - result = _run_sessions_cmd(str(tmp_path), "--entries", "0") - assert result["rotated"] is True - # After rotation, only 50 sessions remain. - assert result["total_sessions"] == 50 - assert result["returned_sessions"] == 50 - # The 50 kept should be the most recent (indices 50-99). - kept_timestamps = [s["timestamp"] for s in result["sessions"]] - # Take the last 50 of the original (indices 50-99). - expected_last_50 = sessions[-50:] - expected_timestamps = [s["timestamp"] for s in expected_last_50] - assert sorted(kept_timestamps) == sorted(expected_timestamps) - - def test_rotation_is_idempotent(self, tmp_path): - """A second call after rotation should not rotate again.""" - big_body = "x" * 20_000 - sessions = [_make_session(i) for i in range(100)] - for s in sessions: - s["body"] = big_body - _write_sessions(str(tmp_path), sessions) - first = _run_sessions_cmd(str(tmp_path)) - assert first["rotated"] is True - # Second call — already trimmed, under threshold. - second = _run_sessions_cmd(str(tmp_path)) - assert second["rotated"] is False - - -# ─── setup.sh integration ────────────────────────────────────── - - -class TestSetupShIntegration: - """End-to-end: running setup.sh appends a session entry.""" - - def test_setup_sh_appends_to_session_md(self, tmp_path): - """After running setup.sh, ``session.md`` should have a new entry.""" - repo_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) - env = {**os.environ, "CODELENS_CONFIG_DIR": str(tmp_path)} - result = subprocess.run( - ["bash", os.path.join(repo_root, "setup.sh")], - capture_output=True, text=True, env=env, timeout=120, - ) - # setup.sh may fail on missing tree-sitter (test env), but it - # should still have written a session entry. The exit code - # reflects whether the install succeeded, not whether the log - # was written. - md_path = os.path.join(str(tmp_path), sessions_module.SESSION_MD_FILENAME) - assert os.path.exists(md_path), ( - f"session.md not created.\nstdout: {result.stdout}\nstderr: {result.stderr}" - ) - content = open(md_path, encoding="utf-8").read() - assert "## " in content # at least one session heading - assert "duration_sec" in content - - def test_setup_sh_appends_to_session_json(self, tmp_path): - """After running setup.sh, ``session.json`` should be a valid JSON array.""" - repo_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) - env = {**os.environ, "CODELENS_CONFIG_DIR": str(tmp_path)} - subprocess.run( - ["bash", os.path.join(repo_root, "setup.sh")], - capture_output=True, text=True, env=env, timeout=120, - ) - json_path = os.path.join(str(tmp_path), sessions_module.SESSION_JSON_FILENAME) - assert os.path.exists(json_path) - with open(json_path, encoding="utf-8") as f: - data = json.load(f) - assert isinstance(data, list) - assert len(data) >= 1 - entry = data[-1] - # Every entry should have these required fields. - for key in ("timestamp", "duration_sec", "exit_code", "python", "os", "arch"): - assert key in entry, f"missing key in session entry: {key}" - - def test_multiple_setup_runs_append_multiple_sessions(self, tmp_path): - """Running setup.sh twice should produce 2 session entries.""" - repo_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) - env = {**os.environ, "CODELENS_CONFIG_DIR": str(tmp_path)} - for _ in range(2): - subprocess.run( - ["bash", os.path.join(repo_root, "setup.sh")], - capture_output=True, text=True, env=env, timeout=120, - ) - json_path = os.path.join(str(tmp_path), sessions_module.SESSION_JSON_FILENAME) - with open(json_path, encoding="utf-8") as f: - data = json.load(f) - assert len(data) == 2 - - -# ─── CLI smoke test ──────────────────────────────────────────── - - -class TestCLISmoke: - """End-to-end: invoke ``codelens sessions`` as a real subprocess.""" - - def _run_cli(self, *extra_args): - env = os.environ.copy() - env["PYTHONPATH"] = SCRIPTS_DIR - return subprocess.run( - [sys.executable, os.path.join(SCRIPTS_DIR, "codelens.py"), "sessions", *extra_args], - capture_output=True, text=True, env=env, timeout=30, - ) - - def test_sessions_cli_runs_without_crash(self, tmp_path): - result = self._run_cli("--config-dir", str(tmp_path)) - assert result.returncode == 0 - assert "sessions" in result.stdout.lower() - - def test_sessions_cli_with_existing_log(self, tmp_path): - sessions = [_make_session(0)] - _write_sessions(str(tmp_path), sessions) - result = self._run_cli("--config-dir", str(tmp_path)) - assert result.returncode == 0 - assert "showing 1 of 1" in result.stdout - - def test_sessions_cli_json_mode(self, tmp_path): - sessions = [_make_session(0)] - _write_sessions(str(tmp_path), sessions) - result = self._run_cli("--config-dir", str(tmp_path), "--json") - assert result.returncode == 0 - data = json.loads(result.stdout.strip()) - assert isinstance(data, list) - assert len(data) == 1 - - -# ─── Default config dir ──────────────────────────────────────── - - -class TestDefaultConfigDir: - """When --config-dir is not passed, defaults to ~/.codelens.""" - - def test_default_config_dir_used_when_not_specified(self): - """The result dict should report the default config dir.""" - # Don't actually run the command against the real ~/.codelens — - # just verify the default is picked up correctly by setting - # an env override (sessions.py reads DEFAULT_CONFIG_DIR at - # module import, so we test the constant directly). - assert sessions_module.DEFAULT_CONFIG_DIR.endswith(".codelens") diff --git a/tests/test_snapshot.py b/tests/test_snapshot.py deleted file mode 100644 index 51bb1044..00000000 --- a/tests/test_snapshot.py +++ /dev/null @@ -1,487 +0,0 @@ -"""Round-trip tests for export-snapshot + import-snapshot (issue #12). - -Verifies that exporting a CodeLens graph snapshot and importing it into -a fresh workspace produces a database with identical graph metadata: -same node/edge/symbol/ref/file rows (modulo the autoincrement ``id`` -column, which is intentionally not preserved across export/import). - -Also covers: -- The export ``"Snapshot exported: ... (N.N MB)"`` message format. -- ``--merge`` deduplication (importing the same snapshot twice does not - duplicate rows). -- Version-mismatch validation warnings. -- The constraint that the snapshot contains metadata only (no file - content is stored in any of the exported tables). -""" - -import json -import os -import shutil -import sqlite3 -import sys -import tempfile - -import pytest - -# Add scripts directory to path (matches other test files). -SCRIPT_DIR = os.path.join( - os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "scripts" -) -sys.path.insert(0, SCRIPT_DIR) - -from commands import COMMAND_REGISTRY # noqa: E402 -from snapshot_io import ( # noqa: E402 - DEFAULT_SNAPSHOT_FILENAME, - SNAPSHOT_TABLES, - TABLE_COLUMNS, - default_snapshot_path, - format_size, -) - - -# ─── Fixtures ───────────────────────────────────────────────── - - -@pytest.fixture -def workspace_a(): - """A temp workspace with a populated CodeLens database. - - The DB schema is initialized via PersistentRegistry (which also - creates the graph_* tables), then a small known set of rows is - inserted directly so the round-trip has deterministic data to - compare against. - """ - workspace = tempfile.mkdtemp(prefix="codelens_snap_export_") - try: - _populate_workspace(workspace) - yield workspace - finally: - shutil.rmtree(workspace, ignore_errors=True) - - -@pytest.fixture -def workspace_b(): - """A fresh, empty temp workspace (import target).""" - workspace = tempfile.mkdtemp(prefix="codelens_snap_import_") - try: - yield workspace - finally: - shutil.rmtree(workspace, ignore_errors=True) - - -def _populate_workspace(workspace: str) -> None: - """Initialize the DB schema and insert a small known graph.""" - from persistent_registry import PersistentRegistry - - # Initializing the registry creates all tables (symbols, refs, files, - # analysis_cache, scan_metadata + graph_nodes + graph_edges). - reg = PersistentRegistry(workspace) - reg._connect() - reg.close() - - db_path = os.path.join(workspace, ".codelens", "codelens.db") - conn = sqlite3.connect(db_path) - try: - # graph_nodes — (node_id, node_type, name, file, line, extra_json) - conn.executemany( - "INSERT INTO graph_nodes " - "(node_id, node_type, name, file, line, extra_json) " - "VALUES (?, ?, ?, ?, ?, ?)", - [ - ("src/app.py:10:main", "function", "main", "src/app.py", 10, None), - ("src/app.py:20:helper", "function", "helper", "src/app.py", 20, - json.dumps({"async": True})), - ("src/models.py:5:User", "class", "User", "src/models.py", 5, None), - ], - ) - # graph_edges — (source_id, target_id, edge_type, file, line, confidence, extra_json) - conn.executemany( - "INSERT INTO graph_edges " - "(source_id, target_id, edge_type, file, line, confidence, extra_json) " - "VALUES (?, ?, ?, ?, ?, ?, ?)", - [ - ("src/app.py:10:main", "src/app.py:20:helper", "CALLS", - "src/app.py", 10, 1.0, None), - ("src/app.py:10:main", "src/models.py:5:User", "USES_TYPE", - "src/app.py", 12, 0.9, json.dumps({"to_fn": "User"})), - ], - ) - # symbols — (name, kind, file_path, line_start, line_end, language, signature, hash, extra_json) - conn.executemany( - "INSERT INTO symbols " - "(name, kind, file_path, line_start, line_end, language, signature, hash, extra_json) " - "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", - [ - ("main", "function", "src/app.py", 10, 15, "python", - "def main()", "sha1:abc", None), - ("User", "class", "src/models.py", 5, 30, "python", - "class User", "sha1:def", None), - ], - ) - # refs — (source_symbol, target_symbol, reference_type, source_file, extra_json) - conn.executemany( - "INSERT INTO refs " - "(source_symbol, target_symbol, reference_type, source_file, extra_json) " - "VALUES (?, ?, ?, ?, ?)", - [ - ("src/app.py:10:main", "src/app.py:20:helper", "call", - "src/app.py", None), - ], - ) - # files — (file_path, language, last_modified, content_hash, last_scanned) - conn.executemany( - "INSERT INTO files " - "(file_path, language, last_modified, content_hash, last_scanned) " - "VALUES (?, ?, ?, ?, ?)", - [ - ("src/app.py", "python", 1700000000.0, "sha1:aaa", 1700000000.0), - ("src/models.py", "python", 1700000001.0, "sha1:bbb", 1700000001.0), - ], - ) - # scan_metadata — (id=1, workspace, scan_timestamp, total_files, version) - conn.execute( - "INSERT OR REPLACE INTO scan_metadata " - "(id, workspace, scan_timestamp, total_files, version) " - "VALUES (1, ?, ?, ?, ?)", - (workspace, 1700000000.0, 2, 1), - ) - conn.commit() - finally: - conn.close() - - -def _table_rows(db_path: str, table: str, exclude_id: bool = True): - """Return rows from ``table`` as a sorted list of tuples. - - The autoincrement ``id`` column is excluded by default so rows can be - compared across databases (import assigns fresh ids). Rows are sorted - for deterministic comparison. - """ - conn = sqlite3.connect(db_path) - try: - cols = TABLE_COLUMNS[table] - if exclude_id: - cols = [c for c in cols if c != "id"] - col_list = ", ".join('"' + c + '"' for c in cols) - rows = conn.execute(f'SELECT {col_list} FROM "{table}"').fetchall() - # Normalize: convert each row to a tuple of JSON-stringifiable values - # so json-encoded extra_json strings compare equal. - return sorted(tuple(r[i] for i in range(len(cols))) for r in rows) - finally: - conn.close() - - -# ─── Registration ───────────────────────────────────────────── - - -class TestCommandRegistration: - """Both commands must auto-register via register_command().""" - - def test_export_snapshot_registered(self): - assert "export-snapshot" in COMMAND_REGISTRY - info = COMMAND_REGISTRY["export-snapshot"] - assert info["help"] - assert callable(info["add_args"]) - assert callable(info["execute"]) - - def test_import_snapshot_registered(self): - assert "import-snapshot" in COMMAND_REGISTRY - info = COMMAND_REGISTRY["import-snapshot"] - assert info["help"] - assert callable(info["add_args"]) - assert callable(info["execute"]) - - -# ─── Round-trip ─────────────────────────────────────────────── - - -class TestRoundTrip: - """export → import → query produces the same results.""" - - def test_round_trip_preserves_all_tables(self, workspace_a, workspace_b): - """Export from A, import into B, then every table must match.""" - from commands.export_snapshot import cmd_export_snapshot - from commands.import_snapshot import cmd_import_snapshot - - snapshot_path = os.path.join(workspace_a, ".codelens", - DEFAULT_SNAPSHOT_FILENAME) - - # ── Export from workspace A ── - export_result = cmd_export_snapshot(workspace_a) - assert export_result["status"] == "ok", export_result - assert os.path.exists(snapshot_path), "snapshot file was not written" - - header = export_result["header"] - assert header["node_count"] == 3 - assert header["edge_count"] == 2 - assert header["file_count"] == 2 - assert header["format_version"] == 1 - assert header["codelens_version"] # non-empty - - # ── Import into workspace B (fresh, empty) ── - import_result = cmd_import_snapshot( - workspace_b, input_path=snapshot_path, merge=False - ) - assert import_result["status"] == "ok", import_result - # Same-version import should produce no warnings. - assert import_result["warnings"] == [] - assert import_result["mode"] == "replace" - - db_a = os.path.join(workspace_a, ".codelens", "codelens.db") - db_b = os.path.join(workspace_b, ".codelens", "codelens.db") - assert os.path.exists(db_b), "import did not create a database" - - # ── Every exported table must round-trip identically ── - for table in SNAPSHOT_TABLES: - rows_a = _table_rows(db_a, table) - rows_b = _table_rows(db_b, table) - assert rows_a == rows_b, ( - f"Table '{table}' differs after round-trip:\n" - f" source rows: {rows_a}\n" - f" imported rows: {rows_b}" - ) - - def test_export_message_format(self, workspace_a): - """Export message must match 'Snapshot exported: ()'.""" - from commands.export_snapshot import cmd_export_snapshot - - result = cmd_export_snapshot(workspace_a) - assert result["status"] == "ok" - msg = result["message"] - assert msg.startswith("Snapshot exported: "), msg - # Default display path is workspace-relative. - assert ".codelens/snapshot.codelens.gz" in msg, msg - # Size appears in parentheses with a unit suffix. - assert " (" in msg and msg.rstrip().endswith(")"), msg - assert any(unit in msg for unit in (" B", " KB", " MB", " GB")), msg - # size_human must match the parenthesized portion. - assert f"({result['size_human']})" in msg, msg - - def test_graph_schema_command_matches_after_round_trip( - self, workspace_a, workspace_b - ): - """The graph-schema command must report identical stats after import.""" - from commands.export_snapshot import cmd_export_snapshot - from commands.import_snapshot import cmd_import_snapshot - from commands.graph_schema import get_graph_schema - - snapshot_path = os.path.join(workspace_a, ".codelens", - DEFAULT_SNAPSHOT_FILENAME) - assert cmd_export_snapshot(workspace_a)["status"] == "ok" - assert cmd_import_snapshot( - workspace_b, input_path=snapshot_path - )["status"] == "ok" - - schema_a = get_graph_schema(workspace_a) - schema_b = get_graph_schema(workspace_b) - # Compare the queryable graph shape (ignore workspace path). - for key in ("nodes", "edges", "node_types", "edge_types", "indexes"): - assert schema_a[key] == schema_b[key], ( - f"graph-schema '{key}' differs: {schema_a[key]} vs {schema_b[key]}" - ) - - -# ─── Merge mode ─────────────────────────────────────────────── - - -class TestMergeMode: - """--merge deduplicates nodes/edges by their natural key.""" - - def test_import_twice_replace_doubles_then_merge_noop(self, workspace_a, workspace_b): - """Replace import reproduces source; a second merge import adds nothing.""" - from commands.export_snapshot import cmd_export_snapshot - from commands.import_snapshot import cmd_import_snapshot - - snapshot_path = os.path.join(workspace_a, ".codelens", - DEFAULT_SNAPSHOT_FILENAME) - assert cmd_export_snapshot(workspace_a)["status"] == "ok" - - # First import (replace) into empty B. - r1 = cmd_import_snapshot(workspace_b, input_path=snapshot_path, merge=False) - assert r1["status"] == "ok" - assert r1["total_inserted"] > 0 - assert r1["total_skipped"] == 0 - - db_b = os.path.join(workspace_b, ".codelens", "codelens.db") - - # Row counts after first import. - counts_after_first = { - t: len(_table_rows(db_b, t)) for t in SNAPSHOT_TABLES - } - - # Second import with --merge: every row's natural key already - # exists, so all rows must be skipped (0 inserted). - r2 = cmd_import_snapshot(workspace_b, input_path=snapshot_path, merge=True) - assert r2["status"] == "ok" - assert r2["mode"] == "merge" - assert r2["total_inserted"] == 0, ( - f"merge re-import should insert 0 rows, got {r2['total_inserted']}" - ) - assert r2["total_skipped"] > 0 - - # Row counts must be unchanged after the merge re-import. - counts_after_second = { - t: len(_table_rows(db_b, t)) for t in SNAPSHOT_TABLES - } - assert counts_after_first == counts_after_second, ( - f"merge re-import changed row counts: " - f"{counts_after_first} -> {counts_after_second}" - ) - - def test_merge_combines_disjoint_graphs(self, workspace_a, workspace_b): - """Merging a snapshot with extra nodes adds only the new ones.""" - from commands.export_snapshot import cmd_export_snapshot - from commands.import_snapshot import cmd_import_snapshot - - snapshot_path = os.path.join(workspace_a, ".codelens", - DEFAULT_SNAPSHOT_FILENAME) - assert cmd_export_snapshot(workspace_a)["status"] == "ok" - - # Seed workspace B with ONE node that is NOT in the snapshot. - from persistent_registry import PersistentRegistry - reg = PersistentRegistry(workspace_b) - reg._connect() - reg.close() - db_b = os.path.join(workspace_b, ".codelens", "codelens.db") - conn = sqlite3.connect(db_b) - try: - conn.execute( - "INSERT INTO graph_nodes " - "(node_id, node_type, name, file, line, extra_json) " - "VALUES (?, ?, ?, ?, ?, ?)", - ("src/extra.py:1:only_in_b", "function", "only_in_b", - "src/extra.py", 1, None), - ) - conn.commit() - finally: - conn.close() - - # Merge-import the snapshot — the 3 snapshot nodes are new, - # so they're inserted; the existing B-only node survives. - r = cmd_import_snapshot(workspace_b, input_path=snapshot_path, merge=True) - assert r["status"] == "ok" - assert r["mode"] == "merge" - # All 3 graph_nodes from the snapshot are new relative to B. - assert r["inserted"]["graph_nodes"] == 3 - assert r["skipped"]["graph_nodes"] == 0 - - # Final node count = 1 (pre-existing) + 3 (merged in) = 4. - final_nodes = len(_table_rows(db_b, "graph_nodes")) - assert final_nodes == 4, f"expected 4 nodes after merge, got {final_nodes}" - - -# ─── Validation ─────────────────────────────────────────────── - - -class TestValidation: - """Version-mismatch warnings and error handling.""" - - def test_version_mismatch_warns(self, workspace_a, workspace_b): - """A snapshot with a different codelens_version must warn on import.""" - from commands.export_snapshot import cmd_export_snapshot - from commands.import_snapshot import cmd_import_snapshot - from snapshot_io import read_snapshot, write_snapshot - - snapshot_path = os.path.join(workspace_a, ".codelens", - DEFAULT_SNAPSHOT_FILENAME) - assert cmd_export_snapshot(workspace_a)["status"] == "ok" - - # Tamper with the version to simulate a cross-version import. - snap = read_snapshot(snapshot_path) - snap["header"]["codelens_version"] = "0.0.0-mock" - write_snapshot(snap, snapshot_path) - - r = cmd_import_snapshot( - workspace_b, input_path=snapshot_path, merge=False - ) - assert r["status"] == "ok" - assert any("0.0.0-mock" in w for w in r["warnings"]), r["warnings"] - assert any("different" in w.lower() or "version" in w.lower() - for w in r["warnings"]), r["warnings"] - - def test_import_missing_snapshot_returns_error(self, workspace_b): - """Importing a non-existent snapshot must return a clean error.""" - from commands.import_snapshot import cmd_import_snapshot - - bogus = os.path.join(workspace_b, ".codelens", "nope.codelens.gz") - r = cmd_import_snapshot(workspace_b, input_path=bogus) - assert r["status"] == "error" - assert "not found" in r["error"].lower() or "nope" in r["error"] - - def test_export_missing_db_returns_error(self, workspace_b): - """Exporting with no database present must return a clean error.""" - from commands.export_snapshot import cmd_export_snapshot - - r = cmd_export_snapshot(workspace_b) - assert r["status"] == "error" - assert "not found" in r["error"].lower() or "scan" in r["error"].lower() - - -# ─── Constraint: no file content ────────────────────────────── - - -class TestNoFileContent: - """Issue #12 constraint: the snapshot must NOT contain file content.""" - - def test_snapshot_contains_no_content_blobs(self, workspace_a): - """No exported column may hold raw source content. - - The ``files`` table holds ``content_hash`` (a digest), not bytes. - ``symbols`` holds ``signature`` (a parsed signature string), not - source. This test asserts the snapshot's data shape stays free - of any obvious content-bearing column by checking the exported - column names against a denylist. - """ - from commands.export_snapshot import cmd_export_snapshot - from snapshot_io import read_snapshot, default_snapshot_path - - result = cmd_export_snapshot(workspace_a) - assert result["status"] == "ok" - - snap = read_snapshot(default_snapshot_path(workspace_a)) - data = snap["data"] - - # Forbidden column names — if any appear, file content is leaking - # into the snapshot and the issue #12 constraint is violated. - forbidden = {"content", "source", "body", "text", "raw", "bytes", "code"} - for table, tbl in data.items(): - cols = set(tbl.get("columns", [])) - leaked = cols & forbidden - assert not leaked, ( - f"Table '{table}' exports content-bearing columns {leaked} " - f"— issue #12 forbids file content in snapshots." - ) - - # Spot-check: the files table must export content_hash, not content. - assert "content_hash" in data["files"]["columns"] - assert "content" not in data["files"]["columns"] - # And no row value may be a multi-KB blob (sanity cap). - for table in SNAPSHOT_TABLES: - for row in data[table]["rows"]: - for val in row: - if isinstance(val, str): - assert len(val) < 8192, ( - f"Suspiciously large string value ({len(val)} chars) " - f"in table '{table}' — possible content leak." - ) - - -# ─── format_size helper ─────────────────────────────────────── - - -class TestFormatSize: - def test_bytes(self): - assert format_size(512) == "512 B" - - def test_kilobytes(self): - assert format_size(1500) == "1.5 KB" - - def test_megabytes(self): - assert format_size(1250000) == "1.2 MB" - - def test_zero(self): - assert format_size(0) == "0 B" - - -if __name__ == "__main__": - pytest.main([__file__, "-v", "--tb=short"])