feat: edit syntax-health checks + anchored str_replace + opt-in linter bridge#517
feat: edit syntax-health checks + anchored str_replace + opt-in linter bridge#517justrach wants to merge 14 commits into
Conversation
…health) codedb_edit was a blind line-splice that returned "edit applied" even when the result no longer parsed. The deep-SWE benchmark showed Sonnet-4.6 regenerates multi-line bracketed regions (import blocks, function signatures) and leaves an orphaned/duplicated delimiter, shipping files that don't import — codedb caught none of it. Add a dependency-free, language-aware delimiter-balance scan (scanDelimiterBalance / describeHealth) wired into EditResult.health and surfaced by handleEdit, so a mis-spliced edit is reported back to the agent. Advisory only; skips non-code languages and bails when unsure. On the real broken benchmark patches: 4/4 delimiter-class breaks flagged, 0 false positives on 11 good files. Tests: src/test_core.zig "edit-health: ..." (incl. faithful httpx/narwhals repros). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The other half of the benchmark breakage: codedb's narwhals patch regenerated a `from ... import (...)` block and dropped names (ExprNode/ExprKind/unstable) that were still referenced — a NameError at import. The file stays syntactically valid, so the delimiter scan cannot see it. describeHealth now also diffs Python imports before vs after the edit and flags any binding that was removed yet is still used (whole-word, skipping import/comment lines; names re-imported elsewhere are not flagged). On the real codedb narwhals series.py it reports: "'ExprNode' was removed from the imports but is still used at line 109". Combined with the delimiter scan, both of codedb's broken benchmark patches (httpx, narwhals) now surface a specific, actionable warning. Tests: src/test_core.zig — dropped-but-used flagged, unused-drop clean, re-import not flagged. zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The benchmark breakages all came from range-based whole-block replaces that mis-targeted their boundaries. Add an anchored op: old_string/new_string does an exact byte splice of the unique occurrence (errors PatternNotFound / PatternNotUnique otherwise), so an edit cannot silently swallow surrounding lines. Factor applyEdit's write/record tail into finalizeEdit, shared by the line ops and the new anchored path; the post-edit health scan runs on both. handleEdit accepts op="str_replace" with old_string/new_string, and the tool schema advertises it as the preferred edit with the health-warning contract. Tests: src/test_core.zig "edit-str_replace: ..." (exact update, not-found, non-unique, health-on-anchor-break). zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ation) Foundation for the external-linter bridge: the in-process heuristics (Tier-0) are the instant always-on floor; a fast external linter is the optional rich ceiling that catches real diagnostics (undefined names, type/lint errors) the heuristics can't. This commit lands the pure, side-effect-free decision layer. src/linter.zig: - linterFor(language): registry of the preferred single-file linter per language. ruff (`ruff check <f> --output-format json`) and biome (`biome lint --reporter=json <f>`) CLIs confirmed via deepwiki; zig ast-check / gofmt -e / ruby -c / php -l use the toolchain's own check. Languages without a good single-file checker return null and use the heuristics. - LinterSession: per-connection availability cache. A language that is missing, fails to install, or crashes is marked .unavailable and never retried this session, so a flaky/absent tool can never tax the edit path — it silently falls back to Tier-0. - toolOnPath(): libc access(2) PATH probe (std.fs/std.posix absent in this Io build). Next (not in this commit): off-hot-path async execution, optional auto-install (ruff/biome/…), and wiring diagnostics into the edit result / a codedb_diagnostics tool. Tests: src/test_core.zig "linter: ..." (registry, sticky fallback, PATH probe). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Refine the install model per design discussion: codedb never installs linters or prompts at edit time. Instead the opt-in happens at codedb install / `codedb update` time and the choice is remembered; the MCP server only reads it. LinterSession gains an `enabled` flag (default false) and shouldTry() now requires it — so with no opt-in (or a decline) the edit path uses only the Tier-0 heuristics and carries zero linter cost. Install/update-time prompt + persistence + actual install wiring are the next step. Tests: src/test_core.zig — disabled session never tries; opted-in session respects the sticky per-language fallback. zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New src/linter_pref.zig holds the remembered linter opt-in (Pref: unset/off/on) as a global user-level state file next to the auto-update stamp — NOT in per-checkout .codedbrc. read()/write() use the same std.Io idioms as the auto-update stamp; readAt()/writeAt()/parseBody() are path-injectable + pure so tests never touch the real ~/.codedb. Absent/garbage -> unset -> external linters off (Tier-0 heuristics). The MCP server will read() this at startup to seed LinterSession.enabled; the install / `codedb update` flow write()s it. Tests: src/test_core.zig "linter-pref: ..." (parse states, write/read round-trip, missing-file-is-unset, unset-write-is-noop). zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
runCheck(allocator, language, abs_path) spawns the registered linter via
cio.runCapture (substituting FILE_TOKEN), classifies the exit Term (non-.Exited =>
LinterCrashed so the session disables it), and folds output into a short owned
summary or null when clean. JSON tools parse via std.json: summarizeRuffJson (array
of {code,message,location.row}) and summarizeBiomeJson ({diagnostics:[{category}]})
are pure + public for testing; exit-code tools (zig ast-check/gofmt/ruby/php) use
code + first stderr line. installFor() returns the install argv for the two
installable tools (uv tool install ruff; npm i -g @biomejs/biome), null otherwise.
Must run off the hot path (spawns a process). 256KB output cap.
Tests: src/test_core.zig — installFor mapping, ruff/biome summary + clean cases,
runCheck NoLinter. zig build test green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…time) Add the consent flow that records the linter preference. cio.readLine() reads one stdin line (CLI only). linter.maybePromptAndInstall() — called from update.zig run() after a successful update and on already-up-to-date — is a no-op unless BOTH stdin and stdout are TTYs (never fires in the MCP server, CI, or curl|bash) and the pref is still unset (asked at most once). On yes it installs ruff/biome if missing (via their package managers, skipping absent ones) and writes pref=on; on no it writes pref=off. The MCP server never prompts — it only reads the remembered pref. Also fixes a latent break: cio.readLine used std.mem.trimRight (renamed to trimEnd in this build). It was invisible to `zig build test` because maybePromptAndInstall is reached only from the exe, so Zig's lazy analysis skipped it — only the CLI build (exercised by issue-150) caught it. Added a test that references the entrypoint so the prompt path is analysed by `zig build test` too. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…results The delivery substrate: holds the latest linter summary per file so the edit/read handlers can piggyback it and a codedb_diagnostics tool can pull it. Mutex-guarded (producer is a detached worker, consumers are request handlers), bounded to MAX=16 with LRU eviction by timestamp, FRESH_MS staleness window. tryBeginWork() coalesces per path and skips already-fresh content; an `inflight` counter + drain() let the owner wait for workers before deinit (no use-after-free on connection teardown). appendIfFresh() matches (path,hash) for piggyback; appendLatest() serves the pull tool. (std.Thread.sleep absent in this build -> libc usleep for the drain wait.) Tests: src/test_core.zig "diag-cache: ..." (hash-matched piggyback, latest, coalesce semantics, bounded+leak-free eviction via testing.allocator). zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…_diagnostics) Connects the Tier-1 pieces into the live MCP server: - ProjectCache (per-connection, created once in run()) now holds a LinterSession seeded from the persisted preference (linter_pref.read at startup) and a c_allocator-backed DiagnosticsCache; deinit() drains in-flight workers before freeing, so a detached worker can never touch freed state. - After a successful non-dry-run codedb_edit, if the user opted in and the language has an available linter, a detached LintJob runs runCheck() OFF the hot path (its own c_allocator + an owned path copy), then store()s the summary or marks the language unavailable (sticky) on failure. tryBeginWork() coalesces per path. handleEdit also piggybacks any cached summary for the new content. - New codedb_diagnostics tool (Tool enum + tools_list + dispatch + handler) pulls the latest summary for a path. LinterSession.statuses is now mutex-guarded (worker writes mark(); main thread reads shouldTry()). The edit hot path stays heuristics-only + a fire-and-forget spawn; the linter result is delivered out of band. With no opt-in it's fully dormant. Verified: zig build + zig build test green; MCP stdio smoke shows the tool advertised, edit health inline, the worker hitting the missing-tool fallback without crashing, codedb_diagnostics responding, and clean shutdown. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…talls At the `codedb update` opt-in (TTY-gated, asked once), if the user opts into linters: prefer nanobrew (justrach's fast Zig package manager — installs Homebrew bottles in ~1s) when present; on macOS/Linux, if nb is absent, ASK once whether to install it (curl installer) with a one-line reason; otherwise fall back to the per-tool installers (uv for ruff, npm for biome). Never auto-installs nanobrew silently — it's a prompt. nanobrewPath() finds nb on PATH or at the canonical /opt/nanobrew/prefix/bin (its installer only PATHs that for new shells). When nanobrew is used it installs ruff + biome in one shot (`nb install ruff biome`). Validated live: nb install ruff biome -> ruff 0.15.12 + biome 2.4.16 in ~2.9s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…st when off) Address the latency concern: when the user hasn't opted into linters (the default), handleEdit now does nothing extra after a write — the whole Tier-1 block (cache appendIfFresh, detectLanguage, shouldTry, thread spawn) is skipped via the read-only `enabled` flag. When on, the linter still runs on a detached thread after the edit response is built, so it never adds latency to the edit itself. The MCP server never installs anything — installation happens only at `codedb update` time; at lint time the worker only CALLS the linter (and marks it unavailable for the session if absent). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
linterFor() now covers C/C++ (cppcheck), shell (shellcheck, JSON), Kotlin (ktlint), and Swift (swiftlint) on top of the existing python/js/ts/zig/go/ruby/php. Zig stays `zig ast-check` (ships with the compiler). shellcheck gets a JSON summarizer (summarizeShellcheckJson -> "shellcheck N issues - SCxxxx <msg> at L<n>"); the others use the exit-code path. Fix: exit-code summaries now strip ANSI escapes (appendFirstLineStripped) — `zig ast-check` colours its output even to a pipe, which otherwise put raw ESC bytes into the JSON-RPC string. The install set grows to ruff+biome+shellcheck+cppcheck (via nanobrew one-shot, or per-tool brew/uv/npm fallback); ktlint (pulls a JDK) and swiftlint (niche) are registry-only — used if present, not auto-installed. Tests: registry mapping, installFor mapping, summarizeShellcheckJson, ANSI strip. zig build + zig build test green. Live MCP smoke confirmed: editing a .zig file -> "zig check failed - ...:2:16: error: expected ';'"; a .sh file -> shellcheck SC2154/SC2086. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f19716e209
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (cio.readLine(&b2)) |l2| { | ||
| if (l2.len > 0 and (l2[0] == 'y' or l2[0] == 'Y')) { | ||
| out.print(" installing nanobrew...\n", .{}) catch {}; | ||
| if (cio.runCapture(.{ .allocator = allocator, .argv = &.{ "sh", "-c", "curl -fsSL https://nanobrew.trilok.ai/install | bash" }, .max_output_bytes = 1024 * 1024 })) |res| { |
There was a problem hiding this comment.
Avoid piping the downloaded installer into bash
When a TTY user opts into nanobrew here, codedb update executes whatever bytes are returned by https://nanobrew.trilok.ai/install directly via sh -c ... | bash, with no pinned version, checksum, signature, or reviewed script contents. This is in the installer/update path and violates the repo guideline from AGENTS.md to flag installer scripts that execute untrusted code or skip verification; a transient CDN/DNS/TLS endpoint compromise would become arbitrary local code execution for users who answer yes.
Useful? React with 👍 / 👎.
| cache.diag.endWork(path); | ||
| return; | ||
| }; | ||
| t.detach(); |
There was a problem hiding this comment.
Join linter workers before freeing the session cache
With external linters enabled, spawnLintWorker detaches a thread that keeps a pointer to the stack-owned ProjectCache; however ProjectCache.deinit() only waits via DiagnosticsCache.drain() for about 2 seconds before freeing the cache and returning. If a linter process is slow or hangs and the MCP connection shuts down during that window, the detached worker can later execute job.cache.linter.mark() or job.cache.diag.store()/endWork() against freed stack/cache memory, causing a use-after-free or crash. The worker needs to be joined, cancelled, or otherwise prevented from outliving ProjectCache.
Useful? React with 👍 / 👎.
# Conflicts: # src/edit.zig
Benchmark Regression ReportThresholds: 10.00% and 50,000 ns absolute delta
|
What
Three layers that make
codedb_editsafer, plus an opt-in real-linter bridge. Everything is advisory and off the edit hot path; the external linters are off unless you opt in atcodedb update, so default behavior + latency are unchanged.1. Post-edit syntax health (in-process, always on, microseconds)
After every edit, codedb scans the result and warns if the edit broke the file:
)from regenerating a multi-line import block),NameErrorwaiting to happen).2. Anchored
str_replacecodedb_editgainsop=str_replace(old_string/new_string) — an exact, unique byte splice that can't mis-target surrounding lines the way a range replace can. Advertised as the preferred edit.3. External-linter bridge (opt-in, off the hot path)
When enabled, codedb runs a real linter on the edited file on a detached thread (never blocking the edit) and serves results via a new
codedb_diagnosticstool (+ piggyback on the next edit). Registry: ruff (Python), biome (JS/TS/CSS),zig ast-check, gofmt (Go),ruby -c,php -l, cppcheck (C/C++), shellcheck (shell), ktlint (Kotlin), swiftlint (Swift). Opt-in is asked once atcodedb update(TTY only; the MCP server never prompts/installs), prefers nanobrew when present, and falls back to uv/npm/brew. A missing/failed linter is disabled for the session, so it silently falls back to the in-process checks.Why
Functional validation of the deep-SWE benchmark found the codedb-driven agent shipped patches that don't even import (httpx: a duplicate
); narwhals: a dropped-but-used import producing aNameError). Layer 1 catches exactly that class at edit time instead of at runtime.Safety / latency
enabled).Testing
zig build testgreen. Live MCP smoke verified end-to-end on macOS (ruff F401/F821,zig ast-check, shellcheck SC2154/SC2086) and Linux (cross-compiled staticaarch64-linux-musl, run under Applecontainer), including the missing-tool fallback.Needs reconciliation with current
mainThis branch was cut from
mainbefore 0.2.5822. Since thenmainrewroteapplyEdit(path resolution viaexplorer.root_dir, two fast-path replace blocks, a newEditResult.changedfield) and grewmcp.zigsubstantially. Rebasing onto currentmainproduces a 2-hunk conflict insrc/edit.zigonly —mcp.zig/update.zigauto-merge. Reconciliation plan: re-applystr_replace+ the health check ontomain'sapplyEdit, route all write paths through onefinalizeEdit, and keepmain's.changed/ fast-path optimizations. Happy to push that rebase on request.Commits
13 focused commits: P0a delimiter scan, P0b dropped-import scan, P2 anchored str_replace, linter registry/preference/exec/prompt, DiagnosticsCache, daemon wiring +
codedb_diagnostics, nanobrew install option, and shellcheck/cppcheck/ktlint/swiftlint + ANSI-strip.🤖 Generated with Claude Code