feat: edit syntax-health checks + anchored str_replace + opt-in linter bridge by justrach · Pull Request #517 · justrach/codedb

justrach · 2026-05-31T09:05:10Z

What

Three layers that make codedb_edit safer, plus an opt-in real-linter bridge. Everything is advisory and off the edit hot path; the external linters are off unless you opt in at codedb update, so default behavior + latency are unchanged.

1. Post-edit syntax health (in-process, always on, microseconds)
After every edit, codedb scans the result and warns if the edit broke the file:

unbalanced/duplicated delimiter (e.g. a stray ) from regenerating a multi-line import block),
an import name removed while still referenced (a NameError waiting to happen).

2. Anchored str_replace
codedb_edit gains op=str_replace (old_string/new_string) — an exact, unique byte splice that can't mis-target surrounding lines the way a range replace can. Advertised as the preferred edit.

3. External-linter bridge (opt-in, off the hot path)
When enabled, codedb runs a real linter on the edited file on a detached thread (never blocking the edit) and serves results via a new codedb_diagnostics tool (+ piggyback on the next edit). Registry: ruff (Python), biome (JS/TS/CSS), zig ast-check, gofmt (Go), ruby -c, php -l, cppcheck (C/C++), shellcheck (shell), ktlint (Kotlin), swiftlint (Swift). Opt-in is asked once at codedb update (TTY only; the MCP server never prompts/installs), prefers nanobrew when present, and falls back to uv/npm/brew. A missing/failed linter is disabled for the session, so it silently falls back to the in-process checks.

Why

Functional validation of the deep-SWE benchmark found the codedb-driven agent shipped patches that don't even import (httpx: a duplicate ); narwhals: a dropped-but-used import producing a NameError). Layer 1 catches exactly that class at edit time instead of at runtime.

Safety / latency

Off by default; when off, the edit path does zero extra work (the Tier-1 block is gated behind enabled).
When on, the linter runs on a detached thread after the edit response is built — no added edit latency. A bounded diagnostics cache drains in-flight workers before teardown (no use-after-free).
The always-on in-process check is a single ~microsecond byte scan.

Testing

zig build test green. Live MCP smoke verified end-to-end on macOS (ruff F401/F821, zig ast-check, shellcheck SC2154/SC2086) and Linux (cross-compiled static aarch64-linux-musl, run under Apple container), including the missing-tool fallback.

Needs reconciliation with current `main`

This branch was cut from main before 0.2.5822. Since then main rewrote applyEdit (path resolution via explorer.root_dir, two fast-path replace blocks, a new EditResult.changed field) and grew mcp.zig substantially. Rebasing onto current main produces a 2-hunk conflict in src/edit.zig only — mcp.zig/update.zig auto-merge. Reconciliation plan: re-apply str_replace + the health check onto main's applyEdit, route all write paths through one finalizeEdit, and keep main's .changed / fast-path optimizations. Happy to push that rebase on request.

Commits

13 focused commits: P0a delimiter scan, P0b dropped-import scan, P2 anchored str_replace, linter registry/preference/exec/prompt, DiagnosticsCache, daemon wiring + codedb_diagnostics, nanobrew install option, and shellcheck/cppcheck/ktlint/swiftlint + ANSI-strip.

🤖 Generated with Claude Code

…health) codedb_edit was a blind line-splice that returned "edit applied" even when the result no longer parsed. The deep-SWE benchmark showed Sonnet-4.6 regenerates multi-line bracketed regions (import blocks, function signatures) and leaves an orphaned/duplicated delimiter, shipping files that don't import — codedb caught none of it. Add a dependency-free, language-aware delimiter-balance scan (scanDelimiterBalance / describeHealth) wired into EditResult.health and surfaced by handleEdit, so a mis-spliced edit is reported back to the agent. Advisory only; skips non-code languages and bails when unsure. On the real broken benchmark patches: 4/4 delimiter-class breaks flagged, 0 false positives on 11 good files. Tests: src/test_core.zig "edit-health: ..." (incl. faithful httpx/narwhals repros). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The other half of the benchmark breakage: codedb's narwhals patch regenerated a `from ... import (...)` block and dropped names (ExprNode/ExprKind/unstable) that were still referenced — a NameError at import. The file stays syntactically valid, so the delimiter scan cannot see it. describeHealth now also diffs Python imports before vs after the edit and flags any binding that was removed yet is still used (whole-word, skipping import/comment lines; names re-imported elsewhere are not flagged). On the real codedb narwhals series.py it reports: "'ExprNode' was removed from the imports but is still used at line 109". Combined with the delimiter scan, both of codedb's broken benchmark patches (httpx, narwhals) now surface a specific, actionable warning. Tests: src/test_core.zig — dropped-but-used flagged, unused-drop clean, re-import not flagged. zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The benchmark breakages all came from range-based whole-block replaces that mis-targeted their boundaries. Add an anchored op: old_string/new_string does an exact byte splice of the unique occurrence (errors PatternNotFound / PatternNotUnique otherwise), so an edit cannot silently swallow surrounding lines. Factor applyEdit's write/record tail into finalizeEdit, shared by the line ops and the new anchored path; the post-edit health scan runs on both. handleEdit accepts op="str_replace" with old_string/new_string, and the tool schema advertises it as the preferred edit with the health-warning contract. Tests: src/test_core.zig "edit-str_replace: ..." (exact update, not-found, non-unique, health-on-anchor-break). zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ation) Foundation for the external-linter bridge: the in-process heuristics (Tier-0) are the instant always-on floor; a fast external linter is the optional rich ceiling that catches real diagnostics (undefined names, type/lint errors) the heuristics can't. This commit lands the pure, side-effect-free decision layer. src/linter.zig: - linterFor(language): registry of the preferred single-file linter per language. ruff (`ruff check <f> --output-format json`) and biome (`biome lint --reporter=json <f>`) CLIs confirmed via deepwiki; zig ast-check / gofmt -e / ruby -c / php -l use the toolchain's own check. Languages without a good single-file checker return null and use the heuristics. - LinterSession: per-connection availability cache. A language that is missing, fails to install, or crashes is marked .unavailable and never retried this session, so a flaky/absent tool can never tax the edit path — it silently falls back to Tier-0. - toolOnPath(): libc access(2) PATH probe (std.fs/std.posix absent in this Io build). Next (not in this commit): off-hot-path async execution, optional auto-install (ruff/biome/…), and wiring diagnostics into the edit result / a codedb_diagnostics tool. Tests: src/test_core.zig "linter: ..." (registry, sticky fallback, PATH probe). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Refine the install model per design discussion: codedb never installs linters or prompts at edit time. Instead the opt-in happens at codedb install / `codedb update` time and the choice is remembered; the MCP server only reads it. LinterSession gains an `enabled` flag (default false) and shouldTry() now requires it — so with no opt-in (or a decline) the edit path uses only the Tier-0 heuristics and carries zero linter cost. Install/update-time prompt + persistence + actual install wiring are the next step. Tests: src/test_core.zig — disabled session never tries; opted-in session respects the sticky per-language fallback. zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

New src/linter_pref.zig holds the remembered linter opt-in (Pref: unset/off/on) as a global user-level state file next to the auto-update stamp — NOT in per-checkout .codedbrc. read()/write() use the same std.Io idioms as the auto-update stamp; readAt()/writeAt()/parseBody() are path-injectable + pure so tests never touch the real ~/.codedb. Absent/garbage -> unset -> external linters off (Tier-0 heuristics). The MCP server will read() this at startup to seed LinterSession.enabled; the install / `codedb update` flow write()s it. Tests: src/test_core.zig "linter-pref: ..." (parse states, write/read round-trip, missing-file-is-unset, unset-write-is-noop). zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

runCheck(allocator, language, abs_path) spawns the registered linter via cio.runCapture (substituting FILE_TOKEN), classifies the exit Term (non-.Exited => LinterCrashed so the session disables it), and folds output into a short owned summary or null when clean. JSON tools parse via std.json: summarizeRuffJson (array of {code,message,location.row}) and summarizeBiomeJson ({diagnostics:[{category}]}) are pure + public for testing; exit-code tools (zig ast-check/gofmt/ruby/php) use code + first stderr line. installFor() returns the install argv for the two installable tools (uv tool install ruff; npm i -g @biomejs/biome), null otherwise. Must run off the hot path (spawns a process). 256KB output cap. Tests: src/test_core.zig — installFor mapping, ruff/biome summary + clean cases, runCheck NoLinter. zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…time) Add the consent flow that records the linter preference. cio.readLine() reads one stdin line (CLI only). linter.maybePromptAndInstall() — called from update.zig run() after a successful update and on already-up-to-date — is a no-op unless BOTH stdin and stdout are TTYs (never fires in the MCP server, CI, or curl|bash) and the pref is still unset (asked at most once). On yes it installs ruff/biome if missing (via their package managers, skipping absent ones) and writes pref=on; on no it writes pref=off. The MCP server never prompts — it only reads the remembered pref. Also fixes a latent break: cio.readLine used std.mem.trimRight (renamed to trimEnd in this build). It was invisible to `zig build test` because maybePromptAndInstall is reached only from the exe, so Zig's lazy analysis skipped it — only the CLI build (exercised by issue-150) caught it. Added a test that references the entrypoint so the prompt path is analysed by `zig build test` too. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…results The delivery substrate: holds the latest linter summary per file so the edit/read handlers can piggyback it and a codedb_diagnostics tool can pull it. Mutex-guarded (producer is a detached worker, consumers are request handlers), bounded to MAX=16 with LRU eviction by timestamp, FRESH_MS staleness window. tryBeginWork() coalesces per path and skips already-fresh content; an `inflight` counter + drain() let the owner wait for workers before deinit (no use-after-free on connection teardown). appendIfFresh() matches (path,hash) for piggyback; appendLatest() serves the pull tool. (std.Thread.sleep absent in this build -> libc usleep for the drain wait.) Tests: src/test_core.zig "diag-cache: ..." (hash-matched piggyback, latest, coalesce semantics, bounded+leak-free eviction via testing.allocator). zig build test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…_diagnostics) Connects the Tier-1 pieces into the live MCP server: - ProjectCache (per-connection, created once in run()) now holds a LinterSession seeded from the persisted preference (linter_pref.read at startup) and a c_allocator-backed DiagnosticsCache; deinit() drains in-flight workers before freeing, so a detached worker can never touch freed state. - After a successful non-dry-run codedb_edit, if the user opted in and the language has an available linter, a detached LintJob runs runCheck() OFF the hot path (its own c_allocator + an owned path copy), then store()s the summary or marks the language unavailable (sticky) on failure. tryBeginWork() coalesces per path. handleEdit also piggybacks any cached summary for the new content. - New codedb_diagnostics tool (Tool enum + tools_list + dispatch + handler) pulls the latest summary for a path. LinterSession.statuses is now mutex-guarded (worker writes mark(); main thread reads shouldTry()). The edit hot path stays heuristics-only + a fire-and-forget spawn; the linter result is delivered out of band. With no opt-in it's fully dormant. Verified: zig build + zig build test green; MCP stdio smoke shows the tool advertised, edit health inline, the worker hitting the missing-tool fallback without crashing, codedb_diagnostics responding, and clean shutdown. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…talls At the `codedb update` opt-in (TTY-gated, asked once), if the user opts into linters: prefer nanobrew (justrach's fast Zig package manager — installs Homebrew bottles in ~1s) when present; on macOS/Linux, if nb is absent, ASK once whether to install it (curl installer) with a one-line reason; otherwise fall back to the per-tool installers (uv for ruff, npm for biome). Never auto-installs nanobrew silently — it's a prompt. nanobrewPath() finds nb on PATH or at the canonical /opt/nanobrew/prefix/bin (its installer only PATHs that for new shells). When nanobrew is used it installs ruff + biome in one shot (`nb install ruff biome`). Validated live: nb install ruff biome -> ruff 0.15.12 + biome 2.4.16 in ~2.9s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…st when off) Address the latency concern: when the user hasn't opted into linters (the default), handleEdit now does nothing extra after a write — the whole Tier-1 block (cache appendIfFresh, detectLanguage, shouldTry, thread spawn) is skipped via the read-only `enabled` flag. When on, the linter still runs on a detached thread after the edit response is built, so it never adds latency to the edit itself. The MCP server never installs anything — installation happens only at `codedb update` time; at lint time the worker only CALLS the linter (and marks it unavailable for the session if absent). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

linterFor() now covers C/C++ (cppcheck), shell (shellcheck, JSON), Kotlin (ktlint), and Swift (swiftlint) on top of the existing python/js/ts/zig/go/ruby/php. Zig stays `zig ast-check` (ships with the compiler). shellcheck gets a JSON summarizer (summarizeShellcheckJson -> "shellcheck N issues - SCxxxx <msg> at L<n>"); the others use the exit-code path. Fix: exit-code summaries now strip ANSI escapes (appendFirstLineStripped) — `zig ast-check` colours its output even to a pipe, which otherwise put raw ESC bytes into the JSON-RPC string. The install set grows to ruff+biome+shellcheck+cppcheck (via nanobrew one-shot, or per-tool brew/uv/npm fallback); ktlint (pulls a JDK) and swiftlint (niche) are registry-only — used if present, not auto-installed. Tests: registry mapping, installFor mapping, summarizeShellcheckJson, ANSI strip. zig build + zig build test green. Live MCP smoke confirmed: editing a .zig file -> "zig check failed - ...:2:16: error: expected ';'"; a .sh file -> shellcheck SC2154/SC2086. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f19716e209

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-31T09:07:38Z

+        if (cio.readLine(&b2)) |l2| {
+            if (l2.len > 0 and (l2[0] == 'y' or l2[0] == 'Y')) {
+                out.print("  installing nanobrew...\n", .{}) catch {};
+                if (cio.runCapture(.{ .allocator = allocator, .argv = &.{ "sh", "-c", "curl -fsSL https://nanobrew.trilok.ai/install | bash" }, .max_output_bytes = 1024 * 1024 })) |res| {


Avoid piping the downloaded installer into bash

When a TTY user opts into nanobrew here, codedb update executes whatever bytes are returned by https://nanobrew.trilok.ai/install directly via sh -c ... | bash, with no pinned version, checksum, signature, or reviewed script contents. This is in the installer/update path and violates the repo guideline from AGENTS.md to flag installer scripts that execute untrusted code or skip verification; a transient CDN/DNS/TLS endpoint compromise would become arbitrary local code execution for users who answer yes.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-31T09:07:38Z

+        cache.diag.endWork(path);
+        return;
+    };
+    t.detach();


Join linter workers before freeing the session cache

With external linters enabled, spawnLintWorker detaches a thread that keeps a pointer to the stack-owned ProjectCache; however ProjectCache.deinit() only waits via DiagnosticsCache.drain() for about 2 seconds before freeing the cache and returning. If a linter process is slow or hangs and the MCP connection shuts down during that window, the detached worker can later execute job.cache.linter.mark() or job.cache.diag.store()/endWork() against freed stack/cache memory, causing a use-after-free or crash. The worker needs to be joined, cancelled, or otherwise prevented from outliving ProjectCache.

Useful? React with 👍 / 👎.

# Conflicts: # src/edit.zig

github-actions · 2026-05-31T09:38:34Z

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

Tool	Base (ns)	Head (ns)	Delta	Abs Delta (ns)	Status
`codedb_bundle`	84416	83616	-0.95%	-800	OK
`codedb_changes`	10711	10448	-2.46%	-263	OK
`codedb_context`	939803	967008	+2.89%	+27205	OK
`codedb_deps`	243	215	-11.52%	-28	OK
`codedb_edit`	37149	39597	+6.59%	+2448	OK
`codedb_find`	9290	9340	+0.54%	+50	OK
`codedb_hot`	24241	26368	+8.77%	+2127	OK
`codedb_outline`	29913	29664	-0.83%	-249	OK
`codedb_read`	14426	17726	+22.88%	+3300	NOISE
`codedb_search`	21665	21551	-0.53%	-114	OK
`codedb_snapshot`	51484	59043	+14.68%	+7559	NOISE
`codedb_status`	8792	9237	+5.06%	+445	OK
`codedb_symbol`	18970	17727	-6.55%	-1243	OK
`codedb_tree`	33689	30053	-10.79%	-3636	OK
`codedb_word`	13087	12888	-1.52%	-199	OK

justrach and others added 13 commits May 31, 2026 13:06

chatgpt-codex-connector Bot reviewed May 31, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into trial/graph-based-codedb

2f8e535

# Conflicts: # src/edit.zig

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: edit syntax-health checks + anchored str_replace + opt-in linter bridge#517

feat: edit syntax-health checks + anchored str_replace + opt-in linter bridge#517
justrach wants to merge 14 commits into
mainfrom
feat/edit-syntax-health-and-linter-bridge

justrach commented May 31, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

github-actions Bot commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

justrach commented May 31, 2026

What

Why

Safety / latency

Testing

Needs reconciliation with current main

Commits

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 31, 2026

Benchmark Regression Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Needs reconciliation with current `main`