Review fixes #1

Merged
dropdevrahul merged 4 commits into main from review-fixes on Jun 18, 2026

@dropdevrahul

Owner

No description provided.

… review findings

Task-handling (model dispatch) overhaul:
- Add agents/nexum-impl-opus.md so needs-strong step content is delegated to an
  Opus-tier executor instead of forcing the whole orchestrator onto Opus.
- Batch steps by tier into one warm executor dispatch (dispatch_granularity:
  group) instead of one cold start per step; executors self-run guardrail.py and
  return its verdict, so the orchestrator skips a round-trip.
- Gate the reviewer to escalated, needs-strong, or many-file steps; the retry
  ladder is one same-tier patch attempt, then escalation. Orchestration no longer
  assumes Opus. New config: dispatch_granularity, max_same_tier_retries.
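
A minimal sketch of the retry ladder described above, not the plugin's actual code. The tier names and their order, the `run_with_ladder` helper, and the choice to give every tier the same retry budget are assumptions made for illustration; only `max_same_tier_retries` comes from this PR.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical tier order, cheapest first (the real tier names are not given here).
TIERS = ["haiku", "sonnet", "opus"]


def run_with_ladder(
    start_tier: str,
    attempt: Callable[[str], bool],
    max_same_tier_retries: int = 1,
) -> Tuple[Optional[str], List[str]]:
    """Try a step at start_tier, retry at the same tier, then escalate.

    attempt(tier) should return True when the step passes its guardrail check.
    Returns the tier that succeeded (or None) and the tiers tried, in order.
    """
    tried: List[str] = []
    for tier in TIERS[TIERS.index(start_tier):]:
        # One initial attempt plus up to max_same_tier_retries patch attempts.
        for _ in range(1 + max_same_tier_retries):
            tried.append(tier)
            if attempt(tier):
                return tier, tried
    return None, tried


# Example: a step that only the strongest tier can complete.
final_tier, tried = run_with_ladder("sonnet", lambda tier: tier == "opus")
```

With the default of one retry, the step is tried twice at the starting tier before moving up.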

Metered cost + cache-aware savings:
- Snapshot Claude Code's own cost.total_cost_usd into a session_cost table; the
  cost report prints this cache-accurate total beside the per-tier breakdown.
- Weight dedup savings by dedup_cache_weight (repeated reads bill at cache-read
  rate); record raw + effective tokens.
- Add test_determinism.py and test_metering.py.
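
The weighting above can be sketched as simple arithmetic. This is an illustration under assumptions: the `dedup_savings` helper and the example weight of 0.1 are hypothetical, and the plugin's real default for dedup_cache_weight is not stated in this PR.

```python
def dedup_savings(raw_tokens: int, dedup_cache_weight: float) -> dict:
    """Return raw and cache-weighted savings for a deduplicated read.

    A repeated read would have been billed at the cheaper cache-read rate,
    so only a fraction of the raw token count counts as real savings.
    """
    if not 0.0 <= dedup_cache_weight <= 1.0:
        raise ValueError("dedup_cache_weight must be between 0 and 1")
    return {
        "raw_tokens": raw_tokens,
        "effective_tokens": round(raw_tokens * dedup_cache_weight),
    }


# Example with an assumed weight of 0.1.
saved = dedup_savings(12000, 0.1)
```

Recording both numbers keeps the unweighted figure available if the weight is tuned later.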

Review-finding fixes:
- truncate.py: drop unreachable hard-cut else branch.
- audit.py: drop condition subsumed by the "# nexum" prefix check.
- store.py: corruption fallback uses a shared-cache in-memory DB (held open by a
  module keeper) so separate db() calls share state instead of silently becoming
  no-ops; drop the no-op foreign_keys pragma.
- dedup.py: measure pointer-collapse savings from the recorded shrunk token count
  (what the model actually saw), not the original output.
- scan_guard.py: fix _under_deny normalization (lstrip("./") mangled dot-leading
  paths like .git); remove the now-redundant raw fallback and dead FLAGS_WITH_VALUE.
- context_watch.py: derive task type over sorted words so the intent-guard
  decision is stable across PYTHONHASHSEED; collapse identical if/else branch.
- hooks.json: drop redundant truncate.py PostToolUse hook (dedup re-applies shrink).
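
The scan_guard normalization bug above comes from how str.lstrip works: its argument is a set of characters, not a prefix. A small sketch of the failure and one possible fix follows; `strip_dot_slash` is a hypothetical helper, not the function used in scan_guard.py.

```python
def strip_dot_slash(path: str) -> str:
    """Remove leading './' segments without touching dot-leading names."""
    while path.startswith("./"):
        path = path[2:]
    return path


# lstrip("./") strips every leading '.' and '/' character, so '.git' loses its dot.
mangled = "./.git/config".lstrip("./")
# Prefix-based stripping keeps the dot-leading directory name intact.
fixed = strip_dot_slash("./.git/config")
```

Here `mangled` is `git/config`, which would no longer match a deny entry for `.git`, while `fixed` is `.git/config`.
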

…levers

- Handoff: auto-skeleton writer (scripts/handoff.py) + /nx-save and /nx-load
  commands; context_watch auto-writes a resume handoff past the threshold.
- Rename commands nexum-* -> nx-* (audit/build/plan/status) and add nx-save/nx-load.
- scan_guard: shlex-based tokenizer so quoted args don't evade the grep guard;
  PreToolUse Read-guard injects limit/offset for large files.
- dedup: gate truncate/dedup savings behind a per-session self-test, since
  PostToolUse updatedToolOutput is ignored for built-in tools on current Claude Code.
- statusline: capture real metered cost/context size.
- README/SPEC: honest description of which context levers work today.
- Scratch notes: HANDOFF-hook-investigation.md, nexum-review.md.
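
To illustrate why the scan_guard change above moves to a shlex-based tokenizer, here is a short comparison with whitespace splitting. The sample command and helper names are made up for the example; the exact evasion the guard used to miss is not described in this PR.

```python
import shlex
from typing import List


def tokens_naive(command: str) -> List[str]:
    """Whitespace split: quotes stay glued to the words they surround."""
    return command.split()


def tokens_shlex(command: str) -> List[str]:
    """POSIX shell-style split: quotes are removed and quoted text stays whole."""
    return shlex.split(command)


command = '"grep" -r "secret token" .'
naive = tokens_naive(command)
proper = tokens_shlex(command)
```

With whitespace splitting the first token is `"grep"` with the quotes included, so a check for `grep` would not match; shlex.split yields `grep` and keeps `secret token` as a single argument.
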

The auto-handoff never fired in practice. context_watch drove the threshold
off max(prompt-text estimate, real_context_tokens flag). The estimate only
counts the typed prompt, so it never reaches 100k, and the flag is written by the
statusLine hook, which runs the cache copy lacking that write and can resolve
a different data dir. So token_total stayed tiny and no handoff was written.

- context_watch: read the REAL context size directly from the session
  transcript (input + cache_creation + cache_read of the last usage block) via
  new store.context_tokens_from_transcript; fall back to the estimate/flag only
  when the transcript is unavailable. Removes the statusline/data-dir/flag chain.
- Re-arm the handoff/compaction nudges when context drops back below the
  threshold (e.g. after /clear), instead of warning once per session forever.
- Handoff message now explicitly says to run /clear (or a fresh session) then
  /nx-load.
- handoff.write_skeleton + /nx-save + /nx-load resolve a project-scoped data
  dir (store.project_data_dir: $CLAUDE_PLUGIN_DATA, else git-root/.nexum-data,
  else cwd/.nexum-data) so writer and reader always agree per-project.
- Tests for the transcript reader, project_data_dir, transcript-driven handoff,
  and re-arm. Full suite: 269 passed.
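
A rough sketch of the transcript-based context size described above, not the actual store.context_tokens_from_transcript. It assumes the transcript is a JSONL file whose entries may carry a `message.usage` object with the fields `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens`; that layout is an assumption here.

```python
import json
from pathlib import Path
from typing import Optional

# Assumed usage field names; summed to approximate the live context size.
USAGE_KEYS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def context_tokens_from_transcript(path: str) -> Optional[int]:
    """Return input + cache_creation + cache_read of the last usage block.

    Returns None when the file is missing or holds no usage block, so the
    caller can fall back to the estimate or the flag.
    """
    transcript = Path(path)
    if not transcript.is_file():
        return None
    last_usage = None
    with transcript.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partial or corrupt lines
            message = entry.get("message") if isinstance(entry, dict) else None
            usage = message.get("usage") if isinstance(message, dict) else None
            if isinstance(usage, dict):
                last_usage = usage
    if last_usage is None:
        return None
    return sum(int(last_usage.get(key) or 0) for key in USAGE_KEYS)
```

Returning None rather than 0 keeps "transcript unavailable" distinct from "context is empty", which is what makes the fallback path possible.
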

Three optimizations folded into 0.3.0:

- predup.py (PreToolUse): deny an identical repeated Read/Grep/Glob (and
  optional read-only Bash) call already seen this session, with an mtime
  guard for Read. A PreToolUse deny is honored (unlike the inert PostToolUse
  shrink), so the avoided re-injection is recorded ungated and the status-line
  "saved" figure finally moves. Backed by a new input-keyed tool_calls table
  in store.py, populated by dedup.py on first occurrence.

- plan_preview.py: /nx-build prints a projected per-tier cost vs all-opus
  baseline before dispatching, so routing savings are visible up front.

- resume_nudge.py (SessionStart): one-line "run /nx-load" hint when a fresh
  handoff matches the current branch; nothing auto-loads.
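
The predup idea above (deny an identical repeated call, with an mtime guard for Read) can be sketched as follows. This is a simplified illustration: it keeps state in an in-memory dict instead of the tool_calls table, and `call_key`, `should_deny`, and the `file_path` input field are assumptions.

```python
import json
import os
from typing import Dict, Optional


def call_key(tool_name: str, tool_input: dict) -> str:
    """Build a stable key from the tool name and its canonicalized input."""
    return tool_name + ":" + json.dumps(tool_input, sort_keys=True)


def should_deny(seen: Dict[str, Optional[float]], tool_name: str, tool_input: dict) -> bool:
    """Return True when an identical call was already seen this session.

    For Read, the file's mtime must also be unchanged; a modified or
    unreadable file is never denied.
    """
    key = call_key(tool_name, tool_input)
    mtime: Optional[float] = None
    if tool_name == "Read":
        try:
            mtime = os.path.getmtime(tool_input.get("file_path", ""))
        except OSError:
            return False  # cannot verify freshness, so let the call through
    if key in seen and seen[key] == mtime:
        return True
    seen[key] = mtime  # first occurrence, or the file changed: record and allow
    return False
```

Serializing the input with sort_keys=True means two calls with the same arguments in a different key order still count as identical.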

Wire both new hooks in hooks.json, document in README/CHANGELOG, and sync
marketplace.json to 0.3.0 (was 0.2.1, failing check_version). Full suite green.

dropdevrahul merged commit 03f0346 into main on Jun 18, 2026
4 checks passed