feat(review): kb.triage_pending — advisory triage scoring for the pending queue#345
jsdevninja wants to merge 1 commit into
…ding queue

a long `kb.list_pending` forces the reviewer to reconstruct, per proposal, whether the claim fits the existing kb, whether its citations resolve, whether it duplicates something already filed, and whether it contradicts an approved claim. this adds an optional triage pass that scores each pending proposal on those four signals and attaches a `_meta.vouch_triage` block (recommendation/score/signals/rationale) to help a reviewer prioritize, without ever deciding anything itself.

read-only by construction: the pass never calls proposals.approve/reject, store.put_*, or store.move_proposal_to_decided; a human still calls kb.approve/kb.reject.

citation_quality reuses proposals._payload_block_reason; duplication_risk reuses the propose-time embedding similarity path (embeddings.similarity.find_similar_on_propose) and degrades to a difflib heuristic when the embeddings extra isn't installed. fit uses a separate, lower-threshold embedding search so a near-duplicate hit doesn't also inflate fit and cancel out its own duplication penalty.

opt-in via `triage.enabled: true` in config.yaml (default false). registered at all four kb.* surface sites (server.py, jsonl_server.py, capabilities.py, cli.py) plus `vouch triage [proposal-id...]` with `--json` and `--reverse`.
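To make the block shape concrete, here is a minimal sketch of how four signal scores might be folded into one `_meta.vouch_triage` block. The helper name `build_triage_block`, the equal default weights, the 0.75 threshold, and the inversion of the two risk signals are all assumptions for illustration; the PR states only the field names, that weights are configurable, and that a failed citation check forces `reject`.

```python
from typing import Dict

# Hypothetical defaults: the PR only says per-signal weights are configurable.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "fit": 0.25,
    "citation_quality": 0.25,
    "duplication_risk": 0.25,
    "contradiction_risk": 0.25,
}

# Assumption: for these signals a high value is bad, so they are inverted.
RISK_SIGNALS = {"duplication_risk", "contradiction_risk"}


def build_triage_block(signals: Dict[str, float],
                       weights: Dict[str, float] = DEFAULT_WEIGHTS) -> dict:
    """Combine per-signal scores in [0, 1] into an advisory block."""
    total_weight = sum(weights.values())
    composite = 0.0
    for name, weight in weights.items():
        value = signals[name]
        if name in RISK_SIGNALS:
            value = 1.0 - value  # high risk lowers the composite
        composite += weight * value
    score = round(composite / total_weight, 3)

    # Illustrative thresholds only; nothing downstream consumes this string.
    if signals["citation_quality"] == 0.0:
        recommendation = "reject"
    elif score >= 0.75:
        recommendation = "approve"
    else:
        recommendation = "needs-human"

    return {
        "recommendation": recommendation,
        "score": score,
        "signals": {name: {"score": signals[name]} for name in weights},
        "rationale": f"composite {score:.2f} from {len(weights)} signals",
    }


# A near-duplicate: moderate fit, but a high duplication_risk pulls the score down.
near_duplicate = build_triage_block({
    "fit": 0.6,
    "citation_quality": 1.0,
    "duplication_risk": 0.9,
    "contradiction_risk": 0.0,
})
```

Under these assumed weights the near-duplicate lands in `needs-human`, which is the behaviour the separate `fit` search is meant to protect: if `fit` were fed the same near-duplicate hits, it would rise toward 1.0 and offset the duplication penalty.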
📝 Walkthrough
This PR adds an advisory
Changes
kb.triage_pending advisory scoring
Estimated code review effort: 3 (Moderate) | ~30 minutes

Sequence Diagram(s)

    sequenceDiagram
        participant Reviewer
        participant CLI as vouch CLI / MCP / JSONL
        participant triage_pending
        participant score_proposal
        participant KBStore
        Reviewer->>CLI: request triage (proposal_ids, --json/--reverse)
        CLI->>triage_pending: triage_pending(store, proposal_ids)
        triage_pending->>KBStore: check triage.enabled config
        alt triage disabled
            triage_pending-->>CLI: raise TriageError
        else triage enabled
            triage_pending->>KBStore: fetch pending proposals
            loop each proposal
                triage_pending->>score_proposal: compute signals & score
                score_proposal-->>triage_pending: score, recommendation, rationale
            end
            triage_pending-->>CLI: annotated proposals (_meta.vouch_triage)
            CLI-->>Reviewer: ranked JSON/table output
        end
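
The flow in the diagram can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in `src/vouch/triage.py`: `FakeStore` is a hypothetical in-memory stand-in for the real store, the scorer is passed in as a plain callable, and sorting highest score first (with `reverse` flipping it) is a guess at what "ranked" means.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


class TriageError(Exception):
    """Raised when triage is requested but not enabled in config."""


@dataclass
class FakeStore:
    """Hypothetical in-memory stand-in for the real KB store."""
    config: Dict[str, dict]
    pending: Dict[str, dict] = field(default_factory=dict)


def triage_pending(store: FakeStore,
                   score_proposal: Callable[[dict], dict],
                   proposal_ids: Optional[Iterable[str]] = None,
                   reverse: bool = False) -> List[dict]:
    # Opt-in gate: disabled unless triage.enabled is true.
    if not store.config.get("triage", {}).get("enabled", False):
        raise TriageError("triage is disabled; set triage.enabled: true")

    wanted = set(proposal_ids) if proposal_ids is not None else None
    annotated = []
    for pid, proposal in store.pending.items():
        if wanted is not None and pid not in wanted:
            continue
        block = score_proposal(proposal)
        # Build a new dict so the stored proposal is never mutated.
        annotated.append({**proposal, "_meta": {"vouch_triage": block}})

    # Assumed ordering: highest score first unless reverse is requested.
    annotated.sort(key=lambda p: p["_meta"]["vouch_triage"]["score"],
                   reverse=not reverse)
    return annotated
```

Because the function only reads `store.pending` and returns fresh dicts, the no-write property holds by inspection in this sketch; the real implementation has to hold it across every store call it makes.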
Pre-merge checks: 5 passed
🧹 Nitpick comments (1)

`src/vouch/triage.py` (1)

451-473: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Redundant embedder fetch and corpus reads per proposal.

Within a single `score_proposal` call: `_safe_embedder()` is invoked once inside `_embedding_hits_for_claim` (line 172, just to check `is None`) and again at line 459 to obtain the actual embedder used by `_signal_fit`. Separately, in the heuristic (no-embeddings) path, `_signal_duplication_risk` (line 329) and `_signal_contradiction_risk` (line 375) each independently call `_claim_text_pool`, which re-reads `store.list_claims()` and `store.list_proposals(...)` from scratch for the same proposal. Across the `triage_pending` loop (lines 496-503) this duplicates I/O and embedder instantiation for every pending proposal.

Consider computing the embedder and the claim/proposal pool once per proposal (or once per `triage_pending` call, if `get_embedder()` isn't already cached) and threading them into `_embedding_hits_for_claim`, `_signal_fit`, `_signal_duplication_risk`, and `_signal_contradiction_risk` instead of recomputing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/vouch/triage.py` around lines 451 - 473, score_proposal is doing redundant work by fetching the embedder and rebuilding the claim/proposal corpus multiple times for the same proposal. Compute the embedder once in score_proposal and thread it into _embedding_hits_for_claim and _signal_fit instead of calling _safe_embedder() separately, and precompute the shared claim text pool/corpus once per proposal (or per triage_pending run) so _signal_duplication_risk and _signal_contradiction_risk can reuse it rather than each calling _claim_text_pool again. Update the helper signatures accordingly and preserve the existing behavior in score_proposal and triage_pending.
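
One possible shape for that refactor, as a hedged sketch: build the shared inputs once and pass them to each signal helper. `TriageContext`, `build_context`, `related_claims`, and `score_all` are hypothetical names, and the two helpers are simplified stand-ins for the real `_signal_*` functions, whose signatures are not shown on this page.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class TriageContext:
    """Shared, read-only inputs computed once per triage run (hypothetical)."""
    embedder: Optional[object]
    claim_pool: List[str]


def build_context(get_embedder: Callable[[], Optional[object]],
                  load_claim_pool: Callable[[], Iterable[str]]) -> TriageContext:
    # Each loader is called exactly once, no matter how many proposals follow.
    return TriageContext(embedder=get_embedder(),
                         claim_pool=list(load_claim_pool()))


def duplication_risk(text: str, ctx: TriageContext) -> float:
    # Highest difflib ratio against any pooled claim; 0.0 for an empty pool.
    return max((difflib.SequenceMatcher(None, text.lower(), c.lower()).ratio()
                for c in ctx.claim_pool), default=0.0)


def related_claims(text: str, ctx: TriageContext) -> List[str]:
    # Pooled claims sharing at least one word with the proposal text.
    words = set(text.lower().split())
    return [c for c in ctx.claim_pool if words & set(c.lower().split())]


def score_all(texts: Iterable[str],
              get_embedder: Callable[[], Optional[object]],
              load_claim_pool: Callable[[], Iterable[str]]) -> List[dict]:
    ctx = build_context(get_embedder, load_claim_pool)
    return [{"duplication_risk": duplication_risk(t, ctx),
             "related": related_claims(t, ctx)} for t in texts]


# Usage: count how often the corpus loader runs across two proposals.
load_calls = []


def load_pool() -> List[str]:
    load_calls.append(1)
    return ["the cache is enabled", "logs rotate daily"]


results = score_all(["the cache is enabled", "disks are encrypted"],
                    lambda: None, load_pool)
```

Freezing the context makes it harder for a helper to mutate shared state by accident, which fits a pass that is meant to be read-only.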
📒 Files selected for processing (7)
- `CHANGELOG.md`
- `src/vouch/capabilities.py`
- `src/vouch/cli.py`
- `src/vouch/jsonl_server.py`
- `src/vouch/server.py`
- `src/vouch/triage.py`
- `tests/test_triage.py`
summary
closes #322.
a long `kb.list_pending` forces a reviewer to reconstruct, per proposal, whether the claim fits the existing kb, whether its citations resolve, whether it duplicates something already filed, and whether it contradicts an approved claim. those signals already exist in scattered form (`find_similar_on_propose`, `proposals._payload_block_reason`) but nothing surfaces them together as a ranked, explained view.

this adds an optional read-side triage pass over the pending queue:

- `kb.triage_pending(proposal_ids=None)` returns each pending proposal's `model_dump` plus a `_meta.vouch_triage` block: `recommendation` (`approve`/`reject`/`needs-human`, advisory only), `score` (0.0-1.0), `signals` (`fit`, `citation_quality`, `duplication_risk`, `contradiction_risk`, each with its own score + rationale), and a short `rationale` string.
- `vouch triage [proposal-id...]` mirrors it on the cli, with `--json` and `--reverse`.
- opt-in: off by default, active only when `triage.enabled: true` is set in `.vouch/config.yaml`; per-signal `triage.weights` are configurable.

review-gate scope
read-only by construction: this is the load-bearing property the north star in CLAUDE.md calls out. the pass never calls `proposals.approve`, `proposals.reject`, `store.put_*`, or `store.move_proposal_to_decided`; a human still calls `kb.approve`/`kb.reject`. `recommendation` is an advisory string nothing else consumes.

- `citation_quality` reuses `proposals._payload_block_reason` (the same dangling-ref / invalid-payload gate `check_approvable` uses).
- `duplication_risk` reuses the propose-time embedding path (`embeddings.similarity.find_similar_on_propose`) for claims, and degrades to a `difflib` text-similarity heuristic when no embedder is registered (base install, no `[embeddings]` extra); the block shape stays the same either way.
- `fit` runs its own lower-threshold embedding search (`index_db.search_embedding`) rather than reusing the near-duplicate-only hits from `find_similar_on_propose`; reusing those directly would let a literal duplicate's high "fit" score cancel out its own `duplication_risk` penalty in the composite.
- `contradiction_risk` looks for topically-related approved claims that share an entity with the proposal but disagree on a simple negation-word signal (heuristic, advisory only; same caveats as the other signals).
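
As a rough illustration of the two heuristic signals above, the sketch below shows a `difflib` fallback that keeps the same result shape as an embedding backend, and a negation-word polarity check. The function names, the `backend` key, the negation word list, and the `shared_entity` flag are assumptions made for this example; only the use of `difflib` and of a negation-word signal comes from the description.

```python
import difflib
import re
from typing import Callable, Dict, Iterable, Optional

# Assumed word list; the PR does not enumerate its negation words.
NEGATION_WORDS = {"not", "no", "never", "cannot", "without"}


def heuristic_similarity(a: str, b: str) -> float:
    """Case-insensitive difflib ratio in [0, 1]."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def duplication_risk(
    claim: str,
    existing: Iterable[str],
    embedding_similarity: Optional[Callable[[str, str], float]] = None,
) -> Dict[str, object]:
    """Score against existing claims; fall back to difflib when no embedder."""
    if embedding_similarity is not None:
        backend, similarity = "embedding", embedding_similarity
    else:
        backend, similarity = "heuristic", heuristic_similarity
    score = max((similarity(claim, text) for text in existing), default=0.0)
    # Same keys on both paths, so callers never branch on the backend.
    return {"backend": backend, "score": score}


def is_negated(text: str) -> bool:
    """True if the text contains any assumed negation word."""
    return bool(NEGATION_WORDS & set(re.findall(r"[a-z']+", text.lower())))


def polarity_conflict(proposal_text: str, approved_text: str,
                      shared_entity: bool) -> bool:
    """Flag a possible contradiction: shared entity, opposite polarity."""
    if not shared_entity:
        return False
    return is_negated(proposal_text) != is_negated(approved_text)


risk = duplication_risk("The cache is enabled", ["the cache is enabled"])
conflict = polarity_conflict("The cache is not enabled",
                             "The cache is enabled", shared_entity=True)
```

A word-level negation test like this misses phrasing such as "disabled" versus "enabled" and misfires on double negatives, which is consistent with the signal being advisory only.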

registered at all four `kb.*` surface sites: `server.py` (`@mcp.tool()`), `jsonl_server.py` (`_h_triage_pending` + `HANDLERS`), `capabilities.py` (`METHODS`), and `cli.py` (`vouch triage`).

test plan
- `tests/test_triage.py` (25 tests): output shape, the no-write invariant (pending proposals stay pending, nothing approved/rejected/created), the disabled-by-default opt-in gate, citation_quality forcing `reject` on a dangling-ref proposal, duplication_risk on both the heuristic and embedding backends (`--backend heuristic` config override too), fit's entity-overlap + topical scoring, contradiction_risk's polarity-conflict heuristic, `proposal_ids` filtering, config/weights plumbing, and cli / jsonl wiring
- `.venv/bin/python -m pytest tests/ -q --ignore=tests/embeddings`: full suite green apart from 7 pre-existing Windows-only failures (verified identical on a clean `main` checkout: `os.getuid()`, symlink privilege, and path-separator assertions) and 2 pre-existing hangs in `test_http_server*.py` (also reproduce on a clean checkout, unrelated to this change)
- `.venv/bin/python -m mypy src`: clean (same 2 pre-existing Windows/missing-stub errors on a clean checkout)
- `.venv/bin/python -m ruff check src tests`: clean