Function-call edits (structured replacements, not full rewrite)#31
Function-call edits (structured replacements, not full rewrite)#31bddap-bot wants to merge 4 commits into
Conversation
refac now defaults to the Claude Messages API (model claude-opus-4-8); OpenAI remains available via `provider = "openai"`. New `src/anthropic.rs` talks to the REST API with reqwest (no official Rust SDK): x-api-key + anthropic-version headers, top-level `system`, and required `max_tokens`. It adapts refac's flat message list to Anthropic's shape — lifts the system prompt out of messages and merges the consecutive user turns to satisfy user/assistant alternation — and marks the static system prompt + few-shot examples `cache_control: ephemeral` so repeat calls only pay for the varying input. Config gains `provider`, optional `model` (defaulted per provider), and `max_tokens`; secrets hold either/both keys. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Some few-shot samples have an empty `selected`; Anthropic rejects empty text content blocks (OpenAI tolerated them), so skip empty messages when building the request. Add a REFAC_DEBUG env that dumps the request JSON. Verified end-to-end: 'Me like toast.' / 'Correct grammar.' -> 'I like toast.'
Adds an `apply_edits` tool the model must call, returning a list of exact substring replacements that refac applies to the input — so it no longer re-emits the entire selected text. New default on the Anthropic provider (edit_mode = "tool"); set edit_mode = "rewrite" (or REFAC_EDIT_MODE=rewrite) for the old behavior. OpenAI always rewrites. - anthropic.rs: `request_edits` forces `tool_choice` to `apply_edits`, parses the tool_use input into `Vec<Edit>`. HTTP factored into a shared `send`. - prompt.rs: edit-mode system prompt (no rewrite few-shot). - main.rs: `apply_edits` applies edits in order, erroring if a target isn't found. Live-tested against the API: "Me like toast."/"Correct grammar." -> "I like toast."; "add a docstring" inserts one without touching the rest. cargo test green (6 tests incl edit application + missing-target error). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two rounds of concurrent reviewers (taste/Niko, correctness/API, Rust idiom); findings triaged and the real ones fixed: - apply_edits: resolve every `old` against the ORIGINAL text and reject empty, missing, ambiguous (>1 occurrence), or overlapping edits; apply right-to-left. Edits are now order-independent — closes a silent buffer-corruption hazard. - Check `stop_reason`: error on `max_tokens` truncation in complete + request_edits (a truncated rewrite or half-finished tool call was silently returned/mis-parsed). - login: `--provider` flag + state which provider's key is being saved (avoids an upgrade footgun where an OpenAI key gets stored as the Anthropic one). - Unknown REFAC_PROVIDER / REFAC_EDIT_MODE now error instead of silently defaulting. - Unify the tool name to APPLY_EDITS; rewrite the tool/system prompt to match the new unique-and-independent edit contract. Delete dead OpenAI edits structs (clears all build warnings). Extract DEFAULT_ANTHROPIC_MODEL. - Tests: ambiguous/empty/overlapping edits + resolve-against-original. 10 passing. Re-tested live against the API after the changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Time for a rewrite atop the merged anthropic pr. Consider including a wide-scope refactor before this pr to pave the way for this change. Common interface to both providers might be needed, put your software design hat on. |
|
Plan, software-design hat on. Splitting into two stacked PRs as you suggested: PR A — refactor: a common
PR B — rewrite #31 atop A. Function-call edits become the natural second capability on the interface (Anthropic-only Starting on A now. Will run my reviewer loop before flagging either for your review. |
|
I don't think you'll want to write the funtion-call loop twice. Consider a common abstraction that takes function-calling into account. |
|
PR A is up: #32 (the paving refactor, no behavior change). PR B = this branch rewritten atop it. |
|
Right, and it corrects a wrong assumption I baked into A. Function-calling isn't Anthropic-only — OpenAI has tools too. So the abstraction shouldn't be an "Anthropic-only edits" capability; it should make a single tool-call round-trip a first-class Revised design for B: trait Backend {
fn complete(&self, &[Message]) -> Result<String>;
/// One forced tool-call round-trip: hand the model a tool schema, get back
/// the parsed call arguments. Each provider does this in its own wire format
/// (Anthropic tools + tool_choice; OpenAI tools + tool_choice "required").
fn call_tool(&self, &[Message], tool: &Tool) -> Result<serde_json::Value>;
}
Noodle away on A. This is the shape I'll build B in once you're settled. |
|
Ran the debate club — three advocates (sans-io step-enum / minimal two-method / protocol-transport split). Synthesis + recommendation. Verdict: protocol/transport split (true sans-io) + an
|
|
Round 2 — open ideation (3 agents: question-the-premise / type-leverage / orchestration). It found real design-space we'd skipped. The headline: round 1 argued about the Backend trait; round 2 says the higher-leverage axes are the edit mechanism, the apply robustness, and making the seam data. The trait shape is downstream of those. Three ideas worth more than the trait debate1. Maybe don't use function-calling at all. Two alternatives to "model calls an
Both dodge the thing that makes tool-edits fragile: the model has to echo 2. Correctness lives in the apply step, not the trait — and we all ignored it. Whatever produces struct Plan<'src> { src: &'src str, spans: Vec<(Range<usize>, String)> }
impl<'src> Plan<'src> {
fn resolve(src: &'src str, edits: Vec<RawEdit>) -> Result<Plan, EditError>; // only constructor; fails on NotFound/Ambiguous/Overlap
fn apply(self) -> String; // total, no Result — illegal states already gone
}Highest correctness-per-line, orthogonal to every transport decision. Adopt regardless. 3. The seam should be data, not a behavior — and the cleanest sans-io is a resumable state machine. Instead of enum Step { Call(HttpRequest), Done(String) }
impl Refac {
fn start(sel, instr) -> (Self, Step);
fn resume(self, resp: HttpResponse) -> Result<(Self, Step)>;
}
// driver: while let Step::Call(req) = step { step = m.resume(send(req)?)?.1 }Single-shot today = one Two sharper consequences
Where this lands the recommendationThe round-1 verdict (sans-io +
Want me to (i) build a quick apply-success bake-off (SEARCH/REPLACE vs tool-call) over the logs, or (ii) just commit to one mechanism and build it? And the apply-error (a)/(b) + PR-A-fold questions from the last comment still stand. |
Edit-tool design (stolen from opencode + claude-code)Researched both. The split is clean: claude-code owns the tool contract (exact match, unique-or-error, The contract (from claude-code's
|
|
#33 supersedes this — they're a prototype and its finished version, not two of the same thing. The lineage:
So #33 fully subsumes #31. Closing this in favor of #33 — reopen if you'd rather land the simpler batch-edit version instead. |
Makes refac express changes as function-call edits instead of re-emitting the whole input, as requested.
Stacks on #30 (Anthropic backend) — merge that first, or merge this to get both.
How it works
apply_editstool returning a list of exact-substring replacements ({old, new}).apply_editsinmain.rs), erroring if anoldisn't found verbatim — so a stale/hallucinated edit fails loudly instead of silently no-op'ing.edit_mode = "tool". Setedit_mode = "rewrite"(orREFAC_EDIT_MODE=rewrite) for the previous full-rewrite behavior. The OpenAI path always rewrites.Details
anthropic.rs:request_edits()adds the tool + forcestool_choice, parses thetool_useinput intoVec<Edit>; HTTP factored into a sharedsend().prompt.rs: an edit-focused system prompt (keeps refac's personality; no rewrite few-shot).Testing (live, with the provided key)
refac tor "Me like toast." "Correct grammar."→I like toast.(one substring edit)refac tor "<def add…>" "add a docstring"→ inserts a docstring, rest untouched:cargo test→ 6 passing (edit application order, insert-via-anchor, delete-via-empty, missing-target error, request shaping).🤖 Generated with Claude Code