feat/issue65 llm providers#69
Conversation
ar7casper
commented
May 24, 2026
- feat(openant-core,openant-cli): pluggable LLM adapter framework + multi-provider support (issue [feat] Enable OpenAnt to work with non-Claude LLMs (Qwen, Kimi, local models, etc.) #65)
- docs: README + CHANGELOG for multi-provider work
…ti-provider support (issue #65) Replaces the hardcoded AnthropicClient wrapper used across 15+ files with a Protocol-based adapter layer. Each pipeline phase (analyze, enhance, verify, report, dynamic_test, llm_reach, app_context) resolves to a (provider, model) pair via a registry built from ~/.config/openant/config.json. Adapter framework (libs/openant-core/utilities/llm/): * adapter.py LLMAdapter Protocol, unified content blocks, 5-class error taxonomy * config.py v2 schema with v1->v2 migration of legacy api_key * builtins.py frozen openant-default config pinning today's Claude defaults * registry.py PhaseRegistry with eager adapter instantiation + probe validation * helpers.py simple_text + lookup_pricing shortcuts Shipped adapters (providers/): * anthropic.py reference impl, full tool calling, per-model pricing * openai.py Chat Completions, splits one-user-with-N-tool-results into N role=tool messages * google.py Gemini via google-genai, function_call/function_response Part translation All three support tool calling. 33-test contract harness runs against all three. Setup wizard (apps/openant-cli/cmd/setup.go): * `openant setup llm` walks per-phase provider+model selection * Per-phase per-provider model defaults + known-models hint * 1-token probe per unique (provider, model) before save * OpenAI ChatGPT/Codex-vs-API-key heads-up Go CLI: * --model opus/sonnet replaced by --llm-config <name> on analyze/scan/enhance/verify/dynamic-test/report * internal/config/config.go gains WriteLLMConfig + raw-map round-trip preserving v2 fields * set-api-key refactored to share the probe helper now used by the wizard Pipeline migration: * Every AnthropicClient call site (15+ files) replaced with PhaseBinding threading * Scanner builds the registry once and probes upfront; standalone verbs build their own * Defensive _coerce_to_str in reporter so non-string fields from GPT-style responses don't crash report generation (analyze prompt also tightened to require string types) Closes #65. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
README: * Replace "Using a non-Anthropic provider" with "Setting up an LLM" that leads with the `openant setup llm` wizard * Shipped-adapters table (Anthropic, OpenAI, Google) with subscription-vs-API caveats per provider * 7-phase sample config (with `app_context`) * New Roadmap section: more adapters, OAuth subscription auth, cross-provider tool-call quirks, more languages, hosted scan service CHANGELOG: bring [2026-05-24] entry in line with reality — add OpenAI/Google adapters, the setup wizard, the contract harness, the reporter coercion fix, and the seventh `app_context` phase that the original entry didn't mention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…layer
Critical/High:
- C1: carry the tool name on ToolResultBlock so the Gemini adapter sends
the real function name on function_response (Gemini matches by name,
not the synthesised gemini_<name>_<idx> id)
- H1: wire OpenAI + Google adapters into the global rate limiter via a
shared providers/_ratelimit helper (was Anthropic-only); fix the two
false "adapter handles it internally" caller comments
- H3: send max_completion_tokens for OpenAI reasoning models (o1/o3/o4)
- H5: warn once on malformed tool_call.arguments instead of silently
collapsing to {}
Medium/Low:
- M6: warn once when the Anthropic adapter drops an unknown block kind
- M7: reset_warning_state() clears every per-adapter warn set, wired
into reset_global_tracker()
- M9: drift-guard test pins MODEL_PRICING to AnthropicAdapter.pricing
- M1: remove dead imports (agent LLMRateLimitError, finding_verifier
LLMError + orphaned get_rate_limiter)
- M10: drop the stale `model: opus|sonnet` docstring in scan_repository
- config _optional_str raises on non-strings; llm_reachability re-raises
LLMAuthError; report _extract_usage warns on missing pricing; hoist
registry import sys
Tests: 344 passed, 72 skipped; ruff clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- C2: redact ?key= from Gemini probe transport errors before they reach stderr (was leaking the API key on any network failure) - H3: send max_completion_tokens in the OpenAI probe for o1/o3/o4 models - H2: document that the v2 per-provider key check is intentionally left to Python's registry.validate() at scan start - M3: accept the key when set-api-key's probe returns 404 (model_not_found) — auth already succeeded - M4: write config atomically (temp file + rename), with a Windows-safe backup/restore so a failed replace never loses the original - M5: merge provider entries in WriteLLMConfig, preserving hand-authored sibling fields - M8: delete the tautological adapterShipped() + dead caller; fix the stale "only anthropic ships" comments Tests: go vet + build + test ./... all pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…erifier truncation guard Round-2 review found adapter defects the prior review missed: - Gemini output tokens now count candidates + thoughts_token_count, so cost isn't undercounted for thinking models (gemini-2.5-*) - Gemini empty-candidates (a prompt-level safety block) now raises LLMResponseError with the block reason instead of returning a silent empty end_turn that read as a clean pass - finding_verifier mirrors the enhancer's guard: a truncated response (max_tokens, no tool call) returns "incomplete" instead of sending an empty user message that the next complete() rejects - both agentic loops echo only Text+ToolUse blocks (future block-kind re-serialization safety) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…olish - setup.go: skip the per-provider probe when the key is blank, so the wizard's "leave blank to read from environment" option actually completes instead of dead-ending on a 401 - llm_probe_google.go: tighten the key-redaction regex so it stops at the URL's closing quote instead of swallowing the delimiter - root.go: fix a stale doc reference (requireAPIKeyFor -> requireAPIKey) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…kKey Pre-existing issues surfaced during the PR #69 round-2 review (not introduced by that PR), fixed here as security/robustness hardening: - repository_index.read_file_section now confines the model-controlled file_path to the repo root (is_relative_to, which also collapses symlink escapes) — closes an arbitrary host-file read via the agent's read_file_section tool - config.MaskKey returns "****" for keys shorter than 8 chars — no more out-of-range panic via `config set api-key --stdin` -> `config show`, and no full-key reveal for 3-5 char keys - config.go: the *Names doc comments no longer claim "sorted" (they don't) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntry-point bindings Round-3 review fixes (test-first, all proven): - H1: restore report phase default to Opus (claude-opus-4-6); it was silently downgraded to Sonnet, contradicting the no-behavior-change guarantee - H2: fix ContextEnhancer() callers (5 parser pipelines + context_enhancer __main__) to build a registry and pass the enhance binding - H3: OpenAI adapter routes system->developer for reasoning models; drop o1-mini/o1-preview; add gpt-4.1/o3/o4-mini pricing; guard empty choices - M1: raise dependency floors (openai>=1.45.0, google-genai>=1.0.0) - M2: python -m report summary/disclosures pass the now-required binding - M3: reporter coerces the data_flow container before iterating - M4 (py): report-data subparser registers --llm-config - M5: config validation exempts anthropic for the env-key path - L1/L4/L5/L10: dynamic_tester binding threading, error diagnostics, stale docstrings, tightened assertion Python: 390 passed, 72 skipped; ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g, set-api-key, probe test Round-3 review fixes (test-first): - H3 (Go): drop o1-mini from the wizard known OpenAI models, matching the Python adapter - M4 (Go): runHTMLReport forwards --llm-config via the new buildReportDataArgs helper - L7: cover probeAllPhases returning an error on a failed probe - L8: set-api-key soft-passes transient 429/5xx (only 401/403 reject a key) - L9: remove the dangling openant llm-config list reference from scan help go test ./... ok; go vet/build clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- M6: correct the four-file-recipe phrasing — adding a provider also needs Go-side wizard/probe wiring - L6: fix the contract-test count (12 methods / 36 cases across 3 adapters) (The companion HOW_TO_ADD_AN_ADAPTER.md also gained a Go-wiring section, but it lives under the gitignored docs/ path and stays local.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Reviewed the adapter framework in depth (multiple passes, plus a head-to-head against another project that did this same native-SDK migration). Architecture is solid — clean In the new adapter framework — worth fixing:
Several of these are the same shape — handled in two adapters, missed in the third (Anthropic is the odd one out on the empty-guard and on Two pre-existing issues this PR's completion path now feeds (flagging because they're high-impact, not asking you to redesign here): the verify phase defaults to "agree with Stage 1" on every degenerate path — incomplete/empty completion ( Happy to share repros for any of these. |
… handling - Anthropic: empty-content guard, usage getattr, refusal stop_reason -> LLMRefusalError - OpenAI: content_filter finish_reason -> LLMRefusalError - Google: SAFETY/RECITATION/... -> LLMRefusalError; forward max_retries via HttpRetryOptions(attempts=max_retries+1) for parity with the other adapters - New LLMRefusalError(LLMResponseError) in the taxonomy - New _redact.py: scrub provider keys from adapter error strings and the chained cause (RedactedCause) so secrets don't leak via tracebacks/logs Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…erdict
- finding_verifier: degenerate verify paths no longer auto-agree with Stage 1;
mark VerificationResult.incomplete and set result["error"] on adapter raise
- reporter: incomplete -> stage2_verdict "unverified" (not "rejected");
"unverified" is disclosure-eligible so a couldn't-verify finding still surfaces
- verifier/scanner/schemas: needs_review counter; degenerate findings no longer
folded into the "safe" count
- report/{generator,__main__}: align disclosure gate with "unverified"
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ompts - New prompts/_fence.py safe_code_fence(): opens a fence longer than any backtick run in the untrusted source (CommonMark), so crafted source can't escape the fence and steer the analyst/verifier verdict; None-tolerant - verification_prompts (Stage 2) and vulnerability_analysis (Stage 1) both use it Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…pi-key - promptSecret() reads the key without echo via charmbracelet/x/term (promoted to a direct dep), falling back to the reader when stdin isn't a TTY - set-api-key [key] arg now optional -> no-echo prompt; help warns that passing the key as an argument leaks it via shell history / ps Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The redact_secrets() tests must feed key-shaped inputs, which the default gitleaks generic-api-key rule flags even though no real secret exists. Add a .gitleaks.toml that keeps the full default ruleset (useDefault) and narrowly allowlists ONLY secrets that both (a) live in the two redaction-test files and (b) carry an unmistakable fake marker (1234567890 / LEAKED / EXAMPLE / FAKE). Verified the AND condition is enforced: the same marker elsewhere is still reported. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>