feat/issue65 llm providers by ar7casper · Pull Request #69 · knostic/OpenAnt

ar7casper · 2026-05-24T12:31:59Z

feat(openant-core,openant-cli): pluggable LLM adapter framework + multi-provider support (issue [feat] Enable OpenAnt to work with non-Claude LLMs (Qwen, Kimi, local models, etc.) #65)
docs: README + CHANGELOG for multi-provider work

…ti-provider support (issue #65) Replaces the hardcoded AnthropicClient wrapper used across 15+ files with a Protocol-based adapter layer. Each pipeline phase (analyze, enhance, verify, report, dynamic_test, llm_reach, app_context) resolves to a (provider, model) pair via a registry built from ~/.config/openant/config.json. Adapter framework (libs/openant-core/utilities/llm/): * adapter.py LLMAdapter Protocol, unified content blocks, 5-class error taxonomy * config.py v2 schema with v1->v2 migration of legacy api_key * builtins.py frozen openant-default config pinning today's Claude defaults * registry.py PhaseRegistry with eager adapter instantiation + probe validation * helpers.py simple_text + lookup_pricing shortcuts Shipped adapters (providers/): * anthropic.py reference impl, full tool calling, per-model pricing * openai.py Chat Completions, splits one-user-with-N-tool-results into N role=tool messages * google.py Gemini via google-genai, function_call/function_response Part translation All three support tool calling. 33-test contract harness runs against all three. Setup wizard (apps/openant-cli/cmd/setup.go): * `openant setup llm` walks per-phase provider+model selection * Per-phase per-provider model defaults + known-models hint * 1-token probe per unique (provider, model) before save * OpenAI ChatGPT/Codex-vs-API-key heads-up Go CLI: * --model opus/sonnet replaced by --llm-config <name> on analyze/scan/enhance/verify/dynamic-test/report * internal/config/config.go gains WriteLLMConfig + raw-map round-trip preserving v2 fields * set-api-key refactored to share the probe helper now used by the wizard Pipeline migration: * Every AnthropicClient call site (15+ files) replaced with PhaseBinding threading * Scanner builds the registry once and probes upfront; standalone verbs build their own * Defensive _coerce_to_str in reporter so non-string fields from GPT-style responses don't crash report generation (analyze prompt also tightened to require string types) Closes #65. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

README: * Replace "Using a non-Anthropic provider" with "Setting up an LLM" that leads with the `openant setup llm` wizard * Shipped-adapters table (Anthropic, OpenAI, Google) with subscription-vs-API caveats per provider * 7-phase sample config (with `app_context`) * New Roadmap section: more adapters, OAuth subscription auth, cross-provider tool-call quirks, more languages, hosted scan service CHANGELOG: bring [2026-05-24] entry in line with reality — add OpenAI/Google adapters, the setup wizard, the contract harness, the reporter coercion fix, and the seventh `app_context` phase that the original entry didn't mention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…layer Critical/High: - C1: carry the tool name on ToolResultBlock so the Gemini adapter sends the real function name on function_response (Gemini matches by name, not the synthesised gemini_<name>_<idx> id) - H1: wire OpenAI + Google adapters into the global rate limiter via a shared providers/_ratelimit helper (was Anthropic-only); fix the two false "adapter handles it internally" caller comments - H3: send max_completion_tokens for OpenAI reasoning models (o1/o3/o4) - H5: warn once on malformed tool_call.arguments instead of silently collapsing to {} Medium/Low: - M6: warn once when the Anthropic adapter drops an unknown block kind - M7: reset_warning_state() clears every per-adapter warn set, wired into reset_global_tracker() - M9: drift-guard test pins MODEL_PRICING to AnthropicAdapter.pricing - M1: remove dead imports (agent LLMRateLimitError, finding_verifier LLMError + orphaned get_rate_limiter) - M10: drop the stale `model: opus|sonnet` docstring in scan_repository - config _optional_str raises on non-strings; llm_reachability re-raises LLMAuthError; report _extract_usage warns on missing pricing; hoist registry import sys Tests: 344 passed, 72 skipped; ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- C2: redact ?key= from Gemini probe transport errors before they reach stderr (was leaking the API key on any network failure) - H3: send max_completion_tokens in the OpenAI probe for o1/o3/o4 models - H2: document that the v2 per-provider key check is intentionally left to Python's registry.validate() at scan start - M3: accept the key when set-api-key's probe returns 404 (model_not_found) — auth already succeeded - M4: write config atomically (temp file + rename), with a Windows-safe backup/restore so a failed replace never loses the original - M5: merge provider entries in WriteLLMConfig, preserving hand-authored sibling fields - M8: delete the tautological adapterShipped() + dead caller; fix the stale "only anthropic ships" comments Tests: go vet + build + test ./... all pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…erifier truncation guard Round-2 review found adapter defects the prior review missed: - Gemini output tokens now count candidates + thoughts_token_count, so cost isn't undercounted for thinking models (gemini-2.5-*) - Gemini empty-candidates (a prompt-level safety block) now raises LLMResponseError with the block reason instead of returning a silent empty end_turn that read as a clean pass - finding_verifier mirrors the enhancer's guard: a truncated response (max_tokens, no tool call) returns "incomplete" instead of sending an empty user message that the next complete() rejects - both agentic loops echo only Text+ToolUse blocks (future block-kind re-serialization safety) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…olish - setup.go: skip the per-provider probe when the key is blank, so the wizard's "leave blank to read from environment" option actually completes instead of dead-ending on a 401 - llm_probe_google.go: tighten the key-redaction regex so it stops at the URL's closing quote instead of swallowing the delimiter - root.go: fix a stale doc reference (requireAPIKeyFor -> requireAPIKey) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…kKey Pre-existing issues surfaced during the PR #69 round-2 review (not introduced by that PR), fixed here as security/robustness hardening: - repository_index.read_file_section now confines the model-controlled file_path to the repo root (is_relative_to, which also collapses symlink escapes) — closes an arbitrary host-file read via the agent's read_file_section tool - config.MaskKey returns "****" for keys shorter than 8 chars — no more out-of-range panic via `config set api-key --stdin` -> `config show`, and no full-key reveal for 3-5 char keys - config.go: the *Names doc comments no longer claim "sorted" (they don't) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ntry-point bindings Round-3 review fixes (test-first, all proven): - H1: restore report phase default to Opus (claude-opus-4-6); it was silently downgraded to Sonnet, contradicting the no-behavior-change guarantee - H2: fix ContextEnhancer() callers (5 parser pipelines + context_enhancer __main__) to build a registry and pass the enhance binding - H3: OpenAI adapter routes system->developer for reasoning models; drop o1-mini/o1-preview; add gpt-4.1/o3/o4-mini pricing; guard empty choices - M1: raise dependency floors (openai>=1.45.0, google-genai>=1.0.0) - M2: python -m report summary/disclosures pass the now-required binding - M3: reporter coerces the data_flow container before iterating - M4 (py): report-data subparser registers --llm-config - M5: config validation exempts anthropic for the env-key path - L1/L4/L5/L10: dynamic_tester binding threading, error diagnostics, stale docstrings, tightened assertion Python: 390 passed, 72 skipped; ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…g, set-api-key, probe test Round-3 review fixes (test-first): - H3 (Go): drop o1-mini from the wizard known OpenAI models, matching the Python adapter - M4 (Go): runHTMLReport forwards --llm-config via the new buildReportDataArgs helper - L7: cover probeAllPhases returning an error on a failed probe - L8: set-api-key soft-passes transient 429/5xx (only 401/403 reject a key) - L9: remove the dangling openant llm-config list reference from scan help go test ./... ok; go vet/build clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- M6: correct the four-file-recipe phrasing — adding a provider also needs Go-side wizard/probe wiring - L6: fix the contract-test count (12 methods / 36 cases across 3 adapters) (The companion HOW_TO_ADD_AN_ADAPTER.md also gained a Go-wiring section, but it lives under the gitignored docs/ path and stays local.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gadievron · 2026-06-15T04:41:28Z

Reviewed the adapter framework in depth (multiple passes, plus a head-to-head against another project that did this same native-SDK migration). Architecture is solid — clean Protocol, PhaseBinding wiring, backward-compat preserved, credentials baseline sound (0600, masking, traversal-blocked, the Go side redacts key= from probe URLs). Good call using native SDKs rather than a gateway lib. Static analysis (semgrep + CodeQL, security + quality, both languages) is clean, so the items below are behavioral asymmetries between the three adapters, not anything a scanner flags.

In the new adapter framework — worth fixing:

HIGH — Anthropic has no empty-content guard. A well-formed refusal/empty completion yields content=(), stop_reason=end_turn, simple_text() -> ''. openai.py:365 (empty choices) and google.py:346 (empty candidates) both raise LLMResponseError; anthropic.py:305-356 doesn't. Because the verify phase then defaults to agreeing with Stage 1 on empty text, a refusal can silently become a clean security verdict (false negative). Suggest mirroring the OpenAI/Google guard in AnthropicAdapter._response_to_unified.
MED — refusal / content-filter is normalized to end_turn (warn-only) in all three adapters. For a SAST tool a masked refusal is a masked false-negative. A typed LLMRefusalError raised on refusal finish reasons keeps it from reading as a normal short answer (distinct from the empty-guard above — this fires on a populated refusal).
MED — Google max_retries is silently dropped (google.py:158 _ = max_retries). The docstring says the SDK doesn't expose retries, but HttpOptions.retry_options=HttpRetryOptions(attempts=...) does (and you already build HttpOptions at :154). Anthropic/OpenAI forward it; Google should too — and the docstring is misleading.
MODERATE — the setup wizard echoes the pasted API key. setup.go:412 -> readLine (bufio.ReadString) prints input as typed; there's no term.ReadPassword in cmd/ (charmbracelet/x/term is already an indirect dep). set-api-key <key> also takes the key as an argv (shell history / ps). A no-echo read closes both.
LOW — Anthropic reads response.usage unguarded (anthropic.py:329, outside the try) -> a usage-less response raises an unmapped AttributeError; OpenAI :422 / Google :360 use getattr(..., None).
LOW — error mapping passes raw exception strings through. All three adapters do raise LLMError(str(exc)) with no redaction layer; provider 400/401 bodies can echo request context. Worth scrubbing secrets before these propagate to logs/reports.

Several of these are the same shape — handled in two adapters, missed in the third (Anthropic is the odd one out on the empty-guard and on usage; Google on retries) — so they're cheap to fix together.

Two pre-existing issues this PR's completion path now feeds (flagging because they're high-impact, not asking you to redesign here): the verify phase defaults to "agree with Stage 1" on every degenerate path — incomplete/empty completion (finding_verifier.py:380, where the empty-completion case lands), no tool calls (:448), max-iterations reached (:464), and even a finish call that omits the agree field (:925, get("agree", True)). For a security verifier that biases systematically toward false negatives, so the empty-completion case above is one of four inputs that silently rubber-stamp Stage 1. Separately, untrusted analyzed source is interpolated into the verifier prompt inside a bare triple-backtick fence (verification_prompts.py:106-125) with no escaping, so a crafted source file can break out of the fence and steer the verdict. Both are worth a fix independent of this PR.

Happy to share repros for any of these.

… handling - Anthropic: empty-content guard, usage getattr, refusal stop_reason -> LLMRefusalError - OpenAI: content_filter finish_reason -> LLMRefusalError - Google: SAFETY/RECITATION/... -> LLMRefusalError; forward max_retries via HttpRetryOptions(attempts=max_retries+1) for parity with the other adapters - New LLMRefusalError(LLMResponseError) in the taxonomy - New _redact.py: scrub provider keys from adapter error strings and the chained cause (RedactedCause) so secrets don't leak via tracebacks/logs Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…erdict - finding_verifier: degenerate verify paths no longer auto-agree with Stage 1; mark VerificationResult.incomplete and set result["error"] on adapter raise - reporter: incomplete -> stage2_verdict "unverified" (not "rejected"); "unverified" is disclosure-eligible so a couldn't-verify finding still surfaces - verifier/scanner/schemas: needs_review counter; degenerate findings no longer folded into the "safe" count - report/{generator,__main__}: align disclosure gate with "unverified" Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ompts - New prompts/_fence.py safe_code_fence(): opens a fence longer than any backtick run in the untrusted source (CommonMark), so crafted source can't escape the fence and steer the analyst/verifier verdict; None-tolerant - verification_prompts (Stage 2) and vulnerability_analysis (Stage 1) both use it Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…pi-key - promptSecret() reads the key without echo via charmbracelet/x/term (promoted to a direct dep), falling back to the reader when stdin isn't a TTY - set-api-key [key] arg now optional -> no-echo prompt; help warns that passing the key as an argument leaks it via shell history / ps Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The redact_secrets() tests must feed key-shaped inputs, which the default gitleaks generic-api-key rule flags even though no real secret exists. Add a .gitleaks.toml that keeps the full default ruleset (useDefault) and narrowly allowlists ONLY secrets that both (a) live in the two redaction-test files and (b) carry an unmistakable fake marker (1234567890 / LEAKED / EXAMPLE / FAKE). Verified the AND condition is enforced: the same marker elsewhere is still reported. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ar7casper and others added 2 commits May 24, 2026 15:08

ar7casper requested review from dgeyshis, shahar-davidson, sounil and yotamleo as code owners May 24, 2026 12:32

This was referenced May 28, 2026

fix(cli/config): enforce 0600 on the config file even when it pre-exists (CWE-732) #82

Open

fix(scanner): isolate optional-stage failures and disambiguate skip causes #88

Open

ar7casper and others added 8 commits June 8, 2026 13:09

ar7casper and others added 4 commits June 16, 2026 11:37

ar7casper requested a review from gadievron as a code owner June 16, 2026 08:39

ar7casper merged commit 49d9802 into master Jun 16, 2026
9 checks passed

ar7casper deleted the feat/issue65-llm-providers branch June 16, 2026 17:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat/issue65 llm providers#69

feat/issue65 llm providers#69
ar7casper merged 15 commits into
masterfrom
feat/issue65-llm-providers

ar7casper commented May 24, 2026

Uh oh!

gadievron commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ar7casper commented May 24, 2026

Uh oh!

gadievron commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants