feat(ai): ModelGateway v2 shadow routing + model registry (Phase 12, AI-075 slice 1) #366

Merged: mrviduus merged 2 commits into main from phase12-shadow-routing on Jun 18, 2026

@mrviduus

Owner

Phase 12 (RLOps) — slice 1: shadow routing

The gateway can now shadow a second model against the one serving production, with zero user impact.

When a feature has an Ai:Shadow route configured and the call is sampled, ModelGateway waits until the primary response is ready, then fires the same LlmRequest at the shadow provider's untraced -raw sibling (no llm_traces row, no recursion, no double cost-count) as a fire-and-forget background task. That task then persists one redacted primary-vs-shadow row in shadow_runs (both responses, latency, cost, tokens, trace ids).
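The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `ShadowingGateway`, the delegate-based provider shape, the `LlmResponse` record, and the `persist` callback are hypothetical stand-ins, and sampling, redaction, and cost/latency capture are left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical shapes; the real LlmRequest and provider types are not shown in this PR.
public sealed record LlmRequest(string Feature, string Prompt);
public sealed record LlmResponse(string Text);

public sealed class ShadowingGateway
{
    private readonly Func<LlmRequest, CancellationToken, Task<LlmResponse>> _primary;
    // Assumed to be the untraced "-raw" sibling of the shadow provider.
    private readonly Func<LlmRequest, CancellationToken, Task<LlmResponse>> _shadowRaw;
    // Assumed to write one primary-vs-shadow row (shadow_runs in the PR).
    private readonly Func<LlmResponse, LlmResponse, Task> _persist;
    private readonly TimeSpan _shadowTimeout;

    public ShadowingGateway(
        Func<LlmRequest, CancellationToken, Task<LlmResponse>> primary,
        Func<LlmRequest, CancellationToken, Task<LlmResponse>> shadowRaw,
        Func<LlmResponse, LlmResponse, Task> persist,
        TimeSpan shadowTimeout)
    {
        _primary = primary;
        _shadowRaw = shadowRaw;
        _persist = persist;
        _shadowTimeout = shadowTimeout;
    }

    public async Task<LlmResponse> CompleteAsync(LlmRequest request, CancellationToken ct)
    {
        // The primary call is awaited first and uses the caller's token.
        var primary = await _primary(request, ct);

        // Fire-and-forget: the task is discarded and never awaited,
        // so the caller does not wait on the shadow call.
        _ = Task.Run(() => RunShadowAsync(request, primary));

        return primary;
    }

    private async Task RunShadowAsync(LlmRequest request, LlmResponse primary)
    {
        try
        {
            // Own timeout; the caller's token is deliberately not linked.
            // The shadow delegate has to honor this token for the timeout to take effect.
            using var cts = new CancellationTokenSource(_shadowTimeout);
            var shadow = await _shadowRaw(request, cts.Token);
            await _persist(primary, shadow);
        }
        catch (Exception ex)
        {
            // Swallow and log; a shadow failure must never reach the caller.
            Console.Error.WriteLine($"shadow run failed: {ex.Message}");
        }
    }
}
```

Because the shadow runs only after the primary has returned, a slow or failing shadow provider can add background cost but not user-facing latency.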

Invariants (unit-tested)

  • Primary latency/correctness untouched (shadow is _ = Task.Run, never awaited).
  • Shadow never threads the caller's cancellation token — own timeout (default 15s).
  • Any shadow failure/timeout swallowed + logged, never surfaced.
  • StreamAsync re-yields primary deltas unchanged & ordered; shadows once only on clean stream completion (suppressed on mid-stream throw); empty stream still shadows once.
  • Shadow resolves the -raw provider → no double-trace / double-cost.
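The streaming invariant can be illustrated with a small wrapper. This is a sketch under assumptions: `StreamWithShadowAsync` and `onCleanCompletion` are hypothetical names, deltas are modeled as plain strings, and the callback stands in for whatever schedules the shadow run.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class ShadowStreaming
{
    // Re-yields the primary deltas unchanged and in order, then invokes
    // onCleanCompletion exactly once if the source finished without throwing.
    public static async IAsyncEnumerable<string> StreamWithShadowAsync(
        IAsyncEnumerable<string> primaryDeltas,
        Action onCleanCompletion,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        await foreach (var delta in primaryDeltas.WithCancellation(ct))
        {
            yield return delta;
        }

        // Reached only when the loop ended normally. A mid-stream exception
        // propagates out of the await foreach and skips this line; an empty
        // stream runs the loop zero times and still reaches it once.
        onCleanCompletion();
    }
}
```

In this sketch a consumer that stops enumerating early also skips the callback, since the iterator is disposed before the loop completes. Whether the PR treats early abandonment the same way is not stated.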

Model registry

New models table records which (provider, model) serves each feature by lifecycle Status (Primary/Shadow/Retired). Seeded idempotently at startup from current primary routes; unique natural-key index makes the seed race-safe across replicas; the seeder is guarded so a DB hiccup can't abort API boot. Audit/seed only this slice — the gateway still routes by config.
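To make the seeding behavior concrete, here is an in-memory sketch. Everything in it is an assumption for illustration: `ModelRow`, `InMemoryModelRegistry`, and `ModelRegistrySeeder` are hypothetical names, the natural key is guessed to be (feature, provider, model), and a `ConcurrentDictionary` stands in for the database table and its unique index. The real seeder presumably goes through EF Core.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public enum ModelStatus { Primary, Shadow, Retired }

public sealed record ModelRow(string Feature, string Provider, string Model, ModelStatus Status);

// In-memory stand-in for the models table. TryAdd plays the role of the
// unique natural-key index: a second insert of the same key is rejected.
public sealed class InMemoryModelRegistry
{
    private readonly ConcurrentDictionary<(string, string, string), ModelRow> _rows = new();

    public int Count => _rows.Count;

    public bool TryInsert(ModelRow row) =>
        _rows.TryAdd((row.Feature, row.Provider, row.Model), row);
}

public static class ModelRegistrySeeder
{
    // Returns the number of rows actually inserted. Running it again, or
    // running it concurrently from several replicas, inserts nothing new.
    public static int SeedPrimaries(
        InMemoryModelRegistry registry,
        IEnumerable<(string Feature, string Provider, string Model)> primaryRoutes)
    {
        try
        {
            var inserted = 0;
            foreach (var (feature, provider, model) in primaryRoutes)
            {
                var row = new ModelRow(feature, provider, model, ModelStatus.Primary);
                if (registry.TryInsert(row)) inserted++;
            }
            return inserted;
        }
        catch (Exception ex)
        {
            // Guard: a storage failure is logged and must not abort startup.
            Console.Error.WriteLine($"model registry seed skipped: {ex.Message}");
            return 0;
        }
    }
}
```

The unique index is what makes the seed safe across replicas: two instances racing at startup can both attempt the insert, and the loser's attempt is rejected rather than producing a duplicate row.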

Safety

Shadow is OFF by default (no Ai:Shadow:Routes, sample rate 0.0) — no paid background calls until explicitly enabled per feature.
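The off-by-default behavior amounts to a gate like the one below. This is a sketch, not the PR's code: `ShadowOptions`, `SampleRate`, and `ShadowSampler` are hypothetical names (the source only names the `Ai:Shadow:Routes` key and a sample rate of 0.0), and `roll` is assumed to be a uniform value in [0, 1), for example from `Random.Shared.NextDouble()`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical options type; assumed to bind to the Ai:Shadow configuration section.
public sealed class ShadowOptions
{
    // Feature name -> shadow provider name. Empty by default: no routes.
    public Dictionary<string, string> Routes { get; init; } = new();

    // Fraction of calls to shadow. 0.0 by default: never sample.
    public double SampleRate { get; init; } = 0.0;
}

public static class ShadowSampler
{
    // A call is shadowed only if the feature has a route AND the roll falls
    // under the sample rate. With SampleRate 0.0, roll < 0.0 is never true.
    public static bool ShouldShadow(ShadowOptions options, string feature, double roll) =>
        options.Routes.ContainsKey(feature) && roll < options.SampleRate;
}
```

Both conditions have to be opted into explicitly, so adding a route alone does not start paid background calls while the sample rate stays at 0.0.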

Out of scope (later slices)

Table-driven hot-swap · canary · escalate-on-confidence · cost-cap · drift detection · admin UI (Models/Shadow/Drift tabs).

Verification

691 unit tests green (incl. ct-isolation + empty-stream edge cases added in adversarial QA); Api build clean; ef migrations script --idempotent emits models + shadow_runs. Architect → backend → adversarial QA (verdict SHIP) → P1 seeder-guard + P2 unique-index applied.

🤖 Generated with Claude Code

mrviduus and others added 2 commits June 18, 2026 09:36
…ase 12)

First RLOps slice. The gateway can shadow a second model against the one
serving production, with zero user impact.

- ModelGateway v2: when a feature has an Ai:Shadow route + the call is
  sampled, AFTER the primary response is ready, fire the same request at the
  shadow provider's untraced '-raw' sibling (no llm_traces row, no recursion,
  no double cost) fire-and-forget, then persist a redacted primary-vs-shadow
  row in shadow_runs.
- Invariants (unit-tested): primary latency/correctness untouched; shadow
  never threads the caller's ct (own 15s timeout); failure/timeout swallowed;
  StreamAsync re-yields primary deltas unchanged, shadows once only on clean
  completion (suppressed on mid-stream throw).
- models registry table: which (provider, model) serves each feature by
  Status (Primary/Shadow/Retired). Seeded idempotently at startup from current
  primary routes; unique natural-key index => seed race-safe across replicas;
  seeder guarded so a DB hiccup can't abort boot. Audit/seed only this slice —
  gateway still routes by config.
- Shadow OFF by default (no routes, 0.0 sample) — no paid background calls
  until enabled per feature.

Out of scope (later slices): table-driven hot-swap, canary, escalate,
cost-cap, drift detection, admin UI.

691 unit tests green; Api build clean; migration script idempotent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s, Models)

AiEvals test double wasn't built locally; CI backend job caught the two
missing interface members.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mrviduus mrviduus merged commit 04a458e into main Jun 18, 2026
5 checks passed
@mrviduus mrviduus deleted the phase12-shadow-routing branch June 18, 2026 13:57