8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,14 @@

## [Unreleased]

### Phase 12 — admin Shadow + Models tabs (AI-076, slice 2) (2026-06-18)

Makes the slice-1 shadow data visible. Two **read-only** tabs on the admin AI-quality page (`/ai-quality`):
- **Shadow** — rolls up `shadow_runs` into per-(feature, primary-model, shadow-model) pairs via a single PostgreSQL `GROUP BY` (no N+1; `percentile_cont` p50 latency, SQL-side **lexical agreement**: exact-match rate + avg length-ratio + both-present rate over rows where both responses exist). Shadow-minus-primary latency/cost/token deltas + a window-normalized **projected monthly cost delta** are computed in a pure, unit-tested `ToPairDto` helper. Row → modal with paged side-by-side primary-vs-shadow response samples (redacted at write-time; limit clamped 1..50). A caption states agreement is **lexical, not a quality verdict** — semantic judge scoring is a later slice. Empty state explains shadow is OFF by default + how to enable it.
- **Models** — flat view of the seeded `models` registry (feature, provider, model, status), status color-coded Primary/Shadow/Retired.
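
The lexical agreement figures described for the Shadow tab can be illustrated client-side. The real rollup runs in SQL; the sketch below only mirrors the three definitions in TypeScript. The `ShadowRow` shape, the `lexicalAgreement` name, the ratio direction (shadow length over primary length), and skipping empty primary responses for the ratio are all assumptions made for illustration, not the PR's implementation.

```typescript
// Hypothetical row shape: only the two (possibly missing) responses matter here.
interface ShadowRow {
  primaryResponse: string | null
  shadowResponse: string | null
}

interface LexicalAgreement {
  exactMatchRate: number // share of both-present rows whose texts are identical
  avgLengthRatio: number // mean of shadow length / primary length (assumed direction)
  bothPresentRate: number // share of all rows where both responses exist
}

function lexicalAgreement(rows: ShadowRow[]): LexicalAgreement {
  let both = 0
  let exact = 0
  let ratioSum = 0
  let ratioCount = 0
  for (const row of rows) {
    const p = row.primaryResponse
    const s = row.shadowResponse
    if (p === null || s === null) continue // agreement only covers both-present rows
    both++
    if (p === s) exact++
    if (p.length > 0) {
      // Assumption: rows with an empty primary response are left out of the ratio.
      ratioSum += s.length / p.length
      ratioCount++
    }
  }
  return {
    exactMatchRate: both > 0 ? exact / both : 0,
    avgLengthRatio: ratioCount > 0 ? ratioSum / ratioCount : 0,
    bothPresentRate: rows.length > 0 ? both / rows.length : 0,
  }
}
```

Two identical strings count as agreement and nothing else does, which is why the caption calls this lexical rather than a quality verdict.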

Backend: 3 endpoints under `/admin/ai-quality` (`GET /shadow/summary`, `/shadow/samples`, `/models`), DTOs appended to the Ai-quality contract, admin-auth inherited. **Strictly read-only** — promote/rollback + table-driven routing is the next slice. Backend `AdminAiQualityEndpoints.cs` + `AiQualityDtos.cs`; admin `AiQualityPage.tsx` + `api/client.ts`. 700 unit tests green (9 new on the delta/projection math); admin tsc + build clean; integration tests (summary empty/auth/400, models-seeded) run in CI.
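
As a rough illustration of the delta/projection math mentioned above, here is a minimal TypeScript sketch. The backend helper is `ToPairDto` in C#; the `toPairDeltas` name, the `PairAggregate` shape, the reading of the cost fields as totals over the query window, and the 30-day month are assumptions, not the PR's code.

```typescript
// Hypothetical aggregate for one (feature, primary-model, shadow-model) pair.
interface PairAggregate {
  primaryP50LatencyMs: number
  shadowP50LatencyMs: number
  primaryCostUsd: number // assumed: total over the window
  shadowCostUsd: number // assumed: total over the window
  primaryTokensOut: number
  shadowTokensOut: number
}

interface PairDeltas {
  latencyDeltaMs: number
  costDeltaUsd: number
  tokensOutDelta: number
  projectedMonthlyCostDeltaUsd: number
}

const MS_PER_DAY = 24 * 60 * 60 * 1000
const DAYS_PER_MONTH = 30 // assumption: a flat 30-day month

function toPairDeltas(agg: PairAggregate, from: Date, to: Date): PairDeltas {
  const windowDays = (to.getTime() - from.getTime()) / MS_PER_DAY
  // All deltas are shadow minus primary, so a positive value means the shadow is slower or pricier.
  const costDeltaUsd = agg.shadowCostUsd - agg.primaryCostUsd
  return {
    latencyDeltaMs: agg.shadowP50LatencyMs - agg.primaryP50LatencyMs,
    costDeltaUsd,
    tokensOutDelta: agg.shadowTokensOut - agg.primaryTokensOut,
    // Window-normalized: scale the observed delta to a per-day rate, then to a month.
    // A zero or negative window yields 0 rather than a division by zero.
    projectedMonthlyCostDeltaUsd: windowDays > 0 ? (costDeltaUsd / windowDays) * DAYS_PER_MONTH : 0,
  }
}
```

Keeping the function pure (inputs in, DTO fields out, no I/O) is what makes this kind of math straightforward to unit-test.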

### Phase 12 — ModelGateway v2 shadow routing + model registry (AI-075, slice 1) (2026-06-18)

First RLOps slice: the gateway can now **shadow** a second model against the one serving production, with zero impact on the user. When a feature has an `Ai:Shadow` route configured and the call is sampled, `ModelGateway` fires the same `LlmRequest` at the shadow provider's **untraced `-raw` sibling** (so no `llm_traces` row, no recursion, no double cost-count), but only **after** the primary response is ready. The shadow call runs as a fire-and-forget background task, which then persists one redacted primary-vs-shadow comparison row in **`shadow_runs`** (both responses, latency, cost, tokens, trace ids).

Invariants (unit-tested):
- primary latency/correctness untouched;
- shadow never threads the caller's cancellation token (own timeout, default 15s);
- any shadow failure/timeout is swallowed + logged;
- `StreamAsync` re-yields primary deltas unchanged and shadows once only on **clean** stream completion (suppressed on mid-stream throw).

A new **`models`** registry table records which (provider, model) serves each feature by lifecycle `Status` (Primary/Shadow/Retired). It is seeded idempotently at startup from the current primary routes (unique natural-key index makes the seed race-safe across replicas; the seeder is guarded so a DB hiccup can't abort API boot). The `models` table is **audit/seed only** in this slice: the gateway still routes by config; table-driven hot-swap + canary/escalate/cost-cap/drift/admin-UI are later slices. **Shadow is OFF by default** (no routes, sample rate 0.0), so there are no paid background calls until explicitly enabled per feature.

New: `ModelGateway` v2, `ShadowOptions`, `IShadowRunWriter`/`DbShadowRunWriter`, `ModelRegistration`/`ShadowRun` entities, `ModelRegistrySeeder`, migration `AddModelRegistryAndShadowRun`. 691 unit tests green.
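
The shadow-after-primary pattern described in this entry can be sketched in a few lines. The real gateway is C#; this TypeScript version is only an illustration, and `callWithShadow`, `runShadow`, the `Provider` type, and the `record` callback are hypothetical names. It also assumes the shadow provider honors the `AbortSignal` it is handed; otherwise the timeout has no effect.

```typescript
// Hypothetical provider signature: a prompt in, a completion out.
type Provider = (prompt: string, signal?: AbortSignal) => Promise<string>

interface ShadowComparison {
  primary: string
  shadow: string
}

async function runShadow(
  prompt: string,
  primaryResponse: string,
  shadow: Provider,
  record: (row: ShadowComparison) => Promise<void>,
  timeoutMs: number,
): Promise<void> {
  // The shadow call gets its own timeout and never sees the caller's cancellation.
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    const shadowResponse = await shadow(prompt, controller.signal)
    await record({ primary: primaryResponse, shadow: shadowResponse })
  } catch (err) {
    // Any shadow failure or timeout is swallowed and logged, never rethrown.
    console.warn('shadow run failed (swallowed):', err)
  } finally {
    clearTimeout(timer)
  }
}

async function callWithShadow(
  prompt: string,
  primary: Provider,
  shadow: Provider | null,
  sampleRate: number,
  record: (row: ShadowComparison) => Promise<void>,
  timeoutMs = 15000,
  random: () => number = Math.random,
): Promise<string> {
  // The primary response is fully resolved before any shadow work starts.
  const primaryResponse = await primary(prompt)
  if (shadow !== null && random() < sampleRate) {
    // Fire-and-forget: deliberately not awaited, so the caller is never delayed.
    void runShadow(prompt, primaryResponse, shadow, record, timeoutMs)
  }
  return primaryResponse
}
```

With `sampleRate` at 0.0 the condition `random() < sampleRate` is never true, which matches the off-by-default behavior: no shadow call is made until a rate is configured.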
84 changes: 84 additions & 0 deletions apps/admin/src/api/client.ts
@@ -556,6 +556,66 @@ export interface CrewAbEvalResult {
  passed: boolean
  cases?: unknown[]
}
// Shadow comparison
export interface ShadowPair {
  featureTag: string
  primaryModelId: string
  shadowModelId: string
  runs: number
  primaryP50LatencyMs: number
  shadowP50LatencyMs: number
  latencyDeltaMs: number
  primaryCostUsd: number
  shadowCostUsd: number
  costDeltaUsd: number
  projectedMonthlyCostDeltaUsd: number
  primaryTokensOut: number
  shadowTokensOut: number
  tokensOutDelta: number
  exactMatchRate: number
  avgLengthRatio: number
  bothPresentRate: number
  firstSeen: string
  lastSeen: string
}
export interface ShadowSummary {
  from: string
  to: string
  totalRuns: number
  pairs: ShadowPair[]
}
export interface ShadowSample {
  id: string
  primaryResponse: string | null
  shadowResponse: string | null
  primaryLatencyMs: number
  shadowLatencyMs: number
  primaryCostUsd: number
  shadowCostUsd: number
  primaryTokensOut: number
  shadowTokensOut: number
  exactMatch: boolean
  promptHash: string
  primaryTraceId: string | null
  shadowTraceId: string | null
  createdAt: string
}
export interface ShadowSamplesPage {
  total: number
  items: ShadowSample[]
}
// Model registry
export interface ModelRegistration {
  id: string
  featureTag: string
  providerKey: string
  modelId: string
  status: 'Primary' | 'Shadow' | 'Retired'
  createdAt: string
}
export interface ModelsRegistry {
  models: ModelRegistration[]
}

async function fetchJson<T>(path: string, init?: RequestInit): Promise<T> {
  const res = await fetch(`${API_BASE}${path}`, {
@@ -1198,6 +1258,30 @@ export const adminApi = {
    })
  },

  getShadowSummary: async (params: { from?: string; to?: string; feature?: string }): Promise<ShadowSummary> => {
    const query = new URLSearchParams()
    if (params.from) query.set('from', params.from)
    if (params.to) query.set('to', params.to)
    if (params.feature) query.set('feature', params.feature)
    const qs = query.toString()
    return fetchJson<ShadowSummary>(`/admin/ai-quality/shadow/summary${qs ? `?${qs}` : ''}`)
  },

  getShadowSamples: async (params: { feature: string; primaryModelId: string; shadowModelId: string; limit?: number; offset?: number }): Promise<ShadowSamplesPage> => {
    const query = new URLSearchParams()
    query.set('feature', params.feature)
    query.set('primaryModelId', params.primaryModelId)
    query.set('shadowModelId', params.shadowModelId)
    if (params.limit) query.set('limit', String(params.limit))
    if (params.offset) query.set('offset', String(params.offset))
    const qs = query.toString()
    return fetchJson<ShadowSamplesPage>(`/admin/ai-quality/shadow/samples${qs ? `?${qs}` : ''}`)
  },

  getModels: async (): Promise<ModelsRegistry> => {
    return fetchJson<ModelsRegistry>('/admin/ai-quality/models')
  },

  // Podcasts
  generatePodcast: async (editionId: string, lang?: string, force?: boolean): Promise<PodcastStatusDto> => {
    return fetchJson<PodcastStatusDto>('/admin/podcasts', {
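
The paged samples endpoint added in `client.ts` could be consumed with a small paging loop like the one below. This is a sketch, not part of the PR: `fetchAllSamples`, `clampLimit`, and the `FetchPage` type are hypothetical, and the loop takes the page fetcher as a parameter so it stays self-contained. In the admin app the fetcher would presumably be `adminApi.getShadowSamples`. The 1..50 clamp mirrors the server-side limit stated in the changelog.

```typescript
// Hypothetical structural types matching the shape of getShadowSamples in the diff.
interface PairKey {
  feature: string
  primaryModelId: string
  shadowModelId: string
}
interface SamplesPage<T> {
  total: number
  items: T[]
}
type FetchPage<T> = (params: PairKey & { limit: number; offset: number }) => Promise<SamplesPage<T>>

// Mirror the documented server-side clamp so the client never asks for more than it can get.
function clampLimit(n: number): number {
  return Math.min(50, Math.max(1, Math.trunc(n)))
}

async function fetchAllSamples<T>(
  fetchPage: FetchPage<T>,
  pair: PairKey,
  pageSize = 50,
  maxItems = 200,
): Promise<T[]> {
  const limit = clampLimit(pageSize)
  const out: T[] = []
  let offset = 0
  while (out.length < maxItems) {
    const page = await fetchPage({ ...pair, limit, offset })
    if (page.items.length === 0) break // nothing more to read
    out.push(...page.items)
    offset += page.items.length
    if (offset >= page.total) break // reached the reported total
  }
  return out.slice(0, maxItems)
}
```

The `maxItems` cap is a design choice for the sketch: a side-by-side modal rarely needs every sample, and the cap bounds the number of requests when a pair has many runs.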