Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,16 @@

## [Unreleased]

### Phase 12 — table-driven routing + one-click promote/rollback (AI-077, slice 3) (2026-06-18)

The `models` registry is now a **routing input**, not just an audit log — and an admin can swap which model serves a feature in one click, no redeploy.
- **Table-driven primary routing**: `ModelGateway.Route` resolves a feature's primary **provider key** from the registry (`status=Primary`) before the config route. New `IModelRouteProvider` (Core; impl `RegistryModelRouteProvider` in Application) serves a cached immutable `feature→provider_key` snapshot (TTL `Ai:Routes:CacheSeconds`, default 30s; built in a fresh scope under a double-checked lock; `Invalidate()` drops it). **The gateway can never fail a live call because of the registry** — null/throwing provider, empty registry, or a row whose ProviderKey has no keyed `ILlmService` all fall back: registry → config `Ai:Routes:{tag}` → `Ai:DefaultProvider` → `openai` (nullable keyed resolve + log, never throw). The snapshot build is duplicate-key resilient (oldest wins) even if the one-Primary invariant were ever violated.
- **Promote / rollback** (first mutating Phase-12 endpoints): `POST /admin/ai-quality/models/{id}/promote` (Shadow→Primary, demotes the incumbent to Shadow so it keeps shadowing) and `POST /admin/ai-quality/models/{feature}/rollback` (replays the inverse of the latest promotion). Each is transactional with an append-only **`model_promotions`** audit trail (who/when via `GetAdminUserId()`), invalidates the route cache **only after commit**, and is **strictly Shadow→Primary** (Retired rejected → 400). A DB **partial unique index** `(feature_tag) WHERE status='Primary'` enforces exactly one Primary per feature — concurrent promotes surface as **409**, never two primaries.
- **Candidates**: a shadow run now auto-upserts a promotable `Shadow` `ModelRegistration` for its (feature, provider, model) (guarded on the fire-and-forget path), so enabling a shadow route makes a promote target appear. After a promote, `MaybeShadow` **skips self-shadowing** (resolved shadow key == primary key).
- **Admin UI**: the Models tab is now actionable — Promote on Shadow rows, Rollback on Primary rows (shown only when a Shadow candidate exists), each behind a confirm dialog (these change prod routing live); server error text (409 race / 400 reasons) surfaced without closing the dialog; refetch reflects the new split immediately.

Shadow remains config-driven (`Ai:Shadow:Routes`) this slice — registry Shadow rows are a candidate/audit list, not a shadow-routing input (registry-driven shadow + cost-cap = later slices). Migration `AddModelPromotionAndPrimaryUniqueIndex`. Architect → backend + frontend (parallel) → adversarial QA (verdict SHIP; 2 P1 fixes: snapshot duplicate-resilience + concurrent-promote 409 coverage). 724 unit tests green; admin tsc + build clean; solution clean.

### Phase 12 — admin Shadow + Models tabs (AI-076, slice 2) (2026-06-18)

Makes the slice-1 shadow data visible. Two **read-only** tabs on the admin AI-quality page (`/ai-quality`):
Expand Down
20 changes: 20 additions & 0 deletions apps/admin/src/api/client.ts
Original file line number Diff line number Diff line change
Expand Up @@ -616,6 +616,14 @@ export interface ModelRegistration {
export interface ModelsRegistry {
models: ModelRegistration[]
}
export interface ModelPromotionResult {
featureTag: string
newPrimary: ModelRegistration
demotedToShadow: ModelRegistration | null
action: 'Promote' | 'Rollback'
adminUserId: string | null
createdAt: string
}

async function fetchJson<T>(path: string, init?: RequestInit): Promise<T> {
const res = await fetch(`${API_BASE}${path}`, {
Expand Down Expand Up @@ -1282,6 +1290,18 @@ export const adminApi = {
return fetchJson<ModelsRegistry>('/admin/ai-quality/models')
},

promoteModel: async (id: string): Promise<ModelPromotionResult> => {
return fetchJson<ModelPromotionResult>(`/admin/ai-quality/models/${id}/promote`, {
method: 'POST',
})
},

rollbackModel: async (feature: string): Promise<ModelPromotionResult> => {
return fetchJson<ModelPromotionResult>(`/admin/ai-quality/models/${encodeURIComponent(feature)}/rollback`, {
method: 'POST',
})
},

// Podcasts
generatePodcast: async (editionId: string, lang?: string, force?: boolean): Promise<PodcastStatusDto> => {
return fetchJson<PodcastStatusDto>('/admin/podcasts', {
Expand Down
147 changes: 144 additions & 3 deletions apps/admin/src/pages/AiQualityPage.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import {
ShadowPair,
ShadowSample,
ModelRegistration,
ModelPromotionResult,
} from '../api/client'

type Tab = 'summary' | 'traces' | 'transcripts' | 'evals' | 'shadow' | 'models'
Expand Down Expand Up @@ -1126,29 +1127,70 @@ function modelStatusColor(status: string): string {
return '#6b7280' // Retired
}

// A pending confirm action: which row + what kind of mutation.
type ModelAction =
| { kind: 'promote'; row: ModelRegistration }
| { kind: 'rollback'; row: ModelRegistration }

function ModelsTab() {
const [models, setModels] = useState<ModelRegistration[]>([])
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
const [success, setSuccess] = useState<string | null>(null)
const [confirm, setConfirm] = useState<ModelAction | null>(null)
const [dialogError, setDialogError] = useState<string | null>(null)
const [pending, setPending] = useState(false)

useEffect(() => {
const load = () =>
adminApi
.getModels()
.then((d) => {
setModels(d.models)
setError(null)
})
.catch((e) => setError(e instanceof Error ? e.message : 'Failed to load'))
.finally(() => setLoading(false))

useEffect(() => {
load().finally(() => setLoading(false))
}, [])

// Features that currently have at least one Shadow row — rollback needs a prior promotion.
const featuresWithShadow = new Set(models.filter((m) => m.status === 'Shadow').map((m) => m.featureTag))

const runAction = async () => {
if (!confirm) return
setPending(true)
setDialogError(null)
try {
let res: ModelPromotionResult
if (confirm.kind === 'promote') {
res = await adminApi.promoteModel(confirm.row.id)
} else {
res = await adminApi.rollbackModel(confirm.row.featureTag)
}
await load()
setConfirm(null)
setSuccess(
`${res.action === 'Rollback' ? 'Rolled back' : 'Promoted'} ${res.featureTag} → primary ${res.newPrimary.modelId}` +
(res.demotedToShadow ? ` (${res.demotedToShadow.modelId} now shadow)` : ''),
)
} catch (e) {
// Surface the server message (409 race / 400 reasons) in the dialog; keep it open.
setDialogError(e instanceof Error ? e.message : 'Action failed')
} finally {
setPending(false)
}
}

return (
<>
<p className="dashboard-page__subtitle" style={{ margin: '0 0 12px' }}>
Registered models per feature and their routing status. Read-only.
Registered models per feature and their routing status. Promote a shadow to primary or roll a feature back to its
previous primary.
</p>

{error && <Banner text={error} />}
{success && <SuccessBanner text={success} onClose={() => setSuccess(null)} />}

{loading ? (
<p className="dashboard-page__subtitle">Loading…</p>
Expand All @@ -1163,6 +1205,7 @@ function ModelsTab() {
<th style={th}>Model</th>
<th style={th}>Status</th>
<th style={th}>Created</th>
<th style={th}>Actions</th>
</tr>
</thead>
<tbody>
Expand All @@ -1173,15 +1216,97 @@ function ModelsTab() {
<td style={td}>{m.modelId}</td>
<td style={{ ...td, fontWeight: 600, color: modelStatusColor(m.status) }}>{m.status}</td>
<td style={td}>{timeAgo(m.createdAt)}</td>
<td style={td}>
{m.status === 'Shadow' && (
<button
onClick={() => {
setDialogError(null)
setConfirm({ kind: 'promote', row: m })
}}
disabled={pending}
style={rangeBtn(false)}
>
Promote
</button>
)}
{m.status === 'Primary' && featuresWithShadow.has(m.featureTag) && (
<button
onClick={() => {
setDialogError(null)
setConfirm({ kind: 'rollback', row: m })
}}
disabled={pending}
style={rangeBtn(false)}
>
Rollback
</button>
)}
</td>
</tr>
))}
</tbody>
</table>
)}

{confirm && (
<ModelConfirmDialog
action={confirm}
pending={pending}
error={dialogError}
onConfirm={runAction}
onClose={() => {
if (!pending) setConfirm(null)
}}
/>
)}
</>
)
}

function ModelConfirmDialog({
action,
pending,
error,
onConfirm,
onClose,
}: {
action: ModelAction
pending: boolean
error: string | null
onConfirm: () => void
onClose: () => void
}) {
const { row } = action
const title = action.kind === 'promote' ? 'Promote to primary' : 'Roll back primary'
const message =
action.kind === 'promote'
? `Promote ${row.modelId} to primary for ${row.featureTag}? The current primary will keep running as shadow.`
: `Roll back ${row.featureTag} to its previous primary?`
return (
<div onClick={onClose} style={overlay}>
<div onClick={(e) => e.stopPropagation()} style={{ ...modal, maxWidth: 460 }}>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: 12 }}>
<h2 style={{ margin: 0, fontSize: 18 }}>{title}</h2>
<button onClick={onClose} disabled={pending} style={{ ...rangeBtn(false), border: 'none' }}>
</button>
</div>
<p style={{ fontSize: 14, color: '#374151', margin: '0 0 4px' }}>{message}</p>
<p style={{ fontSize: 12, color: '#9ca3af', margin: '0 0 8px' }}>This changes production routing immediately.</p>
{error && <Banner text={error} />}
<div style={{ display: 'flex', gap: 8, justifyContent: 'flex-end', marginTop: 12 }}>
<button onClick={onClose} disabled={pending} style={rangeBtn(false)}>
Cancel
</button>
<button onClick={onConfirm} disabled={pending} style={rangeBtn(true)}>
{pending ? 'Working…' : action.kind === 'promote' ? 'Promote' : 'Roll back'}
</button>
</div>
</div>
</div>
)
}

// ─────────────────────────── shared ───────────────────────────

function Pager({
Expand Down Expand Up @@ -1218,6 +1343,22 @@ function Banner({ text }: { text: string }) {
return <div style={{ background: '#fef2f2', color: '#b91c1c', padding: 12, borderRadius: 8, margin: '12px 0' }}>{text}</div>
}

function SuccessBanner({ text, onClose }: { text: string; onClose: () => void }) {
return (
<div
style={{
background: '#ecfdf5', color: '#065f46', padding: 12, borderRadius: 8, margin: '12px 0',
display: 'flex', justifyContent: 'space-between', alignItems: 'center', gap: 12,
}}
>
<span>{text}</span>
<button onClick={onClose} style={{ border: 'none', background: 'none', color: '#065f46', cursor: 'pointer', fontSize: 14 }}>
</button>
</div>
)
}

function formatBreakdown(json: string | null): string {
if (!json) return '—'
try {
Expand Down
20 changes: 20 additions & 0 deletions backend/src/Ai/TextStack.Ai.Core/IModelRouteProvider.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
namespace TextStack.Ai.Core;

/// <summary>
/// Resolves the PRIMARY provider key for a feature tag from the <c>models</c> registry
/// (the table-driven source of truth, set by admin promote/rollback). The
/// ModelGateway consults this BEFORE its config fallback, so a promotion takes effect
/// without a redeploy. Hot path: the implementation MUST cache and MUST NEVER throw —
/// any failure (no DB, empty registry, race) returns <c>null</c> so the gateway falls
/// straight through to its config route. Implemented in Application (EF kept out of Llm).
/// </summary>
public interface IModelRouteProvider
{
/// <summary>Primary provider key for the feature, or null when none is registered
/// or on ANY failure (never throws). Cached for a short TTL.</summary>
string? PrimaryProviderKey(string featureTag);

/// <summary>Drop the cached snapshot so the next read rebuilds from the registry
/// (called after a promote/rollback writes a new Primary).</summary>
void Invalidate();
}
5 changes: 4 additions & 1 deletion backend/src/Ai/TextStack.Ai.Core/IShadowRunWriter.cs
Original file line number Diff line number Diff line change
Expand Up @@ -32,4 +32,7 @@ public record ShadowRun(
Guid? ShadowTraceId,
string PromptHash,
Guid? UserId,
DateTimeOffset CreatedAt);
DateTimeOffset CreatedAt,
// Resolved shadow PROVIDER key (e.g. "openai-explain"), threaded from the gateway so
// the writer can upsert a candidate Shadow ModelRegistration (promotable later).
string ShadowProviderKey = "");
Loading
Loading