12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,18 @@

## [Unreleased]

### Phase 9 — hybrid (semantic) catalog search (AI-057) (2026-06-17)

Third Phase 9 slice: `GET /search?q=&semantic=true` blends the existing keyword (FTS) edition ranking with a **vector** ranking (query embedding vs the AI-054 `editions.embedding`) via **RRF**, returning the SAME `PaginatedResult<SearchResultDto>` shape (frontend-transparent). **Default OFF**: `semantic` absent/false → today's pure-FTS path, **byte-for-byte unchanged, zero new cost/latency**. Backend only (toggle UI is out of scope). Eval (precision@k) is a later step.

- **Orchestrator** (`backend/src/Application/Search/HybridCatalogSearch.cs`) — lives in **Application**, NOT the FTS provider (which has no AI deps). `SearchAsync(request, language, ct)`: (a) pulls a WIDER FTS candidate pool from the provider (offset 0, `limit ≈ clamp(offset+limit, 30, 200)`, highlights ON) keyed by `edition_id`; (b) `IEmbeddingService.EmbedAsync(q)` for the query vector (one OpenAI embedding call per semantic search); (c) runs the editions-cosine SQL for the vector edition-id pool; (d) `RrfFusion.Fuse([ftsIds, vectorIds])` at **edition granularity** (the shared fusion key — both retrievers already collapse to one row per edition); (e) **paginates the FUSED order** with the request's offset/limit (fixing the FTS-internal-pagination-vs-fusion skew); (f) materializes page DTOs — reuses the FTS hit's `SearchResultDto` (with its best-chapter snippet) where present, and for **vector-only editions** (no keyword match) fetches title/author/cover + a first-chapter (`chapter_number`-ordered) `ChapterId/Slug/Title` fallback with **empty `Highlights`** (no snippet exists). Application already references `Ai.Core`/`Ai.Rag`, so `RrfFusion`, `IEmbeddingService`, and `RagService.FormatVector` are **reused directly** (no new project dep, no copied RRF).
- **Vector SQL** (mirrors AI-055 visibility exactly): `SELECT e.id FROM editions e WHERE e.site_id = @siteId AND e.status = 1 AND e.embedding IS NOT NULL AND (@lang IS NULL OR e.language = @lang) AND EXISTS (SELECT 1 FROM chapters c WHERE c.edition_id = e.id) ORDER BY e.embedding <=> CAST(@qvec AS vector) LIMIT @pool;` — `@qvec` = `FormatVector(queryVector)`, parameterized so the HNSW `vector_cosine_ops` index serves the ORDER BY; `<=>` cosine (the stored mean is un-normalized → cosine mandatory, NOT L2); 5s command timeout; `status = 1` = `EditionStatus.Published` ordinal.
- **Toggle** (`backend/src/Api/Endpoints/SearchEndpoints.cs`). Added `[FromQuery] bool? semantic`. The existing ≥2-char / non-empty / ≤200-char `q` guards run FIRST, so a short/empty query with `semantic=true` returns today's 400 **with no embed call**. `semantic == true` + valid `q` → `HybridCatalogSearch.SearchAsync`; otherwise → the verbatim `searchProvider.SearchAsync` path. `TotalCount` is **approximate** (distinct fused-candidate-pool size) — no extra exact-count scan in v1.
- **Rate limit.** New `search-semantic` policy (`backend/src/Api/Program.cs`, 20/min per IP, cloned from `explain`/`translate`) applied to `GET /search`. CRITICAL: the policy is a **NO-OP (`GetNoLimiter`) unless `?semantic` is truthy** — the pure-FTS path consumes no partition and stays completely unthrottled (zero new cost/latency).
- **Graceful FTS fallback (P2 fix)** (`backend/src/Application/Search/HybridCatalogSearch.cs`). The semantic step (the external `IEmbeddingService.EmbedAsync` call + the vector-rank SQL) is wrapped in a single `try/catch (Exception ex) when (ex is not OperationCanceledException)`: on ANY failure (OpenAI down/throttled/timeout, vector-query error) the orchestrator logs a warning (`ILogger<HybridCatalogSearch>`) and returns the **verbatim pure-FTS** result by re-issuing `searchProvider.SearchAsync(request, ct)` — so a semantic search that can't reach the embedder degrades to a keyword search (byte-identical shape: correct `TotalCount` + pagination) instead of hard-500ing the whole catalog. **Semantic search never takes down catalog search.** `OperationCanceledException` is explicitly NOT swallowed — genuine request cancellation propagates. (The empty-vector "no editions embedded yet" case was already handled by RRF; this guards only the embed/vector-query THROW path.)
- **No migration** — the `editions.embedding` column + HNSW index already exist from AI-054.
- **Tests.** Integration (`tests/TextStack.IntegrationTests/HybridCatalogSearchTests.cs`, real Postgres+pgvector, `TEST_DB_CONNECTION`-gated, self-contained seed + cleanup, mirrors the AI-055 harness; **mocks `IEmbeddingService`** to return a fixed query vector — no real OpenAI): seeds A (keyword-matches `q`, orthogonal embedding), B (keyword-ABSENT, embedding colinear with the fixed query vector), and draft/hidden/other-site/other-lang near-editions. `semantic=true` asserts **B surfaces** (the keyword-absent semantic payoff), A present, the invisible editions **never** appear, and B's hit carries **empty highlights + a first-chapter fallback**; a control asserts the pure-FTS path returns A unaffected (no drift). Unit (`tests/TextStack.UnitTests/HybridCatalogSearchTests.cs`): edition-id-granularity RRF (an edition in BOTH lists outranks a single-list edition; a vector-only edition still ranks), the ≥2-char guard predicate (no embed), and the **P2 fallback** — a fake `IEmbeddingService` that THROWS makes `SearchAsync` return the stub `ISearchProvider`'s FTS result (no exception propagates, DB never touched), while a fake embedder throwing `OperationCanceledException` PROPAGATES (cancellation not swallowed). `dotnet build` + UnitTests (654, StudyBuddy set-equality green) + the 2 integration tests (ran against a disposable `pgvector/pgvector:pg16` migrated via `dotnet ef database update`, then removed — AI-055's 5 integration tests re-run green as a regression check; docker-compose left untouched) + `dotnet format --verify-no-changes` all green; no new `ITool`.
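
The fuse-then-paginate steps, (d) and (e) in the Orchestrator entry, can be sketched roughly as follows. This is an illustration only: `RrfSketch`, its `Fuse` and `Page` signatures, and the constant `k = 60` are assumptions made for the example, not the repository's actual `RrfFusion` API, which this diff does not show.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for RrfFusion (the real signature and k are not shown in this diff).
public static class RrfSketch
{
    // Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
    // Assumes each input list holds distinct ids (one row per edition).
    public static List<T> Fuse<T>(IEnumerable<IReadOnlyList<T>> rankedLists, int k = 60)
        where T : notnull
    {
        var scores = new Dictionary<T, double>();
        var firstSeen = new List<T>(); // remembers insertion order so ties are deterministic

        foreach (var list in rankedLists)
        {
            for (var i = 0; i < list.Count; i++)
            {
                // current is 0.0 when the id has not been scored yet.
                if (!scores.TryGetValue(list[i], out var current))
                    firstSeen.Add(list[i]);
                scores[list[i]] = current + 1.0 / (k + i + 1);
            }
        }

        // OrderByDescending is a stable sort, so equal scores keep first-seen order.
        return firstSeen.OrderByDescending(id => scores[id]).ToList();
    }

    // Pagination applies to the fused order, not to either retriever's own ranking.
    public static List<T> Page<T>(IReadOnlyList<T> fused, int offset, int limit) =>
        fused.Skip(offset).Take(limit).ToList();
}
```

With an FTS list `[1, 2]` and a vector list `[2, 3]`, edition 2 appears in both lists and ranks first. Editions 1 and 3 follow, with 1 ahead because it holds rank 1 in its list (1/61 versus 1/62).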

### Phase 9 — "Similar books" rail on BookDetailPage (AI-056) (2026-06-17)

The first user-visible Phase 9 surface. A `SimilarBooksRail` on the web `BookDetailPage` renders books most similar to the one being viewed, via the AI-055 endpoint `GET /books/{slug}/similar?limit=8` (cosine over `editions.embedding`). `getSimilarBooks(slug, limit)` added to the api client (mirrors the language-prefixed `/books/{slug}/...` pattern), wired through `useApi()`. The rail reuses the existing "more by author" book-card markup/CSS (cover + `stringToColor` first-letter fallback, `LocalizedLink` to `/books/{slug}`) — no new design. **Renders nothing (returns null) on an empty list OR a fetch error** — a book with no embedding (or no neighbors) simply shows no rail, never an error/skeleton; client-side fetch, SSG-safe. 3 Vitest cases (renders cards, hides on empty, hides on error); web suite 520 green; tsc + build clean. Note: existing prod editions have NULL embedding until the owner runs the AI-054 `backfill-edition-embeddings` CLI — the rail hides gracefully until then.
15 changes: 12 additions & 3 deletions backend/src/Api/Endpoints/SearchEndpoints.cs
@@ -10,6 +10,7 @@

 using Api.Language;
 using Api.Sites;
+using Application.Search;
 using Contracts.Common;
 using Microsoft.AspNetCore.Mvc;
 using TextStack.Search.Abstractions;
@@ -29,8 +30,9 @@ public static void MapSearchEndpoints(this WebApplication app)
         // Group endpoints under /search prefix with OpenAPI tag
         var group = app.MapGroup("/search").WithTags("Search");

-        // Two endpoints: full-text search and autocomplete suggestions
-        group.MapGet("", Search).WithName("Search");
+        // Two endpoints: full-text search and autocomplete suggestions.
+        // search-semantic limiter is a NO-OP unless ?semantic=true (AI-057) — pure-FTS stays unthrottled.
+        group.MapGet("", Search).WithName("Search").RequireRateLimiting("search-semantic");
         group.MapGet("/suggest", Suggest).WithName("SearchSuggest");
     }

@@ -41,10 +43,12 @@ public static void MapSearchEndpoints(this WebApplication app)
     private static async Task<IResult> Search(
         HttpContext httpContext,
         ISearchProvider searchProvider, // Injected via DI
+        HybridCatalogSearch hybridSearch, // AI-057: resolved always, invoked only when semantic=true
         [FromQuery] string q, // Search query
         [FromQuery] int? limit, // Page size (default 20, max 100)
         [FromQuery] int? offset, // Skip N results
         [FromQuery] bool? highlight, // Include text snippets?
+        [FromQuery] bool? semantic, // AI-057: blend FTS + vector via RRF? (default OFF)
         CancellationToken ct)
     {
         // ─── Input Validation ───────────────────────────────────
@@ -77,7 +81,12 @@ private static async Task<IResult> Search(
             highlight ?? false);

         // ─── Execute Search ─────────────────────────────────────
-        var result = await searchProvider.SearchAsync(request, ct);
+        // AI-057: semantic=true blends FTS + editions.embedding cosine via RRF (same DTO shape).
+        // The ≥2-char/non-empty guard above already ran, so the embed call is never wasted on a
+        // short query. semantic absent/false → today's pure-FTS path, byte-for-byte unchanged.
+        var result = semantic == true
+            ? await hybridSearch.SearchAsync(request, language, ct)
+            : await searchProvider.SearchAsync(request, ct);

         // ─── Map to Response ────────────────────────────────────
         // Transform internal SearchHit to API DTO
29 changes: 29 additions & 0 deletions backend/src/Api/Program.cs
@@ -151,6 +151,15 @@
 builder.Services.AddScoped(_ =>
     new Application.Recommendations.SimilarBooksService(() => new NpgsqlConnection(connectionString)));

+// Hybrid catalog search (AI-057): blends the FTS edition ranking with cosine NN over
+// editions.embedding via RRF. Only invoked on `semantic=true`; the pure-FTS path never touches it.
+builder.Services.AddScoped(sp =>
+    new Application.Search.HybridCatalogSearch(
+        sp.GetRequiredService<TextStack.Search.Abstractions.ISearchProvider>(),
+        sp.GetRequiredService<global::TextStack.Ai.Core.IEmbeddingService>(),
+        () => new NpgsqlConnection(connectionString),
+        sp.GetRequiredService<ILogger<Application.Search.HybridCatalogSearch>>()));
+
 // Reindex service (used by CLI)
 builder.Services.AddScoped<SearchReindexService>();

@@ -352,6 +361,26 @@
             QueueLimit = 0,
         });
     });
+    // Hybrid catalog search (AI-057): semantic=true embeds the query (one paid OpenAI embedding
+    // call per request) before the $0 pgvector scan, so it gets its own per-IP throttle. CRITICAL:
+    // this policy is a NO-OP unless `semantic` is truthy — the pure-FTS path (semantic absent/false)
+    // consumes no partition and stays completely unthrottled (zero new cost/latency).
+    options.AddPolicy("search-semantic", httpContext =>
+    {
+        var semantic = httpContext.Request.Query["semantic"].ToString();
+        var isSemantic = semantic.Equals("true", StringComparison.OrdinalIgnoreCase)
+            || semantic == "1";
+        if (!isSemantic)
+            return RateLimitPartition.GetNoLimiter("search-fts");
+
+        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
+        return RateLimitPartition.GetFixedWindowLimiter("semantic:" + ip, _ => new FixedWindowRateLimiterOptions
+        {
+            Window = TimeSpan.FromMinutes(1),
+            PermitLimit = 20,
+            QueueLimit = 0,
+        });
+    });
     // "Ask this book" (RAG) — one LLM call per request, per-user reading. 30/min per IP is
     // generous for genuine use and caps scripted abuse.
     options.AddPolicy("rag.ask", httpContext =>