feat(ai): hybrid (semantic) catalog search — semantic=true (AI-057) #361
Open
mrviduus wants to merge 2 commits into
Phase 9. GET /search?q=&semantic=true blends keyword (FTS) + vector (query-embedding vs editions.embedding cosine) edition rankings via RRF, same PaginatedResult<SearchResultDto> shape (frontend-transparent). A semantically-related-but-keyword-absent book surfaces, which is the payoff.

- Orchestrator HybridCatalogSearch (Application, not the FTS provider, which has no AI deps): wide FTS pool (offset 0) + IEmbeddingService.EmbedAsync + editions-cosine rank (AI-055 visibility: site, status=1, embedding NOT NULL, lang, EXISTS(chapters); cosine <=> via HNSW, param vector) + RrfFusion.Fuse on edition_id + paginate the FUSED order. Vector-only hits get title/author/cover + first-chapter fallback + empty highlights.
- semantic absent/false → today's pure-FTS path byte-for-byte unchanged, no embed, no cost; search-semantic rate limit (20/min) applies ONLY when semantic=true (pure-FTS unthrottled).
- Graceful FTS fallback (QA P2): embed/vector-rank failure → log + verbatim searchProvider.SearchAsync (semantic search never hard-fails catalog search); OperationCanceledException propagates.

654 unit (fusion granularity, toggle-off passthrough, embed-guard, empty-vector degradation, embed-failure fallback, cancellation) + integration (pgvector, gated): keyword-absent semantic hit surfaces, draft/hidden/other-site/other-lang never appear, pure-FTS control no drift. StudyBuddy set-equality green; docker-compose clean. Frontend toggle UI = later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
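The fusion step described in this commit (RRF over two edition rankings, then pagination of the fused order) can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the PR's C# code: the function names `rrf_fuse` and `paginate`, the sample edition ids, and the constant `k = 60` are assumptions made for the example, since the commit does not state which constant RrfFusion.Fuse uses.

```python
from collections import defaultdict

# Assumed smoothing constant; 60 is a common default for Reciprocal Rank Fusion,
# but the PR does not say which value RrfFusion.Fuse uses.
RRF_K = 60


def rrf_fuse(rankings, k=RRF_K):
    """Fuse several best-first lists of edition ids into one best-first list."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, edition_id in enumerate(ranking, start=1):
            scores[edition_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda eid: (-scores[eid], eid))


def paginate(fused, offset, limit):
    """Slice the fused order, so paging follows the fused rank, not the FTS rank."""
    return fused[offset:offset + limit]


# Hypothetical edition ids: 104 and 105 are vector-only hits (no keyword match).
fts_ranking = [101, 102, 103]
vector_ranking = [104, 101, 105]

fused = rrf_fuse([fts_ranking, vector_ranking])
print(fused)                  # [101, 104, 102, 103, 105]
print(paginate(fused, 0, 2))  # [101, 104]
```

Edition 101 appears in both lists and ranks first, while the vector-only edition 104 still reaches the first page. Because pagination slices the fused list, both rankings have to be computed from a wide pool starting at offset 0 before any page is cut, which matches the "wide FTS pool (offset 0)" note above.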
# Conflicts:
#	CHANGELOG.md
AI-057 — hybrid (semantic) catalog search (Phase 9)
GET /search?q=&semantic=true blends keyword (FTS) + vector (query-embedding vs editions.embedding cosine) edition rankings via RRF, so a semantically-related-but-keyword-absent book surfaces. Same PaginatedResult<SearchResultDto> shape (frontend-transparent).

- HybridCatalogSearch (Application): wide FTS pool + EmbedAsync + editions-cosine rank (AI-055 visibility, param vector → HNSW) + RrfFusion.Fuse on edition_id + paginate the fused order; vector-only hits get card fields + empty highlights.
- semantic absent/false → pure-FTS path byte-for-byte unchanged (no embed/cost); search-semantic rate limit (20/min) only when semantic=true.

654 unit + integration (pgvector, gated). StudyBuddy set-equality green; docker-compose clean.
status=1 = Published. Frontend toggle = later.

🤖 Generated with Claude Code
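The toggle and fallback behavior described in this PR (pure FTS when semantic is absent or false, fall back to FTS when the semantic path fails, let cancellation propagate) can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the PR's C# implementation: `search`, `fts_search`, `hybrid_search`, and `OperationCancelled` are hypothetical names, with `OperationCancelled` standing in for .NET's OperationCanceledException.

```python
import logging

logger = logging.getLogger("hybrid_search_sketch")


class OperationCancelled(Exception):
    """Hypothetical stand-in for .NET's OperationCanceledException."""


def search(query, semantic, fts_search, hybrid_search):
    """Route a catalog search; the two search callables are supplied by the caller."""
    if not semantic:
        # Toggle off: plain FTS only, so no embedding call and no extra cost.
        return fts_search(query)
    try:
        return hybrid_search(query)
    except OperationCancelled:
        # A cancelled request must not be converted into a fallback search.
        raise
    except Exception:
        # Embedding or vector ranking failed: log it and serve the FTS result.
        logger.warning("semantic ranking failed; falling back to FTS", exc_info=True)
        return fts_search(query)


def fake_fts(query):
    return ["fts:" + query]


def failing_hybrid(query):
    raise RuntimeError("embedding service unavailable")


print(search("dune", True, fake_fts, failing_hybrid))  # ['fts:dune']
```

Catching the cancellation type before the generic handler is what keeps a client disconnect from triggering a second, unnecessary FTS query.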