feat: replay-trace tokens + /v1/replay endpoint (Phase 3.1) by hallelx2 · Pull Request #19 · hallelx2/vectorless-engine

hallelx2 · 2026-05-27T01:44:11Z

Summary

Phase 3.1 of the engine plan: every retrieval response now carries a deterministic trace_token, and a new POST /v1/replay endpoint returns the byte-identical response when given that token plus the original query + document_id. This turns the whitepaper's "every answer is reproducible" claim into a working surface.

Design

Trace token shape

trace_token = sha256(doc_id | doc_version | model | system_prompt_version | sorted(selected_ids).joined("\0")), hex-encoded lowercase. 64 chars.

Lexicographic sort of IDs makes the token order-invariant: two strategies that pick the same set produce the same token regardless of reasoning path.
NUL separator inside the hash input avoids pathological-ID collision ("a,b" + "c" vs "a" + "b,c").
doc_version is a parameter so Phase 3.2's per-document versioning is a one-line change; today every call passes "1".
SystemPromptVersion ("v1") is a build-time constant. A test pins it so bumping it is a deliberate, replay-invalidating decision.
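The token shape above can be sketched in a few lines of Go. This is a minimal illustration, not the code in pkg/retrieval/trace.go: the function name `computeTraceToken`, its parameter order, and the choice to put a NUL byte between every component (not only between IDs) are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// computeTraceToken is a hypothetical stand-in for ComputeTraceToken.
// Assumption: all components are separated by a NUL byte in the hash input.
func computeTraceToken(docID, docVersion, model, promptVersion string, selectedIDs []string) string {
	// Sort a copy so the caller's slice is never mutated.
	ids := append([]string(nil), selectedIDs...)
	sort.Strings(ids)

	parts := []string{docID, docVersion, model, promptVersion, strings.Join(ids, "\x00")}
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:]) // lowercase hex, 64 characters
}

func main() {
	a := computeTraceToken("doc-1", "1", "model-x", "v1", []string{"s2", "s1"})
	b := computeTraceToken("doc-1", "1", "model-x", "v1", []string{"s1", "s2"})
	fmt.Println(len(a), a == b) // prints "64 true"
}
```

Because the ID list is sorted before hashing, the two calls in `main` yield the same token even though the IDs arrive in a different order.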

Byte-exact replay

The chain:

handleQuery / handleAnswer build a map[string]any response.
marshalJSONForReplay calls json.Marshal (which sorts map keys lexicographically) and appends a trailing newline to match the pre-3.1 json.Encoder.Encode wire format. Same Go value → same []byte, always.
writeJSONWithReplay writes those exact bytes to the wire AND hands them to the replay store in lock-step.
handleReplay returns entry.ResponseJSON verbatim. No re-marshalling, no normalisation.
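The claim that json.Marshal plus a trailing newline matches json.Encoder.Encode can be checked directly. The sketch below uses hypothetical helper names (`marshalForReplay`, `encodeWithEncoder`) and a made-up response map; it is not the engine's marshalJSONForReplay, only a demonstration of the same standard-library behaviour.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalForReplay mirrors the described behaviour: json.Marshal plus "\n".
func marshalForReplay(v any) ([]byte, error) {
	b, err := json.Marshal(v) // map keys are emitted in sorted order
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// encodeWithEncoder returns what json.Encoder.Encode would put on the wire.
func encodeWithEncoder(v any) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Illustrative payload only; field names other than trace_token are assumptions.
	resp := map[string]any{"trace_token": "abc", "answer": "héllo  world", "sections": []string{"s1"}}
	a, errA := marshalForReplay(resp)
	b, errB := encodeWithEncoder(resp)
	if errA != nil || errB != nil {
		panic("marshal failed")
	}
	fmt.Print(string(a))
	fmt.Println(bytes.Equal(a, b))
}
```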

Replay validation

Missing/empty body fields → 400.
Unknown trace_token → 404.
document_id mismatch → 409 with details: "document_id differs from original".
query mismatch → 409 with details: "query differs from original".
Server disabled (retrieval.replay.enabled=false or Deps.Replay == nil) → 501.

document_id is checked first because it's the highest-cardinality identifier and surfaces the most useful "you're pointing at the wrong document" signal.
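The validation order can be sketched as one function that maps a request to a status code. The type and function names here are hypothetical, the store is reduced to a plain map, and the details strings for 400, 404 and 501 are placeholders. Checking the disabled case before the body fields is also an assumption; the description only fixes the order of the two 409 checks.

```go
package main

import (
	"fmt"
	"net/http"
)

// replayRequest and replayEntry are simplified, hypothetical shapes.
type replayRequest struct {
	TraceToken string
	Query      string
	DocumentID string
}

type replayEntry struct {
	DocumentID string
	Query      string
}

// checkReplay returns the HTTP status and a details string for a request.
// A nil store models a disabled replay store (Deps.Replay == nil).
func checkReplay(store map[string]replayEntry, req replayRequest) (int, string) {
	if store == nil {
		return http.StatusNotImplemented, "replay disabled" // placeholder text
	}
	if req.TraceToken == "" || req.Query == "" || req.DocumentID == "" {
		return http.StatusBadRequest, "missing required field" // placeholder text
	}
	entry, ok := store[req.TraceToken]
	if !ok {
		return http.StatusNotFound, "unknown trace_token" // placeholder text
	}
	// document_id is compared before query, as described above.
	if entry.DocumentID != req.DocumentID {
		return http.StatusConflict, "document_id differs from original"
	}
	if entry.Query != req.Query {
		return http.StatusConflict, "query differs from original"
	}
	return http.StatusOK, ""
}

func main() {
	store := map[string]replayEntry{"tok": {DocumentID: "doc-1", Query: "what is x?"}}
	code, details := checkReplay(store, replayRequest{TraceToken: "tok", Query: "other", DocumentID: "doc-2"})
	fmt.Println(code, details) // prints "409 document_id differs from original"
}
```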

Config

retrieval.replay block under RetrievalConfig:

retrieval:
  replay:
    enabled: true       # opt-out by design — replay is a moat
    max_entries: 1024
    ttl_seconds: 86400  # 24h

Env: VLE_RETRIEVAL_REPLAY_{ENABLED,MAX_ENTRIES,TTL_SECONDS}.
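One way the defaults and environment overrides could fit together is sketched below. The struct, the `loadReplayConfig` and `envInt` helpers, and the exact error behaviour are assumptions for illustration; only the three variable names and the default values come from the block above. The test plan mentions bad-env rejection and negative-value validation, which the sketch models by returning an error.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// replayConfig is a hypothetical mirror of the retrieval.replay block.
type replayConfig struct {
	Enabled    bool
	MaxEntries int
	TTLSeconds int
}

// envInt reads a non-negative integer from the environment, or returns def.
func envInt(getenv func(string) string, key string, def int) (int, error) {
	v := getenv(key)
	if v == "" {
		return def, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("%s: must not be negative, got %d", key, n)
	}
	return n, nil
}

// loadReplayConfig starts from the documented defaults and applies overrides.
func loadReplayConfig(getenv func(string) string) (replayConfig, error) {
	cfg := replayConfig{Enabled: true, MaxEntries: 1024, TTLSeconds: 86400}
	var err error
	if v := getenv("VLE_RETRIEVAL_REPLAY_ENABLED"); v != "" {
		if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
			return cfg, fmt.Errorf("VLE_RETRIEVAL_REPLAY_ENABLED: %w", err)
		}
	}
	if cfg.MaxEntries, err = envInt(getenv, "VLE_RETRIEVAL_REPLAY_MAX_ENTRIES", cfg.MaxEntries); err != nil {
		return cfg, err
	}
	if cfg.TTLSeconds, err = envInt(getenv, "VLE_RETRIEVAL_REPLAY_TTL_SECONDS", cfg.TTLSeconds); err != nil {
		return cfg, err
	}
	return cfg, nil
}

func main() {
	cfg, err := loadReplayConfig(os.Getenv)
	fmt.Println(cfg, err)
}
```

Passing the lookup function as a parameter keeps the loader testable without touching the real process environment.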

Opt-out

Set retrieval.replay.enabled=false (or VLE_RETRIEVAL_REPLAY_ENABLED=false). Deps.Replay becomes nil; handlers skip the per-response Put and /v1/replay returns 501. The trace_token field on the response is still computed (it's free) but no replay entry is stored.

Test plan

pkg/retrieval/trace_test.go — ComputeTraceToken: hex shape, determinism, sort invariance, no-mutation, input sensitivity (per component), delimiter collision avoidance, empty selection, SystemPromptVersion pinning.
pkg/retrieval/replay_test.go — LRUReplayStore: basic Put/Get, miss, empty-token rejection, LRU eviction at capacity, TTL expiry, in-place update, byte-exactness on tricky payloads (unicode + whitespace), default-TTL non-zero, parallel hammer for race detection.
pkg/retrieval/retrieval_test.go — extensions: SinglePass and ChunkedTree stamp 64-char hex TraceToken; strategy token matches ComputeTraceToken externally.
pkg/config/config_test.go — defaults (replay enabled, 1024, 86400), env overrides (enable, disable, max, ttl), bad-env rejection, negative-value validation.
internal/api/replay_test.go — byte-exact replay, unknown token 404, both mismatch flavours 409, disabled-store 501, required-fields 400, malformed JSON 400, unicode/whitespace preservation, end-to-end byte-exactness through marshalJSONForReplay → store → handler.
All pre-existing tests pass (go test ./...).
go build ./... and go vet ./... clean.

Known limitation

The replay store is in-memory only — not durable across process restarts. This is the documented v1 limitation; Phase 3.2 will swap LRUReplayStore for a persistent store + per-document versioning behind the same retrieval.ReplayStore interface (so handlers don't change).

Every CostStrategy result now carries a deterministic trace token: sha256(doc_id | doc_version | model | system_prompt_version | sorted(selected_ids)), hex-encoded. Same inputs always produce the same 64-char hex string, regardless of reasoning path; permuted ID order is invariant.

ComputeTraceToken is the canonical helper. SinglePass, ChunkedTree, and AgenticStrategy each call it before returning. The Cached wrapper re-derives the token on cache hits so the trace survives the cache layer (the token is a pure function of cached inputs).

SystemPromptVersion ("v1") is bumped whenever a retrieval system prompt changes in a way that should invalidate replay; the constant is asserted in tests so the bump is a deliberate decision.

A future phase will replace the placeholder doc_version "1" with real per-document versioning. The parameter is in the signature already, so that's a one-line change.

LRUReplayStore is a thin facade over pkg/cache.LRU that maps trace tokens to ReplayEntry values (DocumentID + Query + Model + SelectedIDs + raw ResponseJSON bytes + CreatedAt). The store is safe for concurrent Put/Get, bounds itself by MaxEntries (default 1024) and expires entries past TTL (default 24h).

ReplayEntry.ResponseJSON is the literal bytes of the original response. Replay returns these verbatim, so the byte-exact guarantee holds regardless of how the response is constructed. Go's encoding/json already sorts map keys lexicographically, but storing raw bytes removes any future doubt about determinism.

retrieval.replay config block ships with Enabled=true: replay is the moat versus stateless vector RAG and should be on by default. Operators can opt out via retrieval.replay.enabled=false or VLE_RETRIEVAL_REPLAY_ENABLED=false. Capacity / TTL tune via VLE_RETRIEVAL_REPLAY_MAX_ENTRIES and VLE_RETRIEVAL_REPLAY_TTL_SECONDS.

Tests cover Put/Get, miss, empty-token safety, LRU eviction at capacity, TTL expiry, in-place update, byte-exactness on tricky payloads (whitespace + unicode), default-TTL non-zero, and a parallel hammer that surfaces races under go test -race.

Not durable across process restarts. Phase 3.2 will replace this with persistent storage + per-document versioning. The interface abstraction (ReplayStore) lets that swap happen without touching handlers.
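To make the store's behaviour concrete, here is a self-contained sketch of a bounded, TTL-aware, mutex-guarded map from trace token to response bytes. It does not use pkg/cache.LRU, whose API is not shown in this PR; the `replayStore` type, its fields, and its method signatures are assumptions built on container/list from the standard library.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
	"time"
)

type entry struct {
	token     string
	response  []byte
	createdAt time.Time
}

// replayStore is a hypothetical stand-in for LRUReplayStore.
type replayStore struct {
	mu    sync.Mutex
	max   int
	ttl   time.Duration
	order *list.List // front = most recently used
	items map[string]*list.Element
}

func newReplayStore(max int, ttl time.Duration) *replayStore {
	return &replayStore{max: max, ttl: ttl, order: list.New(), items: map[string]*list.Element{}}
}

// Put stores a copy of response under token; empty tokens are ignored.
func (s *replayStore) Put(token string, response []byte) {
	if token == "" {
		return
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	e := entry{token: token, response: append([]byte(nil), response...), createdAt: time.Now()}
	if el, ok := s.items[token]; ok {
		el.Value = e // in-place update
		s.order.MoveToFront(el)
		return
	}
	s.items[token] = s.order.PushFront(e)
	if s.order.Len() > s.max {
		oldest := s.order.Back() // least recently used
		s.order.Remove(oldest)
		delete(s.items, oldest.Value.(entry).token)
	}
}

// Get returns the stored bytes, or false on a miss or an expired entry.
func (s *replayStore) Get(token string) ([]byte, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	el, ok := s.items[token]
	if !ok {
		return nil, false
	}
	e := el.Value.(entry)
	if time.Since(e.createdAt) > s.ttl {
		s.order.Remove(el)
		delete(s.items, token)
		return nil, false
	}
	s.order.MoveToFront(el)
	return e.response, true
}

func main() {
	s := newReplayStore(2, 24*time.Hour)
	s.Put("a", []byte("A\n"))
	s.Put("b", []byte("B\n"))
	s.Get("a")                // touch "a" so "b" becomes least recently used
	s.Put("c", []byte("C\n")) // evicts "b"
	_, okB := s.Get("b")
	_, okA := s.Get("a")
	fmt.Println(okB, okA) // prints "false true"
}
```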

handleQuery and handleAnswer now stamp a deterministic trace_token into the response body. The exact bytes sent on the wire are also stored in retrieval.ReplayStore under that token. POST /v1/replay with {trace_token, query, document_id} returns those bytes verbatim: same wire bytes, same Content-Type, same trailing newline.

The byte-exactness chain:

1. The response map is marshalled once with json.Marshal. Go's encoding/json sorts map[string]any keys lexicographically, so the same Go value always produces the same []byte.
2. marshalJSONForReplay appends the same trailing newline that json.Encoder.Encode would, so the wire format is unchanged from the pre-3.1 behaviour and existing clients see no diff.
3. writeJSONWithReplay writes those bytes to the response AND hands them to the replay store in lock-step: a single []byte, two writes.
4. The replay handler returns store.Get(token).ResponseJSON verbatim. No re-marshalling, no normalisation.

Replay validation: missing/empty fields → 400, unknown token → 404, mismatched document_id → 409 with details=document_id differs, mismatched query → 409 with details=query differs. The order matters: document_id is checked first because it's the highest-cardinality identifier and surfaces the most useful "you're pointing at the wrong document" signal.

cmd/engine/main.go wires LRUReplayStore when retrieval.replay.enabled is true (the default). When disabled, Deps.Replay is nil: handlers skip the per-response Put and /v1/replay returns 501.

OpenAPI adds:

- trace_token field on QueryResponse + AnswerResponse
- ReplayRequest schema (all three fields required)
- /v1/replay path with 200/400/404/409/501 documented
- 200's body is oneOf [QueryResponse, AnswerResponse] so the spec encodes the actual replayed shape

config.example.yaml gets the retrieval.replay block with inline guidance (opt-out semantics, in-memory v1 caveats, forward pointer to Phase 3.2).

The internal/api/replay_test.go suite covers byte-exact replay, unknown token, both mismatch flavours, disabled-store 501, required-fields 400, malformed JSON 400, unicode/whitespace preservation, and an end-to-end test that re-marshals the same map twice and asserts encoding/json is deterministic over the shape the engine actually emits.
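The "single []byte, two writes" step can be illustrated with net/http/httptest. Everything below is a simplified sketch: the package-level `store` map stands in for retrieval.ReplayStore and is not safe for concurrent use, the function bodies are guesses at the described behaviour, `roundTrip` is a helper invented for the demonstration, and the "application/json" Content-Type is an assumption since the PR only says the replayed Content-Type matches the original.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// store is a hypothetical in-memory stand-in for retrieval.ReplayStore.
var store = map[string][]byte{}

// writeJSONWithReplay marshals once, then stores and writes the same bytes.
func writeJSONWithReplay(w http.ResponseWriter, token string, resp map[string]any) {
	body, err := json.Marshal(resp)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	body = append(body, '\n') // match json.Encoder.Encode's trailing newline
	store[token] = body       // write 1: replay store
	w.Header().Set("Content-Type", "application/json")
	w.Write(body) // write 2: the wire
}

// replay returns the stored bytes verbatim, with no re-marshalling.
func replay(w http.ResponseWriter, token string) {
	body, ok := store[token]
	if !ok {
		http.Error(w, "unknown trace_token", http.StatusNotFound)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.Write(body)
}

// roundTrip serves resp once, replays it, and returns both bodies.
func roundTrip(token string, resp map[string]any) (original, replayed []byte) {
	first := httptest.NewRecorder()
	writeJSONWithReplay(first, token, resp)
	second := httptest.NewRecorder()
	replay(second, token)
	return first.Body.Bytes(), second.Body.Bytes()
}

func main() {
	orig, rep := roundTrip("tok", map[string]any{"trace_token": "tok", "answer": "héllo  world"})
	fmt.Println(bytes.Equal(orig, rep)) // prints "true"
}
```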

sourcery-ai

Sorry @hallelx2, you have reached your weekly rate limit of 500000 diff characters.

coderabbitai · 2026-05-27T01:44:19Z

Warning

Review limit reached

@hallelx2, we couldn't start this review because you've reached your PR review rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between 55dd9c1 and 7f74748.

📒 Files selected for processing (17)

cmd/engine/main.go
config.example.yaml
internal/api/replay_test.go
internal/api/server.go
openapi.yaml
pkg/config/config.go
pkg/config/config_test.go
pkg/retrieval/agentic.go
pkg/retrieval/cached.go
pkg/retrieval/chunked_tree.go
pkg/retrieval/replay.go
pkg/retrieval/replay_test.go
pkg/retrieval/retrieval_test.go
pkg/retrieval/single_pass.go
pkg/retrieval/strategy.go
pkg/retrieval/trace.go
pkg/retrieval/trace_test.go

hallelx2 added 3 commits May 27, 2026 02:30

Copilot AI review requested due to automatic review settings May 27, 2026 01:44

sourcery-ai Bot reviewed May 27, 2026

Copilot started reviewing on behalf of hallelx2 May 27, 2026 01:44

hallelx2 merged commit 75efc0c into main May 27, 2026
6 of 9 checks passed

hallelx2 deleted the feat/replay-trace-tokens branch May 27, 2026 01:45

hallelx2 review requested due to automatic review settings May 27, 2026 02:04
