perf(sdk): batch edge resolution in the investigate/graph hot path#25
perf(sdk): batch edge resolution in the investigate/graph hot path#25vigneshnarayanaswamy wants to merge 1 commit into
Conversation
investigate() and the graph methods (dependencies/members/groups)
resolved every dependency and membership edge with its own single-model
get() round trip, so the backend call count grew linearly with edge
count — ~29 single-model lookups for a node with a handful of edges,
~93 for a richly-connected one.
Resolve all edges of a node in one batched lookup instead:
- Add an optional `get_models(hashes) -> {hash: ModelRef}` bulk method to
every backend (in-memory, sqlite, snowflake, json-files) plus a
protocol-only `batch_fallbacks.get_models` for third-party backends.
It is hasattr-dispatched, so the LedgerBackend protocol surface is
unchanged and existing backends keep working.
- dependencies()/members()/groups() now collect edges first, then resolve
every target hash in a single get_models call. They also accept an
optional pre-fetched `snapshots` list so callers thread an already-loaded
history through instead of refetching.
- groups() scans candidate composites' membership in one list_all_snapshots
pass (when supported) rather than one list_snapshots per candidate.
- sqlite batch_dependencies and the batch_fallbacks.batch_dependencies
resolve edge targets with one bulk lookup instead of per-edge get_model
(snowflake already did this).
- investigate() reuses the model history it already fetched for groups()
and members() (only when no as_of filter is in play, to preserve their
current-state semantics).
Resolution semantics are unchanged: hash-first with a per-edge name
fallback that fires only when a hash does not resolve. On sqlite an
investigate of a handful-of-edges node drops from ~29 to ~9 total backend
round trips, and a richly-connected one from ~93 to ~11 — and the count no
longer scales with edge count.
Tests: a counting fake backend proves per-edge single lookups stay at zero
and the round-trip budget is flat across graph sizes; parity tests confirm
identical dependency/group/member results and that member-removal replay
still excludes removed members. New cross-backend tests pin the get_models
contract.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c217be4856
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| ) | ||
|
|
||
| by_hash = get_models(backend, [h for _, h, _, _ in edges if h]) |
There was a problem hiding this comment.
Use backend get_models in fallback dependencies
When investigate() runs against a backend that has the new get_models method but no batch_dependencies implementation, such as JsonFileLedgerBackend, this fallback still calls the module-level get_models, which loops over backend.get_model once per distinct edge. For json-files that means a full models-directory scan per edge, so the dependency half of the hot path remains O(edges × files) despite the new bulk resolver; dispatch to backend.get_models here when it exists.
Useful? React with 👍 / 👎.
Problem
investigate()and the graph methods (dependencies/members/groups) resolved every dependency and membership edge with its own single-modelget()round trip, so the backend call count grew linearly with edge count. Measured baseline (sqlite, counting backend):Root cause:
dependencies()resolved each edge with its ownLedger.get(),members()resolved each membership event with its ownget(), andgroups()re-fetched each candidate composite's full membership onelist_snapshotsat a time. Within oneinvestigatecall the same history was fetched repeatedly.This is the documented follow-up from #21.
Fix
Resolve all edges of a node in one batched lookup:
get_models(hashes) -> {hash: ModelRef}bulk method to every backend (in-memory, sqlite, snowflake, json-files) plus a protocol-onlybatch_fallbacks.get_modelsfor third-party backends. It ishasattr-dispatched, so theLedgerBackendprotocol surface is unchanged and every existing/third-party backend keeps working.dependencies()/members()/groups()collect edges first, then resolve every target hash in a singleget_modelscall. They also accept an optional pre-fetchedsnapshotslist so callers thread an already-loaded history through instead of refetching.groups()scans candidate composites' membership in onelist_all_snapshotspass (when the backend supports it) rather than onelist_snapshotsper candidate; falls back cleanly otherwise.batch_dependenciesandbatch_fallbacks.batch_dependenciesnow resolve edge targets with one bulk lookup instead of per-edgeget_model(snowflake already batched this).investigate()reuses the model history it already fetched forgroups()/members()— only when noas_offilter is in play, to preserve their current-state semantics.Resolution semantics are unchanged: hash-first with a per-edge name fallback that fires only when a hash does not resolve.
Round trips: before → after (sqlite)
The remaining round trips are fixed-cost (initial name resolve, one history read, one
batch_dependencies, one membership scan, a few batchedget_models) and no longer scale with edge count.Tests
get_modelscontract (resolves all, omits missing, dedups, blank/empty input, parity with singleget_model) across in-memory, sqlite, and json-files plus the fallback.pytest(775 passed, 24 skipped),ruff check,ruff format, andmypyare all green.🤖 Generated with Claude Code