Fetch graph schema live via APOC, cached for the session (1.4.0) #18

Merged
adamjohnwright merged 1 commit into main from graph-schema-live
Apr 24, 2026

Conversation

@adamjohnwright
Contributor

The curator reported Claude saying the APOC schema was "limited." Root cause: reactome_cypher_schema was calling the sparse built-in db.schema.*, which omits counts, cardinalities, and indexes. When Claude tried apoc.meta.schema() via reactome_cypher_query, the ~500 KB single-row result hit the per-row token cap and returned a truncation stub.

Fix: pull the rich schema inside fetchGraphSchema, format it ourselves, never surface it through the capped query tool.
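As a rough illustration of "format it ourselves", the sketch below turns per-label metadata into a compact markdown section. The `LabelInfo` shape and the `formatLabels` name are hypothetical and are not taken from src/graph/format-schema.ts; descending order by node count is also an assumption.

```typescript
// Hypothetical shape for illustration; the real GraphSchema type may differ.
interface PropertyInfo {
  name: string;
  type: string;
  mandatory: boolean;
}

interface LabelInfo {
  label: string;
  count: number;
  properties: PropertyInfo[];
}

// Render labels as markdown, largest node count first (assumed ordering).
function formatLabels(labels: LabelInfo[]): string {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...labels].sort((a, b) => b.count - a.count);
  const lines: string[] = ["## Node labels", ""];
  for (const info of sorted) {
    lines.push(`### ${info.label} (${info.count} nodes)`);
    for (const prop of info.properties) {
      const flag = prop.mandatory ? " (mandatory)" : "";
      lines.push(`- ${prop.name}: ${prop.type}${flag}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```

Because the digest is built as a plain string on the server side, it can be returned by the schema tool directly and never has to pass through the per-row cap of the query tool.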

What changed

  • src/graph/schema.ts: new module. fetchGraphSchema runs apoc.meta.{schema,stats,nodeTypeProperties,relTypeProperties}, dbms.components, db.indexes, db.constraints in parallel and assembles a GraphSchema object. Cached in-memory for the session; concurrent first-callers share one fetch via a pending-promise. 60 s timeout per query (up from the 30 s cypher default) because apoc.meta.schema() samples 3M nodes.
  • src/graph/format-schema.ts: markdown digest with labels sorted by node count, relationship types sorted by cardinality, per-label + per-rel property types with mandatory flags, indexes, constraints. ~80 KB on Reactome (was ~40 KB with db.schema.*).
  • src/index.ts: prefetch on startup so the first tool call is warm.
  • src/tools/cypher.ts + src/resources/static.ts: route to the new module. No functional change beyond what's documented.
  • The schema code is split out of src/clients/neo4j.ts so tests can vi.mock runRead across the module boundary.
  • 7 new tests: format coverage + cache/dedup/fallback behaviors.
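The session cache with a shared pending promise described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/graph/schema.ts: `createSchemaCache`, the `load` callback, and the simplified `GraphSchema` type are all hypothetical names.

```typescript
// Hypothetical stand-in for the real GraphSchema type.
type GraphSchema = { labels: string[] };

// Wrap a loader so it runs at most once per session on success.
// Concurrent callers that arrive before the first fetch finishes
// share the same in-flight promise instead of starting their own.
function createSchemaCache(load: () => Promise<GraphSchema>) {
  let cached: GraphSchema | undefined;
  let pending: Promise<GraphSchema> | undefined;

  return async function getSchema(): Promise<GraphSchema> {
    if (cached) return cached;
    if (!pending) {
      pending = load()
        .then((schema) => {
          cached = schema;
          return schema;
        })
        .finally(() => {
          // Clear the in-flight marker; after a failure the next call retries.
          pending = undefined;
        });
    }
    return pending;
  };
}
```

A startup prefetch then amounts to calling `getSchema()` once without awaiting the result in the request path; later tool calls either join the in-flight fetch or read the cached object. Whether a failed fetch should be retried on the next call, as in this sketch, is a design choice the description above does not specify.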

Verified live against the running Reactome Release96 container:

  • Cold fetch: 21 s on the full graph (this is why we prefetch).
  • Second call: 0 ms, same object reference.
  • Markdown digest: 80,123 chars.
  • Mocked-fallback test confirms graceful degradation when apoc.meta.relTypeProperties / db.indexes / db.constraints are unavailable.
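One way to get this kind of graceful degradation is to run the optional queries in parallel, give each its own timeout, and substitute an empty result when one fails. The sketch below assumes a `runRead(cypher)` helper that resolves to an array of rows; that signature, the `fetchOptionalParts` and `withTimeout` names, and the exact Cypher strings are illustrative assumptions rather than the module's actual code.

```typescript
type Rows = Record<string, unknown>[];
// Assumed signature; the real runRead in the client module may differ.
type RunRead = (cypher: string) => Promise<Rows>;

// Reject if `work` has not settled within `ms` milliseconds.
// This only stops waiting; it does not cancel the underlying query.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Run the optional schema queries in parallel; a failing or missing
// procedure yields an empty result and is recorded as unavailable.
async function fetchOptionalParts(runRead: RunRead, timeoutMs = 60_000) {
  const queries: Record<string, string> = {
    relTypeProperties: "CALL apoc.meta.relTypeProperties()",
    indexes: "CALL db.indexes()",
    constraints: "CALL db.constraints()",
  };

  const results = await Promise.all(
    Object.entries(queries).map(async ([name, cypher]) => {
      try {
        const rows = await withTimeout(runRead(cypher), timeoutMs, name);
        return { name, rows, ok: true };
      } catch {
        return { name, rows: [] as Rows, ok: false };
      }
    }),
  );

  const parts: Record<string, Rows> = {};
  const unavailable: string[] = [];
  for (const result of results) {
    parts[result.name] = result.rows;
    if (!result.ok) unavailable.push(result.name);
  }
  return { parts, unavailable };
}
```

Catching per query, rather than around the whole `Promise.all`, keeps one unavailable procedure from discarding the results of the others. In a test, `runRead` can be replaced by a mock that throws for selected procedures.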

No vendored schema artifact. No coordination with reactome_neo4j_env. One source of truth: the live database.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adamjohnwright adamjohnwright merged commit ee385df into main Apr 24, 2026
0 of 3 checks passed
@adamjohnwright adamjohnwright deleted the graph-schema-live branch April 24, 2026 15:44