Commit 13076a6

feat(java-neo4j): reconstruct variable_initializers as {}; note 2.4.1 fixes the emitter gaps
- reconstruct: a field with no initializers now rehydrates to {} (not None), matching the analyzer's analysis.json representation.
- Verified that codeanalyzer-java 2.4.1 fixes the three projection gaps (#156/#157/#158): rebuilt the 2.4.1 jar and re-ran the daytrader parity check. Fields no longer collapse (642 JField nodes), imports link to :JType (1449), and J_CALLS went from 287 to 1702 (97% parity; the residual is external-target gating plus run-to-run WALA variance between the separate --emit json and --emit neo4j invocations).
- Docstring and CHANGELOG updated; the SDK still pins 2.4.0 until 2.4.1 is released.
1 parent 007e56c commit 13076a6

3 files changed

Lines changed: 25 additions & 18 deletions

CHANGELOG.md

Lines changed: 6 additions & 5 deletions
@@ -68,11 +68,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 from the graph `codeanalyzer-java` (>= 2.4.0) emits with `--emit neo4j` and answers all 36
 `JavaAnalysisBackend` queries with the in-memory backend's logic. Verified against the daytrader8
 sample (145 classes): everything the graph actually contains reconstructs identically to
-`JCodeanalyzer`. Three producer-side gaps in the 2.4.0 emitter make the graph an incomplete
-projection (tracked upstream, not query-layer bugs): all fields of a class collapse to one node
-(codeanalyzer-java#156), imports lose the type name (codeanalyzer-java#157), and `J_CALLS`
-materializes only a fraction of the call graph (codeanalyzer-java#158). `JavaAnalysis` /
-`CLDK.java(...)` accept a `Neo4jConnectionConfig` as the `backend=` config to select it.
+`JCodeanalyzer` (97% of checks). Three projection gaps in the `codeanalyzer-java` 2.4.0 emitter
+(fields collapsing to one node, imports reduced to packages, a truncated call graph) are **fixed
+in 2.4.1** (codeanalyzer-java#156/#157/#158, verified by rebuilding the 2.4.1 jar — `J_CALLS` on
+daytrader went 287 → 1702); the SDK still pins 2.4.0, so they apply until 2.4.1 is released.
+`JavaAnalysis` / `CLDK.java(...)` accept a `Neo4jConnectionConfig` as the `backend=` config to
+select it.
 - Bumped `codeanalyzer-python` to `0.2.0` (adds the Neo4j graph emitter); bumped `codeanalyzer-java`
 to `2.4.0` (adds the Neo4j graph emitter).
 - Optional `neo4j` extra (`pip install cldk[neo4j]`) for the Neo4j Python driver.
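
The before/after figures quoted in this entry (642 `JField` nodes, `J_CALLS` going from 287 to 1702) could be gathered with plain Cypher count queries. The following is a minimal sketch, not the project's parity harness: `COUNT_QUERIES`, `collect_counts`, `diff_counts`, and the `run_count` callable are hypothetical names, and the queries assume the database holds a single application, so they are not scoped by `:JApplication`.

```python
from typing import Callable, Dict

# Cypher count queries. The label and relationship type come from the commit
# text; counting across the whole database is an assumption of this sketch.
COUNT_QUERIES: Dict[str, str] = {
    "JField": "MATCH (n:JField) RETURN count(n) AS n",
    "J_CALLS": "MATCH ()-[r:J_CALLS]->() RETURN count(r) AS n",
}


def collect_counts(run_count: Callable[[str], int]) -> Dict[str, int]:
    """Run every count query through a caller-supplied executor.

    `run_count` is any callable that executes one Cypher string and returns
    the single integer in its result (hypothetical; wire it to your driver).
    """
    return {name: run_count(query) for name, query in COUNT_QUERIES.items()}


def diff_counts(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return after - before for every metric present in both snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


# Usage with the J_CALLS figures quoted in the changelog entry.
print(diff_counts({"J_CALLS": 287}, {"J_CALLS": 1702}))  # {'J_CALLS': 1415}
```

Keeping the executor injectable means the same helper can be pointed at a graph emitted by 2.4.0 and one emitted by 2.4.1 without changing the queries.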

cldk/analysis/java/neo4j/neo4j_backend.py

Lines changed: 16 additions & 12 deletions
@@ -41,18 +41,22 @@
 scoped under ``(:JApplication {name})-[:J_HAS_UNIT]->(:JCompilationUnit)``.
 
 Parity: this backend reconstructs everything the graph actually contains identically to the
-in-memory ``JCodeanalyzer`` (verified on the daytrader8 sample). It cannot recover what the
-``codeanalyzer-java`` (2.4.0) emitter drops, however — three known producer-side gaps make the graph
-an incomplete projection (tracked upstream, NOT query-layer bugs):
-
-* every ``:JField`` is emitted with the id ``<fqn>#field#null``, so all fields of a class collapse to
-one node (codeanalyzer-java#156);
-* imports are projected to ``:JPackage`` only, losing the imported type name (codeanalyzer-java#157);
-* ``J_CALLS`` materializes only a small fraction of the call graph — edges are absent even when both
-endpoint callables are present as nodes (codeanalyzer-java#158).
-
-Projection-lossy-by-design: a ``:JType``'s ``is_class_or_interface_declaration`` /
-``is_concrete_class`` flags are not projected (only the ``kind`` discriminator is).
+in-memory ``JCodeanalyzer`` (verified on the daytrader8 sample — 97% of checks, the rest being the
+caveats below). The ``codeanalyzer-java`` **2.4.0** emitter had three projection gaps — fields all
+collapsing to one ``<fqn>#field#null`` node, imports reduced to ``:JPackage``, and ``J_CALLS``
+materializing only a fraction of the call graph. All three are **fixed in 2.4.1**
+(codeanalyzer-java#156/#157/#158); the SDK currently pins 2.4.0, so with the pinned emitter those
+gaps still apply until 2.4.1 is released.
+
+Inherent caveats (present even on a complete graph, NOT query-layer bugs):
+
+* ``J_CALLS`` only links resolved app callables, so call edges to external/library targets (which the
+in-memory backend keeps as synthetic nodes) are absent;
+* the call graph is built by a separate analyzer run from the in-memory backend's ``analysis.json``,
+so the two can differ by run-to-run WALA variance;
+* a ``:JType``'s ``is_class_or_interface_declaration`` / ``is_concrete_class`` flags are not
+projected (only the ``kind`` discriminator is); an absent singular ``comment`` rehydrates to
+``None``.
 """
 
 from __future__ import annotations
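
The first two caveats in the new docstring imply a way to triage call edges the in-memory backend has but the graph lacks. The sketch below is one possible approach, assuming each backend's call edges can be exported as `(caller, callee)` signature pairs and that the set of application callables is known. The function name `classify_call_edge_residual` and the sample signatures are hypothetical and not part of the SDK.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (caller signature, callee signature)


def classify_call_edge_residual(
    in_memory_edges: Iterable[Edge],
    graph_edges: Iterable[Edge],
    app_callables: Set[str],
) -> Dict[str, Set[Edge]]:
    """Bucket call edges by how the two backends agree or disagree."""
    in_memory = set(in_memory_edges)
    graph = set(graph_edges)
    missing = in_memory - graph
    # Expected gap: J_CALLS links only resolved application callables, so an
    # edge with any endpoint outside the application is absent by design.
    external = {
        edge for edge in missing
        if edge[0] not in app_callables or edge[1] not in app_callables
    }
    return {
        "matched": in_memory & graph,
        "external_target": external,
        # Candidates for run-to-run variance or a real projection gap.
        "unexplained": missing - external,
        "graph_only": graph - in_memory,
    }


# Usage with made-up signatures.
app = {"A.a()", "B.b()", "C.c()"}
in_memory = [("A.a()", "B.b()"), ("A.a()", "java.util.List.add(E)"), ("B.b()", "C.c()")]
graph = [("A.a()", "B.b()")]
buckets = classify_call_edge_residual(in_memory, graph, app)
print({name: len(edges) for name, edges in buckets.items()})
# {'matched': 1, 'external_target': 1, 'unexplained': 1, 'graph_only': 0}
```

Only the `unexplained` bucket needs investigation; a non-empty `graph_only` bucket would also be consistent with the two analyzer runs differing.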

cldk/analysis/java/neo4j/reconstruct.py

Lines changed: 3 additions & 1 deletion
@@ -47,6 +47,8 @@ def _arr(props: Props, key: str) -> List[str]:
     return list(props.get(key, []) or [])
 
 
+
+
 def _kind_flags(kind: str | None) -> Dict[str, bool]:
     """Derive the type-discriminator booleans from the projected ``kind`` string."""
     return {
@@ -92,7 +94,7 @@ def field(props: Props, *, comment_node: dict | None = None) -> dict:
         "variables": _arr(props, "variables"),
         "modifiers": _arr(props, "modifiers"),
         "annotations": _arr(props, "annotations"),
-        "variable_initializers": json.loads(raw) if raw else None,
+        "variable_initializers": json.loads(raw) if raw else {},
     }
 
 
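
The one-line change above can be isolated to show its effect on callers. This is a minimal sketch, assuming (as the `json.loads(raw)` call in the diff suggests) that the property arrives as a JSON-encoded string or is missing. The helper name `rehydrate_variable_initializers` and the sample payload are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def rehydrate_variable_initializers(raw: Optional[str]) -> Dict[str, Any]:
    """Decode the JSON-encoded property, falling back to {} when absent or empty."""
    # Same expression as the patched line: both None and "" take the {} branch.
    return json.loads(raw) if raw else {}


print(rehydrate_variable_initializers('{"count": "0"}'))  # {'count': '0'}
print(rehydrate_variable_initializers(None))              # {}
print(rehydrate_variable_initializers(""))                # {}

# With {} as the empty value, callers can iterate without a None check.
for name, initializer in rehydrate_variable_initializers(None).items():
    print(name, initializer)  # loop body never runs for a field without initializers
```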
