fix(parsers/js): extract module-level block-scoped functions#124
Open
gadievron wants to merge 1 commit into
Open
fix(parsers/js): extract module-level block-scoped functions#124gadievron wants to merge 1 commit into
gadievron wants to merge 1 commit into
Conversation
The JS/TS analyzer enumerated functions via `sourceFile.getFunctions()`
(top-level only) in BOTH the inventory (extractFunctionsFromFile) and the call
graph (buildCallGraphForFile), so a function declared inside any block — a
top-level `if/else`, `try/catch`, `for/while/switch`, or bare `{}` — was
silently dropped from both. For a SAST tool a vulnerable function gated behind
`if (process.env.X)` or in a `catch` fallback was never a unit, never a
reachability node, never an entry point.
Add `_moduleLevelFunctionNodes` / `_moduleLevelFunctionEntries`: descend the AST
for FunctionDeclarations and keep those with NO function-like ancestor — surfacing
module-level block-scoped functions while still omitting functions nested inside
another function/method (their text rides in the parent unit) and inside a class
`static {}` block (block-scoped to the initializer, callable nowhere). Both the
inventory and call-graph builders iterate this same list so their keys match
exactly (the `len(callGraph) === len(functions)` lockstep — a block function gets
its real outgoing edges, not a backstop `[]`). The resolver's standalone-function
strategy also uses it, so calls TO a block function resolve (it stays reachable).
Sibling-block same-name functions (`if(c){function f(){A}}else{function f(){B}}`)
are both runtime-reachable, so keep BOTH via a colon-free `#L<line>` suffix;
collision-only, so unique names keep the byte-identical `path:name` id.
Tests: tests/parsers/javascript/test_block_scoped_functions.py (8 block shapes
surfaced with real edges; incoming-edge resolution; keep-both; static-block and
nested-in-function omitted; invariant; unique-id unchanged):
$ pytest tests/parsers/javascript/test_block_scoped_functions.py -q
14 passed
$ pytest tests/parsers/javascript/ -q
66 passed
Full repo suite: 615 passed, 22 skipped. express smoke: 7 files, callGraph
keys == functions keys, no inflation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Extract module-level block-scoped function declarations (inside
if/try/for/switch/bare blocks), which the JS/TS analyzer silently dropped.Root cause
extractFunctionsFromFileandbuildCallGraphForFileboth enumerated functions viasourceFile.getFunctions()— which returns only top-level declarations. A function declared inside any block was therefore absent from BOTH the function inventory AND the call graph (typescript_analyzer.js:231/:1253). The resolver's standalone-function strategy had the same blind spot (:1588). For a SAST tool this is a recall gap: a function gated behindif (process.env.X) {...}or defined in acatchfallback is never a unit, never a reachability node, never an entry point — so it is never analyzed, and its call to (e.g.)eval/child_process.execis invisible.Reproduction
How
_moduleLevelFunctionNodes/_moduleLevelFunctionEntries: descend the AST forFunctionDeclarations and keep those with NO function-like ancestor — surfacing module-level block-scoped functions while omitting functions nested inside another function/method (their text already rides inside the parent unit) and inside a classstatic {}block (block-scoped to the initializer, callable nowhere).[]) and its incoming calls resolve, so it is reachable.if(c){function f(){A}}else{function f(){B}}) are both runtime-reachable, so keep BOTH via a colon-free#L<line>suffix; collision-only, so unique names keep the byte-identicalpath:nameid.Scope: deliberately NOT touched — functions nested inside another function/method (text rides in the parent),
static {}-block functions (uncallable), arrow/function-expression assignments inside blocks (separate variable-declaration extraction path; follow-up), classes-in-blocks (follow-up). The resolver's constructor-pattern and cross-file-import strategies stay top-level (a block-scoped function can't be exported/imported).Regression test
tests/parsers/javascript/test_block_scoped_functions.py— 8 block shapes surfaced with real edges; incoming-edge resolution; sibling-block keep-both; static-block and nested-in-function omitted; thelen(callGraph) == len(functions)invariant; unique-name id unchanged.RED (before fix):
GREEN (after fix):
Full repo suite: 615 passed, 22 skipped. express smoke: 7 files, callGraph keys == functions keys, 0 inflation (no block collisions in idiomatic code).
Compatibility
No id change for code without block-scoped functions:
getDescendantsOfKindreturns top-level declarations in the same document order asgetFunctions(), so last-writer-wins resolves identically (verified — express output unchanged). New units are additive; the call-graph/inventory key invariant is preserved.Note:
tests/parsers/javascript/test_callgraph_symmetry.pydoes not yet exist at the canonical path (a pre-existing repo-wide gap); this PR's regression test asserts the functions↔callGraph key invariant inline astest_invariant_callgraph_keys_equal_functions_keys.