Skip to content

fix(parsers/js): extract module-level block-scoped functions#124

Open
gadievron wants to merge 1 commit into
staging/parser-fix-stackfrom
fix/js-block-scoped-extraction
Open

fix(parsers/js): extract module-level block-scoped functions#124
gadievron wants to merge 1 commit into
staging/parser-fix-stackfrom
fix/js-block-scoped-extraction

Conversation

@gadievron

Copy link
Copy Markdown
Collaborator

What

Extract module-level block-scoped function declarations (inside if/try/for/switch/bare blocks), which the JS/TS analyzer silently dropped.

Root cause

extractFunctionsFromFile and buildCallGraphForFile both enumerated functions via sourceFile.getFunctions() — which returns only top-level declarations. A function declared inside any block was therefore absent from BOTH the function inventory AND the call graph (typescript_analyzer.js:231 / :1253). The resolver's standalone-function strategy had the same blind spot (:1588). For a SAST tool this is a recall gap: a function gated behind if (process.env.X) {...} or defined in a catch fallback is never a unit, never a reachability node, never an entry point — so it is never analyzed, and its call to (e.g.) eval/child_process.exec is invisible.

Reproduction

$ printf 'if (c) { function vuln(){ return eval(x); } }\nfunction top(){}' > /tmp/r/a.js
$ node typescript_analyzer.js /tmp/r /tmp/r/a.js | jq '.functions | keys'
["a.js:top"]            # vuln dropped

How

  • New _moduleLevelFunctionNodes / _moduleLevelFunctionEntries: descend the AST for FunctionDeclarations and keep those with NO function-like ancestor — surfacing module-level block-scoped functions while omitting functions nested inside another function/method (their text already rides inside the parent unit) and inside a class static {} block (block-scoped to the initializer, callable nowhere).
  • The inventory builder, the call-graph builder, and the resolver all iterate this one list, so their keys match exactly — a block function gets its real outgoing edges (not a backstop []) and its incoming calls resolve, so it is reachable.
  • Sibling-block same-name functions (if(c){function f(){A}}else{function f(){B}}) are both runtime-reachable, so keep BOTH via a colon-free #L<line> suffix; collision-only, so unique names keep the byte-identical path:name id.

Scope: deliberately NOT touched — functions nested inside another function/method (text rides in the parent), static {}-block functions (uncallable), arrow/function-expression assignments inside blocks (separate variable-declaration extraction path; follow-up), classes-in-blocks (follow-up). The resolver's constructor-pattern and cross-file-import strategies stay top-level (a block-scoped function can't be exported/imported).

Regression test

tests/parsers/javascript/test_block_scoped_functions.py — 8 block shapes surfaced with real edges; incoming-edge resolution; sibling-block keep-both; static-block and nested-in-function omitted; the len(callGraph) == len(functions) invariant; unique-name id unchanged.

RED (before fix):

9 failed  (the 8 block shapes + sibling keep-both)

GREEN (after fix):

$ pytest tests/parsers/javascript/test_block_scoped_functions.py -q
14 passed
$ pytest tests/parsers/javascript/ -q
66 passed

Full repo suite: 615 passed, 22 skipped. express smoke: 7 files, callGraph keys == functions keys, 0 inflation (no block collisions in idiomatic code).

Compatibility

No id change for code without block-scoped functions: getDescendantsOfKind returns top-level declarations in the same document order as getFunctions(), so last-writer-wins resolves identically (verified — express output unchanged). New units are additive; the call-graph/inventory key invariant is preserved.

Note: tests/parsers/javascript/test_callgraph_symmetry.py does not yet exist at the canonical path (a pre-existing repo-wide gap); this PR's regression test asserts the functions↔callGraph key invariant inline as test_invariant_callgraph_keys_equal_functions_keys.

The JS/TS analyzer enumerated functions via `sourceFile.getFunctions()`
(top-level only) in BOTH the inventory (extractFunctionsFromFile) and the call
graph (buildCallGraphForFile), so a function declared inside any block — a
top-level `if/else`, `try/catch`, `for/while/switch`, or bare `{}` — was
silently dropped from both. For a SAST tool a vulnerable function gated behind
`if (process.env.X)` or in a `catch` fallback was never a unit, never a
reachability node, never an entry point.

Add `_moduleLevelFunctionNodes` / `_moduleLevelFunctionEntries`: descend the AST
for FunctionDeclarations and keep those with NO function-like ancestor — surfacing
module-level block-scoped functions while still omitting functions nested inside
another function/method (their text rides in the parent unit) and inside a class
`static {}` block (block-scoped to the initializer, callable nowhere). Both the
inventory and call-graph builders iterate this same list so their keys match
exactly (the `len(callGraph) === len(functions)` lockstep — a block function gets
its real outgoing edges, not a backstop `[]`). The resolver's standalone-function
strategy also uses it, so calls TO a block function resolve (it stays reachable).

Sibling-block same-name functions (`if(c){function f(){A}}else{function f(){B}}`)
are both runtime-reachable, so keep BOTH via a colon-free `#L<line>` suffix;
collision-only, so unique names keep the byte-identical `path:name` id.

Tests: tests/parsers/javascript/test_block_scoped_functions.py (8 block shapes
surfaced with real edges; incoming-edge resolution; keep-both; static-block and
nested-in-function omitted; invariant; unique-id unchanged):
  $ pytest tests/parsers/javascript/test_block_scoped_functions.py -q
  14 passed
  $ pytest tests/parsers/javascript/ -q
  66 passed
Full repo suite: 615 passed, 22 skipped. express smoke: 7 files, callGraph
keys == functions keys, no inflation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant