fix(parsers/php,ruby): keep both same-name defs from conditional/guarded branches #122
Open
gadievron wants to merge 2 commits into
…ranches
PHP forbids plain function redefinition (fatal), so legal same-(file,name)
duplicates arise only from mutually-exclusive conditional branches — an
`if/else` or a defensive double `if(!function_exists('x')){ function x(){} }`.
Which branch is live is environment-dependent, and the EARLIER branch is often
the one that runs (the first `function_exists` guard defines it; the second is
skipped). The extractor keyed on `qualified_name` with a plain keep-last store,
so the later (dead) branch silently overwrote the earlier (live) one — a false
negative for a SAST tool, confirmed against the real `php` interpreter.
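
To make the failure mode above concrete, here is a minimal sketch of a keep-last store. The unit dictionaries, the `lib.php:x` id, and the `keep_last` helper are hypothetical stand-ins for illustration, not the extractor's actual data structures.

```python
# Hypothetical extracted units for two guarded definitions of the same
# function in one file, listed in source order.
units = [
    {"qualified_name": "lib.php:x", "line": 3, "body": "first guard (live)"},
    {"qualified_name": "lib.php:x", "line": 9, "body": "second guard (skipped)"},
]


def keep_last(units):
    """Plain keep-last store: a later unit replaces an earlier one with the same key."""
    store = {}
    for unit in units:
        store[unit["qualified_name"]] = unit
    return store


store = keep_last(units)
# Only one entry survives, and it is the later definition from line 9.
```

The earlier definition never reaches downstream analysis, even though it is the one the interpreter would actually define.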
New `_store_function` keeps BOTH via a deterministic `#L<line>` suffix (the
earlier-in-source unit keeps the clean id), mirroring the Python extractor's
existing `_store_function`. Collision-only: a unique name keeps its byte-identical
`path:name` id. Both qualified-name store sites (function :558, closure :600)
route through it; the per-file `__module__` singleton is unaffected.
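
The collision-only strategy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's `_store_function`: the `Unit` shape, the `store_function` name and signature, and the handling of out-of-order arrival are all hypothetical.

```python
from typing import Dict, TypedDict


class Unit(TypedDict):
    # Hypothetical unit shape; the real extractor's fields may differ.
    name: str
    line: int
    body: str


def store_function(store: Dict[str, Unit], qualified_name: str, unit: Unit) -> str:
    """Store `unit` without overwriting an existing same-name unit.

    Returns the id the unit was stored under.
    """
    existing = store.get(qualified_name)
    if existing is None:
        # Unique name: the id is exactly the qualified name, unchanged.
        store[qualified_name] = unit
        return qualified_name
    if unit["line"] < existing["line"]:
        # The new unit is earlier in source, so it takes the clean id and
        # the previously stored unit moves to a suffixed id.
        store[f"{qualified_name}#L{existing['line']}"] = existing
        store[qualified_name] = unit
        return qualified_name
    # The new unit is later in source: it gets a deterministic line suffix.
    suffixed = f"{qualified_name}#L{unit['line']}"
    store[suffixed] = unit
    return suffixed


store: Dict[str, Unit] = {}
first_id = store_function(store, "lib.php:x", {"name": "x", "line": 3, "body": "real"})
second_id = store_function(store, "lib.php:x", {"name": "x", "line": 9, "body": "fallback"})
unique_id = store_function(store, "lib.php:y", {"name": "y", "line": 20, "body": "only"})
```

Because the suffix is derived from the source line, the ids are stable across runs, and a name that never collides is stored under the same id it had before the change.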
Tests: tests/parsers/php/test_php_conditional_collision.py (if/else both kept,
double function_exists guard both kept, unique-name id unchanged):
$ pytest tests/parsers/php/test_php_conditional_collision.py -q
3 passed
$ pytest tests/parsers/php/ -q
36 passed
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A Ruby `def` executes when reached, so same-name defs in mutually-exclusive
conditional branches (`if COND then def k;A end else def k;B end end`) are both
runtime-reachable depending on the condition, and the EARLIER (if) branch is
often the live one. The extractor's plain keep-last store let the later (else)
branch silently overwrite the earlier: a false negative for a SAST tool,
confirmed against the real `ruby` interpreter.
New `_store_function` keeps BOTH via a deterministic `#L<line>` suffix (the
earlier-in-source unit keeps the clean id), mirroring the Python/PHP extractors.
Unconditional method reopening is last-wins at runtime; keeping both there is
the same benign tradeoff Python already accepts (Stage 2 attacker-simulation
filters dead overrides). Collision-only: a unique name keeps its byte-identical
`path:name` id. Both qualified-name store sites (:549 method, :613 function)
route through it.
Tests: tests/parsers/ruby/test_ruby_conditional_collision.py (if/else both kept,
unique-name id unchanged):
$ pytest tests/parsers/ruby/test_ruby_conditional_collision.py -q
2 passed
$ pytest tests/parsers/ruby/ -q
47 passed
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Companion to #121 (the C/C++ same-`(file,name)` collision fix). A blind independent re-derivation plus an adversarial judge, both running the real `php`/`ruby`/`node` interpreters, established that of the non-C parsers, only PHP and Ruby have a harmful same-name collision; JS and Python do not (JS keeps the last-wins copy / skips conditional defs; Python already keeps both via a `#L<line>` suffix). This PR fixes PHP and Ruby; it deliberately does not touch JS/Python.

The bug (PHP + Ruby)

A PHP `function` / Ruby `def` in a mutually-exclusive conditional branch is runtime-reachable depending on the environment, and the earlier-in-source branch is often the live one.

Both extractors keyed on `qualified_name` with a plain keep-last store, so the later (often dead) branch silently overwrote the earlier (often live) one, a false negative for a SAST tool. Confirmed: `php` prints `real()` / `ruby` runs `if_branch`, but the extractor kept the other branch.

The fix: keep BOTH (not prefer-first/larger)

Which branch is live is environment-dependent, so neither is statically dead. The fix keeps both via a deterministic `#L<line>` suffix (earlier-in-source keeps the clean id), the exact `_store_function` strategy the Python extractor already uses, which is precisely why Python was found immune. Ported into the PHP (`:558`, `:600`) and Ruby (`:549`, `:613`) store sites.

Collision-only: a unique name keeps its byte-identical `path:name` id (pinned by a `test_unique_name_id_unchanged` per language).

Tests

New: `tests/parsers/php/test_php_conditional_collision.py`, `tests/parsers/ruby/test_ruby_conditional_collision.py`.

Full repo suite: 606 passed, 22 skipped.