Skip to content

COMP: Use pzero() for SelfadjointMatrixVector accumulators (silences GCC -O3 IPA-CP false positives)#6166

Merged
hjmjohnson merged 1 commit into
InsightSoftwareConsortium:mainfrom
hjmjohnson:eigen3-pzero-init-selfadjoint
Apr 29, 2026
Merged

COMP: Use pzero() for SelfadjointMatrixVector accumulators (silences GCC -O3 IPA-CP false positives)#6166
hjmjohnson merged 1 commit into
InsightSoftwareConsortium:mainfrom
hjmjohnson:eigen3-pzero-init-selfadjoint

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

Switch the two ptmp2 / ptmp3 packet-accumulator initializations in Eigen/src/Core/products/SelfadjointMatrixVector.h from pset1<Packet>(t) to pzero(Packet{}), matching upstream Eigen master. This silences four false-positive -Wmaybe-uninitialized warnings GCC emits at -O3 after IPA-CP clones the function as run.constprop.isra.

Must merge before #6164, which switches ITK.Linux.Python CI from MinSizeRel to Release and is currently red because of these exact warnings.

Why the existing in-header diagnostic-pop doesn't help

Modules/ThirdParty/Eigen3/src/itkeigen/Eigen/src/Core/products/SelfadjointMatrixVector.h already wraps its body in:

#if defined(__GNUC__) && !defined(__clang__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif

That suppression is stateful at parse time — it applies while GCC parses this header. The actual warnings are emitted from selfadjoint_matrix_vector_product::run.constprop.isra, a cloned function GCC creates during interprocedural constant propagation at -O2/-O3. The diagnostic state at the point GCC analyses the clone is whatever the call-site translation unit had active, not the suppression baked into the header. This is a long-standing GCC interaction with -fipa-cp-clone.

The proper fix is therefore at the source: give GCC IPA-CP a clearly-zero starting packet so it doesn't ever build a "maybe uninit" hypothesis in the first place.

What changed (2-line diff)
     Scalar t2(0);
-    Packet ptmp2 = pset1<Packet>(t2);
+    Packet ptmp2 = pzero(Packet{});
     Scalar t3(0);
-    Packet ptmp3 = pset1<Packet>(t3);
+    Packet ptmp3 = pzero(Packet{});

Algorithmic behaviour is unchanged: pset1<Packet>(0) and pzero(Packet{}) both produce a zero packet. Upstream Eigen master uses the latter form because:

  • Packet{} is value-initialization (explicit zero, visible to the optimizer as a constant).
  • pzero is a small inline that returns a zero packet of the deduced type.
  • The combination collapses cleanly under IPA-CP, leaving no "maybe uninit" hypothesis.

pzero already exists in our vendored Eigen at Modules/ThirdParty/Eigen3/src/itkeigen/Eigen/src/Core/GenericPacketMath.h:425, so no new symbols are needed.

Upstream Eigen master also did a separate, larger refactor of this same kernel (2-column → 4-column processing, four accumulators ptmp4ptmp7). That refactor is intentionally not included here — it's a bigger change with its own performance trade-offs and reviewer surface, orthogonal to fixing the warning.

Why this affects ITK.Linux.Python specifically

ITK.Linux.Python builds the SuperBuild with ITK_WRAP_PYTHON=ON, which generates many SWIG wrapper translation units that include Eigen and instantiate selfadjoint_matrix_vector_product. The combination of (a) -O3 (Release), (b) GCC's -fipa-cp-clone enabled at that level, and (c) the high template instantiation count is what makes the warnings reproducible there. MinSizeRel (-Os) typically disables IPA-CP cloning, which is why the warnings only appeared after #6164's MinSizeRel→Release switch.

Empirical CDash evidence from #6164 build #14600: 0 errors, 4 warnings, 175 tests passed. The warning text was:

SelfadjointMatrixVector.h:231: warning: '_28' may be used uninitialized [-Wmaybe-uninitialized]
SelfadjointMatrixVector.h:231: warning: 'result_90' may be used uninitialized [-Wmaybe-uninitialized]

_28 and result_90 are GCC SSA temporaries inside the cloned function; they correspond to ptmp2/ptmp3 after IPA-CP rewriting.

Test plan

@github-actions github-actions Bot added type:Compiler Compiler support or related warnings area:ThirdParty Issues affecting the ThirdParty module labels Apr 29, 2026
@hjmjohnson hjmjohnson marked this pull request as ready for review April 29, 2026 10:46
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This is a minimal, targeted two-line fix in the vendored Eigen SelfadjointMatrixVector kernel. It replaces pset1<Packet>(t2/t3) (where t2/t3 are zero-initialized scalars) with pzero(Packet{}), producing semantically identical zero-valued packet accumulators while giving GCC's IPA-CP pass a clearly-constant zero that eliminates false-positive -Wmaybe-uninitialized warnings in -O3 builds. pzero is already present in the vendored GenericPacketMath.h (line 425), and the scalar variables t2/t3 continue to serve their own role as scalar accumulators in the loop body.

Confidence Score: 5/5

Safe to merge — the two-line change is semantically equivalent and directly matches the upstream Eigen fix.

pzero(Packet{}) and pset1<Packet>(Scalar(0)) produce identical zero-valued packets; pzero is already present in the vendored Eigen headers; scalar accumulators t2/t3 are unaffected; no algorithmic behaviour changes. The fix is well-motivated, minimal, and aligned with upstream Eigen master.

No files require special attention.

Important Files Changed

Filename Overview
Modules/ThirdParty/Eigen3/src/itkeigen/Eigen/src/Core/products/SelfadjointMatrixVector.h Replaces pset1<Packet>(t2/t3) with pzero(Packet{}) for zero-initialized packet accumulators; semantically identical but eliminates GCC IPA-CP false-positive -Wmaybe-uninitialized warnings at -O3.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["selfadjoint_matrix_vector_product::run()"] --> B["Initialize ptmp0 = pset1(t0)\nInitialize ptmp1 = pset1(t1)"]
    B --> C["Initialize ptmp2 = pzero(Packet{})\n(was: pset1(t2) where t2=0)\nInitialize ptmp3 = pzero(Packet{})\n(was: pset1(t3) where t3=0)"]
    C --> D{"FirstTriangular?"}
    D -- Yes --> E["Accumulate t3 scalar\nAccumulate ptmp2/ptmp3 packets\nvia aligned vectorized loop"]
    D -- No --> F["Accumulate t2 scalar\nAccumulate ptmp2/ptmp3 packets\nvia aligned vectorized loop"]
    E --> G["Reduce packets → scalars\nt2 += predux(ptmp2)\nt3 += predux(ptmp3)"]
    F --> G
    G --> H["Write results back to res[j], res[j+1]"]
Loading

Reviews (1): Last reviewed commit: "COMP: Use pzero() for SelfadjointMatrixV..." | Re-trigger Greptile

GCC -O3 IPA-CP emits false-positive -Wmaybe-uninitialized on the
cloned `selfadjoint_matrix_vector_product::run.constprop.isra`:
`pset1<Packet>(Scalar(0))` is opaque to interprocedural constant
propagation, while `pzero(Packet{})` (used by upstream Eigen master)
is visible as a constant. Backport just the ptmp2/ptmp3 init lines.
@hjmjohnson hjmjohnson force-pushed the eigen3-pzero-init-selfadjoint branch from e4799ec to 91f6232 Compare April 29, 2026 10:50
@hjmjohnson hjmjohnson requested a review from dzenanz April 29, 2026 11:12
@hjmjohnson
Copy link
Copy Markdown
Member Author

@dzenanz Changing from RelMinSize -> Release in #6164 invoked the IPO toolchain that exposed an eigen warning. This PR backports a more robust eigen variable initialization that propagates through the IPO tool chain to eliminate the warnings.

Copy link
Copy Markdown
Member

@dzenanz dzenanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on a glance.

@hjmjohnson hjmjohnson merged commit 527b6c7 into InsightSoftwareConsortium:main Apr 29, 2026
13 of 16 checks passed
hjmjohnson added a commit that referenced this pull request May 1, 2026
Follow-up to #6166. On ubuntu-24.04 + GCC 13/14 + C++20 (the toolchain
that #6164 introduces), GCC -O3 IPA-CP emits additional false-positive
-Wmaybe-uninitialized at the outer selfadjoint_product_impl::run()
call site, attributing the read to "argument 3 of type const float *"
of the cloned selfadjoint_matrix_vector_product::run.constprop.isra.

Add pzero(Packet{}) declare-then-assign init for the four Packet
locals (A0i, A1i, Bi, Xi) loaded from the lhs/rhs/res pointers inside
the inner loop. Mirrors the #6166 ptmp2/ptmp3 pattern: pzero() is
visible to GCC's IPA-CP as a definite zero starting state, while a
bare declaration plus subsequent ploadu/pload is opaque and lets the
optimizer trace an uninit path through cloned variants.
@hjmjohnson hjmjohnson added this to the ITK 6.0 Release Candidate 1 milestone May 5, 2026
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 6, 2026
…en3-pzero-init-selfadjoint

COMP: Use pzero() for SelfadjointMatrixVector accumulators (silences GCC -O3 IPA-CP false positives)
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 6, 2026
Follow-up to InsightSoftwareConsortium#6166. On ubuntu-24.04 + GCC 13/14 + C++20 (the toolchain
that InsightSoftwareConsortium#6164 introduces), GCC -O3 IPA-CP emits additional false-positive
-Wmaybe-uninitialized at the outer selfadjoint_product_impl::run()
call site, attributing the read to "argument 3 of type const float *"
of the cloned selfadjoint_matrix_vector_product::run.constprop.isra.

Add pzero(Packet{}) declare-then-assign init for the four Packet
locals (A0i, A1i, Bi, Xi) loaded from the lhs/rhs/res pointers inside
the inner loop. Mirrors the InsightSoftwareConsortium#6166 ptmp2/ptmp3 pattern: pzero() is
visible to GCC's IPA-CP as a definite zero starting state, while a
bare declaration plus subsequent ploadu/pload is opaque and lets the
optimizer trace an uninit path through cloned variants.
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 12, 2026
…en3-pzero-init-selfadjoint

COMP: Use pzero() for SelfadjointMatrixVector accumulators (silences GCC -O3 IPA-CP false positives)
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 12, 2026
Follow-up to InsightSoftwareConsortium#6166. On ubuntu-24.04 + GCC 13/14 + C++20 (the toolchain
that InsightSoftwareConsortium#6164 introduces), GCC -O3 IPA-CP emits additional false-positive
-Wmaybe-uninitialized at the outer selfadjoint_product_impl::run()
call site, attributing the read to "argument 3 of type const float *"
of the cloned selfadjoint_matrix_vector_product::run.constprop.isra.

Add pzero(Packet{}) declare-then-assign init for the four Packet
locals (A0i, A1i, Bi, Xi) loaded from the lhs/rhs/res pointers inside
the inner loop. Mirrors the InsightSoftwareConsortium#6166 ptmp2/ptmp3 pattern: pzero() is
visible to GCC's IPA-CP as a definite zero starting state, while a
bare declaration plus subsequent ploadu/pload is opaque and lets the
optimizer trace an uninit path through cloned variants.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:ThirdParty Issues affecting the ThirdParty module type:Compiler Compiler support or related warnings

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants