Skip to content

Handle reductions in get_insn_access_map#1009

Open
majosm wants to merge 2 commits into
inducer:mainfrom
majosm:fix-get-insn-access-map
Open

Handle reductions in get_insn_access_map#1009
majosm wants to merge 2 commits into
inducer:mainfrom
majosm:fix-get-insn-access-map

Conversation

@majosm
Copy link
Copy Markdown
Contributor

@majosm majosm commented May 12, 2026

Currently, get_insn_access_map only passes the inames from within_inames in the call to get_access_map. As a result, if reductions are present get_access_map will may (edit: it only happens if the reduction domain is separate from the element domain, otherwise knl.get_inames_domain() picks up the reduction) fail, e.g.:

unable to determine access range of subscript: [_pt_temp_12_dim0, _pt_sum_r0_8] (encountered ExpressionToAffineConversionError: could not convert expression '_pt_sum_r0_8' to affine representation: UnknownVariableError: cannot find '_pt_sum_r0_8' in context (pass it in on evaluation))

This causes _compute_isinfusible_via_access_map to return True whenever reductions are present (code), which subsequently prevents get_kennedy_unweighted_fusion_candidates from fusing loops that it potentially could.

A real world example of this can be seen in the generated code for a DG operator here. I've annotated with # DESCRIPTION: what DG operations are being done in each loop in the device code. The first 3 loop nests are performing face-local work on the interior faces (the iel loop with 5641 iterations). Since the operations are face-local, the iel loops should be able to be fused, but the reductions over the idof axes are preventing that from happening due to this issue.

This PR changes get_insn_access_map to get_insn_access_maps. It now computes separate access maps for accesses outside of reductions + each different reduction level present, by traversing the instruction and updating the domain for reductions accordingly (AFAIK, they cannot all be unioned together due to the different spaces involved). It additionally modifies compute_isinfusible_via_access_map to do the unioning after projection once it's safe to do so.

With this change, I'm now seeing fusion of the element loops as expected: code.

@majosm majosm marked this pull request as draft May 12, 2026 20:53
@majosm majosm marked this pull request as ready for review May 12, 2026 21:27
@majosm
Copy link
Copy Markdown
Contributor Author

majosm commented May 12, 2026

I can't request review on this repo, so @kaushikcfd @inducer this is probably ready for a glance.

@majosm
Copy link
Copy Markdown
Contributor Author

majosm commented May 12, 2026

I should mention: this is more or less my first time doing anything nontrivial with isl, so extra scrutiny is likely warranted. 🙂

One thing I'm not clear on is whether I should be using project_out_except after get_inames_domain (e.g., in _InstructionAccessMapCollector.map_reduction).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant