Split gpu passes into seperate pipelines#4904
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors the GPU target compilation pass list into several named sub-pipelines, making it easier to adjust the pass sequence for different compilation modes without changing the underlying passes.
Changes:
- Introduces a local
pipeline_factoryto group GPU passes into logical pipelines (dynamic shapes, required, optimize/rewrite, fusion, backend). - Reassembles the final pass vector by concatenating these pipelines (preserving the existing pass order).
| std::vector<std::vector<pass>> pipelines = { | ||
| p.dynamic_shapes_pipeline(), | ||
| p.required_pipeline(), | ||
| p.optimize_rewrite_pipeline(), | ||
| p.fusion_pipeline(), | ||
| p.backend_pipeline(), | ||
| }; | ||
| // clang-format on | ||
|
|
||
| std::vector<pass> passes; | ||
| std::copy(pipelines.begin(), pipelines.end(), join_back_inserter(passes)); | ||
| return passes; |
TedThemistokleous
left a comment
There was a problem hiding this comment.
I like the idea of this PR as it provides a cleaner mental model of how passes for GPU interact and gives us flexibility to play with ordering. Certainly good from a context what fits where.
My comments are more about
- Grouping - Quantization ops themselves in a separate pass - Allows us to toggle later
- Outstanding fusion passes (fuse_horizontal) outside of the fusion_pipeline() pass
- General question about dead code elimination (DCE) with respect to passes or back to back. - Possible to optimize back
- Might be a nitpick - Naming of required_pipeline - we should make this "core" if that's or key piece to optimize a model.
Feel free to comment inline here or in github comments.
| fp8_ocp_to_fnuz{}), | ||
| enable_pass(not gpu::gfx_has_fp8ocp_intrinsics() and gpu::gfx_has_fp8fnuz_intrinsics(), | ||
| dead_code_elimination{}), | ||
| simplify_qdq{.use_mx_quant = gpu::gfx_has_mx_intrinsics()}, |
There was a problem hiding this comment.
Do we want to maybe split out the blocks for quantitation here too?
rewrite_quantization, simplify_qdq, fp8 related stuff can be bundled
| dead_code_elimination{}, | ||
| prefuse_ops{get_context()}, | ||
| dead_code_elimination{}, | ||
| dead_code_elimination{}, |
There was a problem hiding this comment.
I know this in the older change-set but this looks odd to me. Is there a reason we're doing this twice vs interleave after every pass?
There was a problem hiding this comment.
I think we moved a pass and didnt remove the extra DCE.
|
|
||
| std::vector<std::vector<pass>> pipelines = { | ||
| p.dynamic_shapes_pipeline(), | ||
| p.required_pipeline(), |
There was a problem hiding this comment.
Name seems odd compared to the others. Can we call this maybe "core pipeline"
| optimize_module{}, | ||
| layout_convolution{.channels_last = enabled(MIGRAPHX_ENABLE_NHWC{})}, | ||
| dead_code_elimination{}, | ||
| enable_pass(disabled(MIGRAPHX_ENABLE_FULL_DYNAMIC{}), fuse_horizontal{}), |
There was a problem hiding this comment.
Is there any way we can put this in the fusion_pipeline pipeline then? Does ordering matter for this one?
There was a problem hiding this comment.
It shouldn't go in the fusion pipeline. And ordering does matter for this,
TedThemistokleous
left a comment
There was a problem hiding this comment.
Approving this since the intent is to not change the passes here.
kahmed10
left a comment
There was a problem hiding this comment.
I think we should still mention this in the changelog somewhere because this grouping could be referenced for future versions.
We could look into having the ability/API for disabling passes within a pipeline in the future for finer-grained control.
Otherwise LGTM
I dont think that makes sense at all, and there is no selection in the changelog for no functionality changes.
We could add to the changelog when we expose those things to the API as that will add new functionality, but its very unlikely we would, and I am strongly opposed to exposing this to the API. That is the whole reason for adding the compilation modes. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #4904 +/- ##
========================================
Coverage 92.66% 92.66%
========================================
Files 588 588
Lines 30412 30412
========================================
Hits 28180 28180
Misses 2232 2232 🚀 New features to boost your workflow:
|
Ok that's fine. I thought we wanted the pipelines to be configurable. But if we are just using it as a way to support different compilation modes, that's fine too. |
Motivation
This will make it easier to use to change the passes needed for different compilation modes.
Technical Details
None of the passes are changed here. In the future we could look at moving some of the passes around to put into another pipeline that makes more sense.
Changelog Category
Add a
CHANGELOG.mdentry for any option other thanNot Applicable