Simplify ck_gemm_a8w8_blockscale GemmSpecialization construction#3813

Open
jbelloncastro wants to merge 1 commit into ROCm:main from jbelloncastro:jorgeb/gemmspec-bitset
@jbelloncastro

Motivation

Simplifies the logic behind constructing an enum value for GemmSpecialization.

Technical Details

The global unordered_map variable is constructed during library initialization and performs memory allocations. Neither is necessary to find the right configuration from the dimension and block size parameters.

The change reassigns the enum values so that they can be treated as a bitmask. Each padding condition simply sets its bit in the mask when it applies.

The value reassignment makes chains of conditionals slightly more complicated, but that added complexity is smaller than the previous cost of constructing the enum value.
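
As a rough illustration of the bitmask idea described above, the following is a minimal sketch and not the code from this PR. The enum GemmSpec, its bit assignment, the function select_gemm_spec, the helper has_padding, and the "dimension not divisible by block size" padding rule are all hypothetical stand-ins chosen for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for GemmSpecialization. The enumerator names follow
// the usual padding variants; the bit assignment is an assumption.
enum class GemmSpec : std::uint8_t {
    Default    = 0,
    MPadding   = 1 << 0,
    NPadding   = 1 << 1,
    KPadding   = 1 << 2,
    MNPadding  = MPadding | NPadding,
    MKPadding  = MPadding | KPadding,
    NKPadding  = NPadding | KPadding,
    MNKPadding = MPadding | NPadding | KPadding,
};

// Builds the specialization by OR-ing one bit per dimension that needs
// padding. No lookup table and no heap allocation are involved.
// Assumed rule: a dimension needs padding when it is not a multiple of its
// block size.
constexpr GemmSpec select_gemm_spec(int m, int n, int k,
                                    int m_per_block, int n_per_block,
                                    int k_per_block) {
    unsigned mask = 0;
    if (m % m_per_block != 0) mask |= static_cast<unsigned>(GemmSpec::MPadding);
    if (n % n_per_block != 0) mask |= static_cast<unsigned>(GemmSpec::NPadding);
    if (k % k_per_block != 0) mask |= static_cast<unsigned>(GemmSpec::KPadding);
    return static_cast<GemmSpec>(mask);
}

// Tests whether a given padding bit is set in a specialization value.
constexpr bool has_padding(GemmSpec spec, GemmSpec bit) {
    return (static_cast<unsigned>(spec) & static_cast<unsigned>(bit)) != 0;
}
```

Under these assumptions, a call such as select_gemm_spec(100, 200, 512, 128, 128, 128) yields GemmSpec::MNPadding, since only M and N fail the divisibility check. Because the function is constexpr, it needs no static initialization at library load time.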

Test Plan

Test Result

Submission Checklist

@jbelloncastro jbelloncastro requested a review from a team June 19, 2026 15:50
@github-actions

Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label | Tests
ci:triton-300x | Run an additional Triton test job on MI300X in PRs; main branch always runs both MI35X and MI300X
ci:sglang | SGLang integration tests: DeepSeek-R1-MXFP4 accuracy, Qwen 3.5 accuracy
ci:atom | ATOM benchmark: DeepSeek-R1-0528, GPT-OSS-120B
ci:atom_full | ATOM accuracy suite for PR and main models from ATOM models_accuracy.json
ci:vllm | vLLM benchmark: GPT-OSS-120B, DeepSeek-R1-0528, Kimi-K2.5
ci:all | All standard extended tests (excludes ci:atom_full)

Only add ci:atom_full for FlyDSL or Triton upgrades.
Add labels via the sidebar or gh pr edit 3813 --add-label <label>
