[feature] Extract C++ code to jinja template files by jbelloncastro · Pull Request #3801 · ROCm/aiter

jbelloncastro · 2026-06-18T10:10:49Z

Motivation

This PR aims to improve readability and maintainability by separating the code-generation logic from the contents of the generated C++ files.

This is only a feature proposal, so the changes are confined to a single source directory (csrc/ck_gemm_a8w8/).

Technical Details

The C++ code is moved into Jinja template files. This makes the Python code easier to read, since it only needs to produce the appropriate data structures while the C++ code lives in its own files. The changes so far are confined to csrc/ck_gemm_a8w8/, but the approach applies to the other csrc/ subdirectories as well. It could also enable reuse across modules, since many of the generated files have a very similar structure.
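As a rough illustration of the split described above (not the actual code in this PR), the sketch below keeps the C++ text in a Jinja template and has Python supply only plain data. The template name, the `KernelInstance` fields, the kernel names, and the header name are all hypothetical. The template is held in an in-memory `DictLoader` to keep the example self-contained; with templates stored as files, a `FileSystemLoader` would be the natural replacement.

```python
from dataclasses import dataclass

from jinja2 import DictLoader, Environment, StrictUndefined


@dataclass
class KernelInstance:
    # Hypothetical fields, chosen only for illustration.
    name: str
    m_per_block: int
    n_per_block: int


# Hypothetical template text standing in for a template file on disk.
TEMPLATES = {
    "instances.cpp.jinja": (
        "// Generated file, do not edit.\n"
        '#include "{{ header }}"\n'
        "\n"
        "{% for k in kernels -%}\n"
        "template void {{ k.name }}<{{ k.m_per_block }}, {{ k.n_per_block }}>();\n"
        "{% endfor %}"
    )
}

# StrictUndefined turns a misspelled or missing template variable into an error
# instead of silently rendering an empty string.
env = Environment(loader=DictLoader(TEMPLATES), undefined=StrictUndefined)

# The Python side only builds data structures; no C++ text appears here.
kernels = [
    KernelInstance("gemm_a8w8_256x128", 256, 128),
    KernelInstance("gemm_a8w8_128x64", 128, 64),
]

rendered = env.get_template("instances.cpp.jinja").render(
    header="gemm_a8w8_common.h",  # hypothetical header name
    kernels=kernels,
)
print(rendered)
```

Because the template only depends on the shape of the data it receives, the same template could in principle be rendered by other modules that produce similarly structured instance lists.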

Test Plan

Generate the source files. The resulting files should be nearly identical to the originals and should also compile successfully.

GPU_ARCHS='gfx950;gfx942' python gen_instances.py -w /tmp -f ../../aiter/configs/a8w8_tuned_gemm.csv
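The PR does not say how the generated files were compared with the originals. One possible way, sketched here under that assumption, is to generate into two directories (one from the original generator, one from the template-based one) and diff them file by file. The function name and the `ref`/`gen` directory names are hypothetical, and the small demo at the end uses throwaway files rather than real generator output.

```python
import difflib
import tempfile
from pathlib import Path


def compare_trees(reference: Path, generated: Path) -> dict[str, list[str]]:
    """Map each differing or one-sided file to a short report (a unified diff)."""
    ref_files = {p.relative_to(reference) for p in reference.rglob("*") if p.is_file()}
    gen_files = {p.relative_to(generated) for p in generated.rglob("*") if p.is_file()}
    report: dict[str, list[str]] = {}
    for rel in sorted(ref_files | gen_files):
        if rel not in ref_files:
            report[rel.as_posix()] = ["only in generated tree"]
            continue
        if rel not in gen_files:
            report[rel.as_posix()] = ["only in reference tree"]
            continue
        old = (reference / rel).read_text().splitlines()
        new = (generated / rel).read_text().splitlines()
        diff = list(difflib.unified_diff(old, new, "reference", "generated", lineterm=""))
        if diff:
            report[rel.as_posix()] = diff
    return report


# Tiny demo on temporary files (hypothetical contents, not real output).
with tempfile.TemporaryDirectory() as tmp:
    ref, gen = Path(tmp, "ref"), Path(tmp, "gen")
    ref.mkdir()
    gen.mkdir()
    (ref / "a.cpp").write_text("int x = 1;\n")
    (gen / "a.cpp").write_text("int x = 1;\n")
    (ref / "b.cpp").write_text("int y = 2;\n")
    (gen / "b.cpp").write_text("int y = 3;\n")
    report = compare_trees(ref, gen)

for name, lines in report.items():
    print(name)
    print("\n".join(lines))
```

An empty report means every file matches. Since the test plan only expects the output to be nearly identical, any remaining entries (for example whitespace-only changes) would still need a manual look.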

CI should not report any test regressions.

Test Result

All files match

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

github-actions · 2026-06-18T10:11:24Z

🏷️ CI Guide

Runs automatically on every PR:

✅ Pre-checks (submodule verification, code formatting)
✅ Aiter op tests (gfx942 + gfx950)
✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label	Tests
`ci:triton-300x`	Run an additional Triton test job on MI300X in PRs; main branch always runs both MI35X and MI300X
`ci:sglang`	SGLang integration tests: DeepSeek-R1-MXFP4 accuracy, Qwen 3.5 accuracy
`ci:atom`	ATOM benchmark: DeepSeek-R1-0528, GPT-OSS-120B
`ci:atom_full`	ATOM accuracy suite for PR and main models from ATOM `models_accuracy.json`
`ci:vllm`	vLLM benchmark: GPT-OSS-120B, DeepSeek-R1-0528, Kimi-K2.5
`ci:all`	All standard extended tests (excludes `ci:atom_full`)

Only add `ci:atom_full` for FlyDSL or Triton upgrades.
Add labels via the sidebar or `gh pr edit 3801 --add-label <label>`

Clicked the wrong button

jbelloncastro added 4 commits June 17, 2026 18:32

CK GEMM A8W8

bbaac06

Add missing template

42838a2

Fixes

c1c3c11

Formatting

a318d32

jbelloncastro requested a review from a team June 18, 2026 10:10

pschlan-amd previously approved these changes Jun 19, 2026

Merge branch 'main' into jorgeb/codegen-jinja

762e3e7

jbelloncastro added 2 commits June 19, 2026 15:28

Fix tuple key nesting

46a5956

Fix ruff and black checks

f67140a

jbelloncastro wants to merge 7 commits into ROCm:main from jbelloncastro:jorgeb/codegen-jinja

2 participants
