Pull requests: ROCm/aiter
[TRITON] Tuned DSV4-Flash FP8 GEMM configs
#3814
opened Jun 19, 2026 by
skysnow2001
Contributor
1 task done
Simplify ck_gemm_a8w8_blockscale GemmSpecialization construction
#3813
opened Jun 19, 2026 by
jbelloncastro
1 task done
[FlyDSL] Fix MoE 2-stage bf16 weight buffer overflow for weights >4GiB
#3812
opened Jun 19, 2026 by
yueliu14
Qwen3.5-397B-A17B MXFP4: add tuned flydsl fused-MoE config (MI355X)
#3809
opened Jun 19, 2026 by
jiangyon-amd
1 task
fix(fmoe tune): dedup fused/non-fused stage1 separately
#3805
opened Jun 18, 2026 by
rbrugaro-amd
Contributor
Draft
[HIP][FLYDSL]: add multi-backend prefill causal_conv1d kernels for GDN
#3803
opened Jun 18, 2026 by
yiijin
Contributor
[feature] Extract C++ code to jinja template files
#3801
opened Jun 18, 2026 by
jbelloncastro
1 task done
[gfx950] Add JIT grouped_gemm_mxfp8 for MXFP8 prefill MoE
#3800
opened Jun 18, 2026 by
fanxingran
1 task
[MI450]tune Dsv4 config for gemm_a8w8_blockscale
#3798
opened Jun 18, 2026 by
Dewei-Wang-sh
Contributor
Draft
idxsknorm_shuffle_layout support shuffle kv cache layout
#3795
opened Jun 18, 2026 by
ganyi1996ppo
Contributor
1 task
[Triton] [GFX1250] BF16 GEMM add DSV4 config and add kernel_type in config
#3790
opened Jun 18, 2026 by
k50112113
Contributor
[FlyDSL] Port compress_attn kernels to gfx1250 (wave32)
#3787
opened Jun 18, 2026 by
jli-melchior
Contributor
1 task
[fea] Add fp32 RMSNorm output for fused qk group quant
#3785
opened Jun 18, 2026 by
wuhuikx
Contributor
[Small_M_GEMM_GroupGEMM_MXFP8] Decode small-M MX-FP8 GEMM and GroupGEMM kernels for gfx950
#3783
opened Jun 17, 2026 by
JohnQinAMD
Contributor
1 task
perf: use vectorized LDS loads for mhc_pre_gemm_sqrsum on gfx942
#3781
opened Jun 17, 2026 by
kudomcho
1 task done
[module_fused_split_gdr_update] refactor
#3777
opened Jun 17, 2026 by
amd-ruitang3
Contributor
1 task
[Triton] Detect rocm-core version via rpm on RPM-based systems
ci:triton-300x
#3775
opened Jun 17, 2026 by
mengfei-jiang
Contributor
1 task
fix: disable EP topk-1 strip
ci:atom
ci:triton-300x
ci:triton-355
#3771
opened Jun 17, 2026 by
JiaoliangYu
1 task
[FlyDSL AOT] Parallelize standalone main() compile drivers
#3769
opened Jun 17, 2026 by
zhiding512
Contributor
[gfx1250][flydsl]moe group gemm swiglu limit for dsv4
#3767
opened Jun 17, 2026 by
Zzz9990
Contributor
1 task