[ROCm] Enable USE_FBGEMM_GENAI (#160676)

Summary:
X-link: https://github.com/pytorch/FBGEMM/pull/4703

X-link: https://github.com/facebookresearch/FBGEMM/pull/1728

This diff enables support for the new FBGEMM-backed FP8 `_scaled_grouped_mm` on ROCm. For now, support is enabled only for `gfx942`, since that is the architecture on which we have thoroughly tested performance and correctness.

Rollback Plan:

Differential Revision: D79564024

Test Plan:

Ensure the build succeeds with:
- `USE_FBGEMM_GENAI=1` and without gfx942
- `USE_FBGEMM_GENAI=1` and with gfx942
- `USE_FBGEMM_GENAI=1` and all current [`PYTORCH_ROCM_ARCH`](9491d289b3/.ci/docker/libtorch/build.sh (L48))
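
The three build configurations above can be summarized with a small sketch. It is an illustration only, not the actual build logic: the helper name `genai_kernels_expected` is hypothetical, the sketch assumes `PYTORCH_ROCM_ARCH` is a semicolon-separated list of architectures, and it assumes that a build without `gfx942` still succeeds but leaves the GenAI kernels out.

```python
# Hypothetical helper, not part of PyTorch or FBGEMM.
# Assumption: gfx942 is the only ROCm architecture with GenAI kernel support,
# as stated in the summary of this commit.
SUPPORTED_GENAI_ARCHS = {"gfx942"}


def genai_kernels_expected(use_fbgemm_genai: str, pytorch_rocm_arch: str) -> bool:
    """Return True if a ROCm build is expected to include the GenAI kernels.

    use_fbgemm_genai: value of the USE_FBGEMM_GENAI variable ("1" or "0").
    pytorch_rocm_arch: value of PYTORCH_ROCM_ARCH, assumed here to be a
        semicolon-separated list such as "gfx90a;gfx942".
    """
    if use_fbgemm_genai != "1":
        return False
    # Split the list and drop empty entries left by stray separators.
    archs = {a.strip() for a in pytorch_rocm_arch.split(";") if a.strip()}
    return bool(archs & SUPPORTED_GENAI_ARCHS)


if __name__ == "__main__":
    for arch_list in ("gfx90a;gfx1100", "gfx942", "gfx90a;gfx942;gfx1100"):
        print(arch_list, genai_kernels_expected("1", arch_list))
```

Under these assumptions, only the second and third configurations would be expected to contain the FP8 `_scaled_grouped_mm` kernels; the first should still build cleanly.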

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160676
Approved by: https://github.com/drisspg
Chris Thi
2025-09-04 07:13:17 +00:00
committed by PyTorch MergeBot
parent 890626632d
commit 69a25f6888
3 changed files with 19 additions and 10 deletions

@@ -58,8 +58,8 @@
# USE_FBGEMM=0
# disables the FBGEMM build
#
-# USE_FBGEMM_GENAI=1
-# enables the FBGEMM GenAI kernels to build
+# USE_FBGEMM_GENAI=0
+# disables the FBGEMM GenAI build
#
# USE_KINETO=0
# disables usage of libkineto library for profiling