[ROCm] Enable USE_FBGEMM_GENAI (#160676)
Summary:
X-link: https://github.com/pytorch/FBGEMM/pull/4703
X-link: https://github.com/facebookresearch/FBGEMM/pull/1728
In this diff we enable support for the new FBGEMM-backed FP8 `_scaled_grouped_mm` on ROCm. For now we only enable support for `gfx942`, as that is the architecture on which we have thoroughly tested performance and correctness.
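As a hedged sketch of the user-facing call this enables (not part of this commit; the scale layout, shape constraints, and the `float8_e4m3fnuz` dtype for gfx942 are assumptions drawn from PyTorch's grouped-GEMM tests, and the private-op signature may differ across versions):

```python
import torch

dev = "cuda"  # ROCm devices are exposed through the CUDA device API
G, M, K, N = 4, 64, 128, 256  # K and N kept multiples of 16 for FP8 kernels

# 2D "jagged" activations stacked over groups; 3D weights, one [K, N] per group.
a = torch.randn(G * M, K, device=dev).to(torch.float8_e4m3fnuz)
# Weights must be column-major in the last two dims, hence the transpose.
b = torch.randn(G, N, K, device=dev).to(torch.float8_e4m3fnuz).transpose(-2, -1)
# offs gives the exclusive end row of each group within `a`.
offs = torch.arange(M, G * M + 1, M, device=dev, dtype=torch.int32)
# Assumed rowwise scales: one float32 scale per row of `a`, and per output
# column of each group of `b`, both flattened to 1D.
scale_a = torch.rand(G * M, device=dev, dtype=torch.float32)
scale_b = torch.rand(G * N, device=dev, dtype=torch.float32)

out = torch._scaled_grouped_mm(a, b, scale_a, scale_b, offs=offs,
                               out_dtype=torch.bfloat16)
print(out.shape)  # [G * M, N]
```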
Rollback Plan:
Differential Revision: D79564024
Test Plan:
Ensure the build succeeds with (see the sketch after this list):
- `USE_FBGEMM_GENAI=1` and without `gfx942`
- `USE_FBGEMM_GENAI=1` and with `gfx942`
- `USE_FBGEMM_GENAI=1` and all current [`PYTORCH_ROCM_ARCH`](9491d289b3/.ci/docker/libtorch/build.sh (L48))
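A hypothetical driver for those three configurations (`USE_FBGEMM_GENAI` and `PYTORCH_ROCM_ARCH` are PyTorch's real build variables; the arch strings and the in-tree `setup.py` invocation are illustrative assumptions):

```python
import os
import subprocess

# Build PyTorch three times with FBGEMM GenAI enabled: without gfx942,
# with gfx942 only, and with a multi-arch list. Run from a pytorch checkout.
for arch in ("gfx90a",                  # without gfx942
             "gfx942",                  # with gfx942
             "gfx90a;gfx942;gfx1100"):  # stand-in for the full arch list
    env = dict(os.environ, USE_FBGEMM_GENAI="1", PYTORCH_ROCM_ARCH=arch)
    subprocess.run(["python", "setup.py", "develop"], env=env, check=True)
```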
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160676
Approved by: https://github.com/drisspg
Committed by: PyTorch MergeBot
Parent: 890626632d
Commit: 69a25f6888
```diff
@@ -880,10 +880,15 @@ cmake_dependent_option(
   USE_FBGEMM_GENAI
   "Whether to build FBGEMM GenAI quantized GEMM kernels.\
     Will be disabled if not supported by the platform"
-  OFF
-  "USE_CUDA OR USE_ROCM"
+  ON
+  "USE_ROCM"
   OFF)
 
+IF(USE_FBGEMM_GENAI AND USE_ROCM AND NOT "gfx942" IN_LIST PYTORCH_ROCM_ARCH)
+  message(WARNING "Unsupported ROCM arch for FBGEMM GenAI, will set USE_FBGEMM_GENAI to OFF")
+  set(USE_FBGEMM_GENAI off)
+endif()
+
 # CAVEAT: Again, Flash Attention2 will error while building for sm52 while Mem
 # Eff Attention won't
 cmake_dependent_option(
```
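The net effect of this hunk: ROCm builds now default `USE_FBGEMM_GENAI` to `ON`, but the guard turns it back `OFF` unless `gfx942` appears in `PYTORCH_ROCM_ARCH`. As a hedged illustration (not from this commit), a runtime check for whether the visible GPU is a gfx942 part:

```python
import torch

# gcnArchName is the ROCm device-properties field reporting the GPU arch,
# e.g. 'gfx942:sramecc+:xnack-'; the getattr default covers non-ROCm builds.
props = torch.cuda.get_device_properties(0)
print(getattr(props, "gcnArchName", "n/a").startswith("gfx942"))
```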