[Intel GPU] Support RegisterXPU.cpp codegen and compile for the in-tree XPU structured GEMM OPs. (#139025)


Motivation: ATen ops for XPU come in two parts: in-tree ops, such as the GEMM-related ops, and out-of-tree ops in torch-xpu-ops. For the in-tree part, since PyTorch registers ops through native_functions.yaml and ships convenient codegen capabilities, we want to take advantage of these benefits as well.
At the same time, since AOT Inductor also uses native_functions.yaml to generate C shim wrappers, we need to enable this mechanism for XPU too.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139025
Approved by: https://github.com/EikanWang, https://github.com/jansel, https://github.com/desertfire
xinan.lin
2024-11-08 18:04:34 -08:00
committed by PyTorch MergeBot
parent 0b650c360a
commit 929a647363
9 changed files with 157 additions and 47 deletions


@@ -346,6 +346,9 @@ def is_ufunc_dispatch_key(dk: DispatchKey) -> bool:
     return dk in UFUNC_DISPATCH_KEYS
 
 
+dispatch_device_map = {is_cuda_dispatch_key: "cuda", is_xpu_dispatch_key: "xpu"}
+
+
 # This is oddly named ScalarType and not DType for symmetry with C++
 class ScalarType(Enum):
     Byte = auto()
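
For context, the dispatch_device_map added in the hunk above pairs dispatch-key predicates with device strings. The sketch below is not the actual torchgen code; it only illustrates, under that assumption, how such a map could be consumed to resolve the device name for a given dispatch key. The DispatchKey enum and the helper device_for_dispatch_key are simplified stand-ins invented for this example.

# Illustrative sketch only -- simplified stand-ins, not torchgen internals.
from enum import Enum, auto
from typing import Callable, Optional


class DispatchKey(Enum):
    CPU = auto()
    CUDA = auto()
    XPU = auto()


def is_cuda_dispatch_key(dk: DispatchKey) -> bool:
    return dk is DispatchKey.CUDA


def is_xpu_dispatch_key(dk: DispatchKey) -> bool:
    return dk is DispatchKey.XPU


# Map each predicate to the device string a code generator would emit.
dispatch_device_map: dict[Callable[[DispatchKey], bool], str] = {
    is_cuda_dispatch_key: "cuda",
    is_xpu_dispatch_key: "xpu",
}


def device_for_dispatch_key(dk: DispatchKey) -> Optional[str]:
    """Return the device string for the first matching predicate, if any."""
    for predicate, device in dispatch_device_map.items():
        if predicate(dk):
            return device
    return None


assert device_for_dispatch_key(DispatchKey.XPU) == "xpu"
assert device_for_dispatch_key(DispatchKey.CPU) is None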