pytorch/cmake at 202f83dc4ed9a2fcc7ea43fef61fbcad0c2ee987

frozenleaves/pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Jerry Mannil 202f83dc4e [ROCm][layer_norm] Use __builtin_amdgcn_rcpf(x) instead of 1.f/x (#165589)

Replace (more) exact calculation with hardware approximation.

Benefits:
Reduced code size.
Improved performance for certain scenarios.

Experiments show low reduction in precision.
Experiments show no significant performance regressions. bfloat16 as well as float16 related calculations may benefit largely from this change.

Co-author: @mhalk @amd-hhashemi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165589
Approved by: https://github.com/jeffdaily

2025-10-17 09:12:30 +00:00
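
The substitution described in the commit message above can be sketched as follows. This is a minimal illustration, not the code from the pull request: the helper names `fast_rcp` and `rcp_rel_error` and the macro `RCP_HOST_DEVICE` are hypothetical, and the sketch assumes that the compiler defines `__AMDGCN__` when targeting AMD GPUs and `__HIPCC__` when compiling HIP sources. On any other target it falls back to the plain division `1.f/x`.

```cpp
#include <cmath>

// Hypothetical qualifier macro (not from the PyTorch source): makes the
// helpers callable from HIP kernels, and expands to nothing elsewhere.
#if defined(__HIPCC__)
#define RCP_HOST_DEVICE __host__ __device__
#else
#define RCP_HOST_DEVICE
#endif

// Hypothetical helper: reciprocal of x.
// On AMD GPU device code this uses the hardware approximation named in the
// commit title; on every other target it uses the division it replaces.
RCP_HOST_DEVICE inline float fast_rcp(float x) {
#if defined(__AMDGCN__)
  return __builtin_amdgcn_rcpf(x);  // approximate hardware reciprocal
#else
  return 1.0f / x;                  // plain floating-point division
#endif
}

// Hypothetical helper: relative error of fast_rcp(x) against 1.f/x,
// useful for checking how much precision the approximation gives up.
RCP_HOST_DEVICE inline float rcp_rel_error(float x) {
  const float exact = 1.0f / x;
  return std::fabs(fast_rcp(x) - exact) / std::fabs(exact);
}
```

In a host-only build both paths are the same division, so `rcp_rel_error` reports zero. Only device code compiled for an AMD GPU would show the precision difference the commit message refers to, and compiler fast-math settings may affect what the reference division itself lowers to.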

[ROCm][Windows] Enable AOTriton runtime compile on Windows (#165538)

2025-10-16 19:51:43 +00:00

[Intel GPU] Upgrade OneDNN XPU Tag to v3.9.1 (#161932)

2025-09-04 11:05:10 +00:00

Modules_CUDA_fix

[2/N] Remove FindPackageHandleStandardArgs.cmake (#156559)

2025-07-24 02:34:10 +00:00

[CMake] Remove forcing of -O2 from torch_compile_options (#164894)

2025-10-10 04:43:53 +00:00

Allowlist.cmake

Replace whitelist with allowlist (#42067)

2020-07-28 08:01:16 -07:00

BLAS_ABI.cmake

[submodule] Bump fbgemm to latest (#158210)

2025-08-11 13:48:02 +00:00

BuildVariables.cmake

Remove Caffe2_MAIN_LIBS (#38408)

2020-05-15 12:27:15 -07:00

Caffe2Config.cmake.in

xpu: improve error handling and reporting in XPU cmake files (#149353)

2025-03-20 02:00:39 +00:00

cmake_uninstall.cmake.in

Formatting cmake (to lowercase without space for if/elseif/else/endif) (#35521)

2020-03-27 14:25:17 -07:00

Codegen.cmake

[ATen][CUDA] CUTLASS matmuls: add sm_103a flag (#162956)

2025-09-16 10:29:55 +00:00

DebugHelper.cmake

[BE] fix typos in cmake/ (#156079)

2025-06-17 19:25:43 +00:00

Dependencies.cmake

[ROCm][layer_norm] Use __builtin_amdgcn_rcpf(x) instead of 1.f/x (#165589)

2025-10-17 09:12:30 +00:00

FlatBuffers.cmake

[pytorch][PR] Add ability for a mobile::Module to save as flatbuffer (#70201)

2022-01-12 16:30:39 -08:00

IncludeSource.cpp.in

CMake: Include instead of copying cpu kernel files (#67656)

2021-11-30 19:13:53 -08:00

iOS.cmake

[BE] fix typos in cmake/ (#156079)

2025-06-17 19:25:43 +00:00

Metal.cmake

[Build] Allow metal shaders to include ATen headers (#156256)

2025-06-18 01:03:25 +00:00

MiscCheck.cmake

[submodule] Bump fbgemm to latest (#158210)

2025-08-11 13:48:02 +00:00

prioritized_text.txt

Revert "[BE] Remove HermeticPyObjectTLS and Simplify PythonOpRegistrationTrampoline (#163464)"

2025-09-30 18:20:20 +00:00

ProtoBuf.cmake

[Reland] Use 3.27 as the minimum CMake version (#154783)

2025-06-14 16:37:51 +00:00

ProtoBufPatch.cmake

Migrate PyTorch to C++17 (#85969)

2022-12-08 02:27:48 +00:00

Summary.cmake

[ROCm][layer_norm] Use __builtin_amdgcn_rcpf(x) instead of 1.f/x (#165589)

2025-10-17 09:12:30 +00:00

TorchConfig.cmake.in

Revert "Simplify nvtx3 CMake handling, always use nvtx3 (#153784)"

2025-06-24 20:02:07 +00:00

TorchConfigVersion.cmake.in

Formatting cmake (to lowercase without space for if/elseif/else/endif) (#35521)

2020-03-27 14:25:17 -07:00

VulkanCodegen.cmake

[BE][CMake] Use FindPython module (#124613)

2024-05-29 13:17:35 +00:00

VulkanDependencies.cmake

[Vulkan] Remove GLSL Code Gen (#91912)

2023-01-10 20:29:47 +00:00