Summary:
This PR upgrades oneDNN to v2.5.2 and adds the build-system support needed for oneDNN v2.5.2.

v2.4 changes:
- Improved performance for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is disabled by default and should be enabled via CPU dispatcher control (see the example below).
- Improved binary primitive performance for cases when one of the tensors is broadcasted.
- Improved performance of the reduction, reorder, and shuffle primitives.
- Improved performance of depthwise convolution forward propagation for processors with Intel AVX-512 support.
- Improved performance of the forward inner product primitive for shapes with minibatch equal to 1 for processors with Intel AVX-512 support.
- Improved performance of int8 matmul and inner product primitives for processors with Intel AVX2 and Intel DL Boost support.

v2.5 changes:
- Improved performance for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is now enabled by default and requires Linux kernel 5.16.
- Improved performance of the matmul primitive for processors with Intel AVX-512 support.

v2.5.2 changes:
- Fixed a performance regression in the binary primitive with broadcast.
- Fixed a segmentation fault in the depthwise convolution primitive for shapes with huge spatial size for processors with Intel AVX-512 support.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71546

Reviewed By: george-qi

Differential Revision: D33827108

Pulled By: VitalyFedyunin

fbshipit-source-id: 8f5a19b331c82af5b0783f081e061e1034a93952
(cherry picked from commit 9705212fe9b7b0838cc010d040c37d1175be83ce)
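As a quick sanity check for this upgrade (not part of the PR itself), here is a minimal sketch that verifies a local PyTorch build actually picked up oneDNN; the version reported by `torch.__config__.show()` should reflect v2.5.2 once this change is in the build.

```python
import torch

# PyTorch exposes oneDNN under the historical "mkldnn" name; this confirms
# the build was compiled with oneDNN support at all.
assert torch.backends.mkldnn.is_available()

# The build configuration dump includes the oneDNN version that was linked in;
# after this upgrade the relevant line should report v2.5.2.
print(torch.__config__.show())
```

Regarding the "CPU dispatcher control" mentioned in the v2.4 notes: in oneDNN this is usually exercised through the `ONEDNN_MAX_CPU_ISA` environment variable (for example, `ONEDNN_MAX_CPU_ISA=AVX512_CORE` to cap dispatch at plain AVX-512) or the `dnnl::set_max_cpu_isa()` API before the first primitive is created; the exact set of accepted ISA values depends on the oneDNN release in use.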