eqy 790763b0fe Add an option to disable reduced precision reductions for FP16 GEMM (#67946)
Summary:
https://github.com/pytorch/pytorch/issues/67578 disabled reduced precision reductions for FP16 GEMMs. After benchmarking, we've found that this has substantial performance impacts for common GEMM shapes (e.g., those found in popular instantiations of multi-headed attention) on architectures such as Volta. As these performance regressions may come as a surprise to current users, this PR adds a toggle, `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction`, to disable reduced precision reductions rather than making the disabled behavior the default.

CC ngimel ptrblck
stas00: Note that the behavior after the previous PR can be replicated with
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`
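A minimal usage sketch of the toggle (assuming a CUDA-capable PyTorch build; the tensor shapes are illustrative only):

```python
import torch

# The toggle defaults to True, so FP16 GEMMs may use reduced precision
# reductions where the backend supports them.
print(torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction)

# Opt out (replicates the behavior introduced by the previous PR):
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False

# Subsequent FP16 matmuls now request full-precision reductions.
a = torch.randn(128, 64, device="cuda", dtype=torch.float16)
b = torch.randn(64, 32, device="cuda", dtype=torch.float16)
c = a @ b
```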

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
2021-11-09 17:27:20 -08:00