Optimize SiLU (Swish) op in PyTorch (#42976)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42976

Optimize SiLU (Swish) op in PyTorch.
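
For reference, SiLU is defined as x * sigmoid(x). Below is a minimal Python sketch of the forward and backward math (illustration only; the function names are ad hoc and this is not the optimized kernel added by this PR):

import torch

def silu_reference(x):
    # SiLU / Swish: x * sigmoid(x)
    return x * torch.sigmoid(x)

def silu_backward_reference(grad_out, x):
    # d/dx [x * sigmoid(x)] = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
    s = torch.sigmoid(x)
    return grad_out * s * (1 + x * (1 - s))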

Some benchmark results (times are before -> after this change):

input = torch.rand(1024, 32768, dtype=torch.float, device="cpu")
forward: 221ms -> 133ms
backward: 600ms -> 170ms

input = torch.rand(1024, 32768, dtype=torch.double, device="cpu")
forward: 479ms -> 297ms
backward: 1438ms -> 387ms

input = torch.rand(8192, 32768, dtype=torch.float, device="cuda")
forward: 24.34ms -> 9.83ms
backward: 97.05ms -> 29.03ms

input = torch.rand(4096, 32768, dtype=torch.double, device="cuda")
forward: 44.24ms -> 30.15ms
backward: 126.21ms -> 49.68ms
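
The exact benchmark harness is not included in the PR; the sketch below is only an assumption of how such numbers could be collected (helper name and iteration count are illustrative):

import time
import torch
import torch.nn.functional as F

def time_silu(x, iters=10):
    # warm-up and gradient buffer
    y = F.silu(x)
    grad = torch.ones_like(y)
    if x.is_cuda:
        torch.cuda.synchronize()

    # forward
    start = time.perf_counter()
    for _ in range(iters):
        y = F.silu(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    fwd = (time.perf_counter() - start) / iters

    # backward (reuses the last forward's graph)
    start = time.perf_counter()
    for _ in range(iters):
        y.backward(grad, retain_graph=True)
    if x.is_cuda:
        torch.cuda.synchronize()
    bwd = (time.perf_counter() - start) / iters
    return fwd, bwd

x = torch.rand(1024, 32768, dtype=torch.float, device="cpu", requires_grad=True)
print(time_silu(x))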

Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "SiLU"

Reviewed By: houseroad

Differential Revision: D23093593

fbshipit-source-id: 1ba7b95d5926c4527216ed211a5ff1cefa3d3bfd
Author: Xiaomeng Yang
Date: 2020-08-16 13:19:03 -07:00
Committed by: Facebook GitHub Bot
Commit: 4ae832e106 (parent: d4c5f561ec)
14 changed files with 192 additions and 28 deletions

@@ -57,6 +57,7 @@ torch::nn::RReLU|Yes|No
 torch::nn::SELU|Yes|No
 torch::nn::CELU|Yes|No
 torch::nn::GELU|Yes|No
+torch::nn::SiLU|Yes|No
 torch::nn::Sigmoid|Yes|No
 torch::nn::Softplus|Yes|No
 torch::nn::Softshrink|Yes|No
@@ -183,6 +184,7 @@ F::prelu|Yes|No
 F::rrelu|Yes|No
 F::glu|Yes|No
 F::gelu|Yes|No
+F::silu|Yes|No
 F::logsigmoid|Yes|No
 F::hardshrink|Yes|No
 F::tanhshrink|Yes|No
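
For context, the op behind the new parity entries is exposed on the Python side as torch.nn.SiLU and torch.nn.functional.silu; a minimal usage sketch:

import torch
import torch.nn.functional as F

x = torch.randn(4, 8)
m = torch.nn.SiLU()                                      # module form
assert torch.allclose(m(x), F.silu(x))                   # functional form matches
assert torch.allclose(F.silu(x), x * torch.sigmoid(x))   # definition: x * sigmoid(x)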