[Quant] lower fused LinearTanh for onednn backend (#89188)

**Summary**
Add the fuser method and quantization mappings needed to fuse and lower `Linear` + `Tanh` for int8 inference with the onednn backend. The fusion and lowering are supported only in FX mode.
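For context, a minimal sketch of how a fused pattern like this is exercised through the FX graph-mode quantization APIs (the `LinearTanhModel` here mirrors the test model added in the diff below; the API calls are the standard `torch.ao.quantization` entry points, and whether the pattern actually lowers to a single fused int8 op depends on this PR's mappings being present in the build):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class LinearTanhModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(5, 5)
        self.tanh = nn.Tanh()

    def forward(self, x):
        return self.tanh(self.linear(x))

# Select the onednn quantized engine and its default qconfigs
torch.backends.quantized.engine = "onednn"
qconfig_mapping = get_default_qconfig_mapping("onednn")

model = LinearTanhModel().eval()
example_inputs = (torch.rand(1, 5),)

# FX mode: insert observers, calibrate, then convert to int8
prepared = prepare_fx(model, qconfig_mapping, example_inputs)
prepared(*example_inputs)  # calibration pass
quantized = convert_fx(prepared)
```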

**Test plan**
`python test_quantization.py TestFuseFx TestQuantizeFx`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89188
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Author: Xia, Weiwen
Date: 2022-12-18 10:49:24 +08:00
Committed by: PyTorch MergeBot
Parent: 666d218055
Commit: a5eb564ba4
7 changed files with 167 additions and 63 deletions


```diff
@@ -1406,6 +1406,20 @@ class LinearBnLeakyReluModel(torch.nn.Module):
     def get_example_inputs(self) -> Tuple[Any, ...]:
         return (torch.rand(1, 5),)
 
+class LinearTanhModel(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.linear = nn.Linear(5, 5)
+        self.tanh = nn.Tanh()
+
+    def forward(self, x):
+        x = self.linear(x)
+        x = self.tanh(x)
+        return x
+
+    def get_example_inputs(self) -> Tuple[Any, ...]:
+        return (torch.rand(1, 5),)
+
 # TODO: self.fc should be self.conv
 class ConvReluModel(torch.nn.Module):
     def __init__(self):
```