[Quant] lower fused LinearTanh for onednn backend (#89188)

**Summary**
Add the fuser method and quantization mappings needed to fuse and lower `Linear` + `Tanh` for int8 inference with the onednn backend. The fusion and lowering are supported only in FX mode.
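For context, a minimal sketch of how a fused pattern like this is exercised through the FX graph-mode quantization APIs (the `LinearTanhModel` here mirrors the test model added in the diff below; the API calls are the standard `torch.ao.quantization` entry points, and whether the pattern actually lowers to a single fused int8 op depends on this PR's mappings being present in the build):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class LinearTanhModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(5, 5)
        self.tanh = nn.Tanh()

    def forward(self, x):
        return self.tanh(self.linear(x))

# Select the onednn quantized engine and its default qconfigs
torch.backends.quantized.engine = "onednn"
qconfig_mapping = get_default_qconfig_mapping("onednn")

model = LinearTanhModel().eval()
example_inputs = (torch.rand(1, 5),)

# FX mode: insert observers, calibrate, then convert to int8
prepared = prepare_fx(model, qconfig_mapping, example_inputs)
prepared(*example_inputs)  # calibration pass
quantized = convert_fx(prepared)
```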

**Test plan**
`python test_quantization.py TestFuseFx TestQuantizeFx`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89188
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Author: Xia, Weiwen
Date: 2022-12-18 10:49:24 +08:00
Committed by: PyTorch MergeBot
Parent: 666d218055
Commit: a5eb564ba4
7 changed files with 167 additions and 63 deletions


```diff
@@ -1406,6 +1406,20 @@ class LinearBnLeakyReluModel(torch.nn.Module):
     def get_example_inputs(self) -> Tuple[Any, ...]:
         return (torch.rand(1, 5),)
 
+class LinearTanhModel(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.linear = nn.Linear(5, 5)
+        self.tanh = nn.Tanh()
+
+    def forward(self, x):
+        x = self.linear(x)
+        x = self.tanh(x)
+        return x
+
+    def get_example_inputs(self) -> Tuple[Any, ...]:
+        return (torch.rand(1, 5),)
+
 # TODO: self.fc should be self.conv
 class ConvReluModel(torch.nn.Module):
     def __init__(self):
```