842 Commits

2293ab4e53 [quant][graphmode][fx] Refactor convert for linear to use get_static_module_mapping and get_dynamic_module_mapping (#60151)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60151

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29188264

fbshipit-source-id: d2b77ffcf4b7446fc6c43248e43218092d2a6aea
2021-06-20 19:41:16 -07:00
47d727fe1b [quant][graphmode][fx] Produce conv reference static quant modules (#60138)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60138

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29184791

fbshipit-source-id: 971a40012dbba0cf687c62a3a4af9358513c253b
2021-06-20 19:25:45 -07:00
a029422cae [quant][graphmode][fx][refactor] Change the env map to add dtype as a key (#60054)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60054

Previously, env in convert was Dict[str, Tuple[Node, torch.dtype]], that is, at a given time each node can only have one dtype. This causes a problem for the following case:
```
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 1)

    def forward(self, x):
        x = self.conv(x)
        x1 = x.expand_as(x)
        x2 = torch.add(x, x1)
        return x2
```

Observed graph:
```
def forward(self, x):
    x = self.activation_post_process_0(x)
    x = self.conv(x)
    x = self.activation_post_process_1(x)
    x1 = x.expand_as(x)
    x1 = self.activation_post_process_2(x1)
    x2 = torch.add(x, x1)
    x2 = self.activation_post_process_3(x2)
    return x2
```

Quantized graph:
```
def forward(self, x):
    x = torch.quantize_per_tensor(x, ...)
    x = self.conv(x)  # quantized conv
    x = torch.dequantize(x)
    x1 = x.expand_as(x)
    x1 = torch.quantize_per_tensor(x1, ...)
    # Error: x is dequantized
    x2 = torch.ops.quantized.add(x, x1)
    return x2
```

Currently env is a map from a node name in the observed graph to the Node in the quantized graph. The problem is that after the quantized conv we have two operators: one expects float input (expand_as) and the other expects quantized input (quantized add). In the quantized graph, ideally, expand_as should consume the dequantized output and quantized add should consume the quantized output:

quantized_conv - dequantize - expand_as
  \ ------- quantized_add

But currently in env, each node needs to be either quantized or not quantized. Therefore we need to change env to include the dtype as well:
env: Dict[str, Dict[dtype, Node]], e.g. {'x': {torch.float: dequantized_node, torch.quint8: quantized_node}}
When we load from the env, we also need to provide the dtype of the Node that we want to load. We can have a separate pass to figure out this information for each node.
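
As a rough illustration of the new layout (a minimal sketch, not the actual convert code; the helper name and signature are hypothetical):
```
import torch
from typing import Dict
from torch.fx import Node

# env maps an observed-graph node name to one quantized-graph node per dtype,
# so a float consumer and a quantized consumer can each get the right version.
env: Dict[str, Dict[torch.dtype, Node]] = {}

def load_node(env: Dict[str, Dict[torch.dtype, Node]], name: str, dtype: torch.dtype) -> Node:
    # Hypothetical lookup helper: a separate pass is expected to have populated
    # both the float and the quantized entry for nodes that need them.
    return env[name][dtype]
```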

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29149408

fbshipit-source-id: c9e4b7d65444ab6a6f573929bae1db5037629892
2021-06-18 13:31:43 -07:00
5a45103139 ns for fx: add API usage logging (#60103)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60103

Adds internal logging for NS for FX API usage.

Test Plan: CI

Reviewed By: jerryzh168

Differential Revision: D29166710

fbshipit-source-id: 2a1bf2f6038b0c6c5945b57b2db2de25c585a04a
2021-06-18 10:25:59 -07:00
d5988c5eca remove unused type: ignore directives (#60006)
Summary:
During development it is common practice to put `type: ignore` comments on lines that are correct, but that `mypy` doesn't recognize as such. This often stems from the fact that the `mypy` version in use wasn't able to handle the pattern.

With every new release `mypy` gets better at handling complex code. In addition to fixing the previously accepted but now failing patterns, we should also revisit all `type: ignore` comments to see whether they are still needed. Fortunately, we don't need to do this manually: by adding `warn_unused_ignores = True` to the configuration, `mypy` will error out whenever it encounters a `type: ignore` that is no longer needed.
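
For illustration, the same check can be run ad hoc through mypy's Python API (a hedged sketch; PyTorch's CI actually drives mypy through its configuration files, and the target path here is just an example):
```
from mypy import api

# --warn-unused-ignores reports `type: ignore` comments that are no longer
# needed; api.run returns (stdout, stderr, exit_status).
stdout, stderr, exit_status = api.run(["--warn-unused-ignores", "torch/quantization/"])
print(stdout)
```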

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60006

Reviewed By: jbschlosser, malfet

Differential Revision: D29133237

Pulled By: albanD

fbshipit-source-id: 41e82edc5cd5affa7ccedad044b59b94dad4425a
2021-06-18 07:23:31 -07:00
c0b7c59e55 [quant] Equalization Observer modifications (#59953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59953

The following modifications were made to the equalization
observers due to design changes:
- [InputEqualizationObserver] Replaced `calculate_qparams()` with
`calculate_scaled_minmax()` since we will need to return the scaled
min/max values to update the following input quantization observer
- [WeightEqualizationObserver] We no longer need a row observer since
this will be taken care of by the following weight quantization observer
- [WeightEqualizationObserver] Following the previous comment, we no
longer need to calculate the scaled qparam values. Instead, we will use
the equalization scale to later scale the weights and the qparams will
be taken care of by the weight quantization observer.
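
As a rough sketch of the first change above (illustrative only, not the actual observer code; the function and argument names are assumptions):
```
import torch

def calculate_scaled_minmax_sketch(min_inputs, max_inputs, equalization_scale):
    # Scale the recorded per-channel input range by the equalization scale and
    # reduce to a single min/max, which the following input quantization
    # observer can then use when computing its qparams.
    scaled_min = torch.min(min_inputs * equalization_scale)
    scaled_max = torch.max(max_inputs * equalization_scale)
    return scaled_min, scaled_max
```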

Test Plan:
`python test/test_quantization.py
TestEqualizeFx.test_input_weight_eq_observer`

Imported from OSS

Reviewed By: supriyar

Differential Revision: D29135332

fbshipit-source-id: be7e468273c8b62fc183b1e1ec50f6bd6d8cf831
2021-06-16 22:32:30 -07:00
45c31cabb5 [quant] Input Weight Equalization - prepare modifications (#59747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59747

Modifies prepare_fx for input-weight equalization. If the current
node is being equalized (i.e. an EqualizationQConfig exists for it), then an
EqualizationObserver is inserted before its quantization observer.

For a single linear layer, the general flow looks like:
Original graph: `x0 -> linear -> x1`, `w -> linear`
After prepare: `x0 -> InpEqObs -> MinMaxObs -> linear -> MinMaxObs -> x1`,
  `w -> WeightEqObs -> MinMaxObs -> linear`

For two connected linear layers, the general flow looks like:
Original graph: `x0 -> linear1 -> linear2 -> x1`,
  `w1 -> linear1`, `w2 -> linear2`
After prepare: `x0 -> InpEqObs -> MinMaxObs -> linear1 -> MinMaxObs -> InpEqObs -> linear2 -> MinMaxObs -> x1`
  `w1 -> WeightEqObs -> MinMaxObs -> linear1`, `w2 -> WeightEqObs -> MinMaxObs -> linear2`

Test Plan:
`python test/test_quantization.py
TestEqualizeFx.test_input_equalization_prepare`

Original model with one `nn.Linear` layer
```
LinearModule(
  (linear): Linear(in_features=1, out_features=1, bias=True)
)
```

Graph after `prepare_fx`:
```
graph():
    %x : [#users=1] = placeholder[target=x]
    %x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
    %x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
    %linear : [#users=1] = call_module[target=linear](args = (%x_activation_post_process_0,), kwargs = {})
    %linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
    return linear_activation_post_process_0
```
--------------------------------------

Original model with two connected functional linear layers
```
FunctionalLinearModule(
  (linear1): Linear()
  (linear2): Linear()
)
```

Graph after `prepare_fx`:
```
graph():
    %x : [#users=1] = placeholder[target=x]
    %x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
    %x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
    %linear1_w : [#users=1] = get_attr[target=linear1.w]
    %linear1_w_equalization_process_0 : [#users=1] = call_module[target=linear1_w_equalization_process_0](args = (%linear1_w,), kwargs = {})
    %linear1_w_activation_post_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_00](args = (%linear1_w_equalization_process_0,), kwargs = {})
    %linear1_b : [#users=1] = get_attr[target=linear1.b]
    %linear : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%x_activation_post_process_0, %linear1_w_activation_post_process_0), kwargs = {bias: %linear1_b})
    %linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
    %linear_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0_equalization_process_0](args = (%linear_activation_post_process_0,), kwargs = {})
    %linear2_w : [#users=1] = get_attr[target=linear2.w]
    %linear2_w_equalization_process_0 : [#users=1] = call_module[target=linear2_w_equalization_process_0](args = (%linear2_w,), kwargs = {})
    %linear2_w_activation_post_process_0 : [#users=1] = call_module[target=linear2_w_activation_post_process_00](args = (%linear2_w_equalization_process_0,), kwargs = {})
    %linear2_b : [#users=1] = get_attr[target=linear2.b]
    %linear_1 : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%linear_activation_post_process_0_equalization_process_0, %linear2_w_activation_post_process_0), kwargs = {bias: %linear2_b})
    %linear_1_activation_post_process_0 : [#users=1] = call_module[target=linear_1_activation_post_process_0](args = (%linear_1,), kwargs = {})
    return linear_1_activation_post_process_0
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D29135316

fbshipit-source-id: 91697e805ede254dbb2a42ee4c23eb1c1c64590e
2021-06-16 22:32:28 -07:00
7ce74f3339 [quant] EqualizationQConfig to distinguish input/output activations (#59739)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59739

Created an EqualizationQConfig specifically for equalization.
This inherits from QConfig and is used to distinguish between inserting
an input observer and inserting an output observer. Since the output observer
field is included in the EqualizationQConfig, we no longer need an
output observer field in the _InputEqualizationObserver.
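
A minimal sketch of the idea (field names and the commented-out usage are assumptions for illustration, not necessarily the exact ones in the PR):
```
from collections import namedtuple

# A QConfig-like container that carries the equalization observers; mirroring
# QConfig, each field holds an observer constructor rather than an instance.
EqualizationQConfig = namedtuple("EqualizationQConfig", ["input_activation", "weight"])

# Hypothetical usage, assuming the equalization observers expose with_args()
# like the regular observers do:
# eq_qconfig = EqualizationQConfig(
#     input_activation=_InputEqualizationObserver.with_args(dtype=torch.quint8),
#     weight=_WeightEqualizationObserver.with_args(dtype=torch.qint8),
# )
```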

Test Plan:
compiles

Imported from OSS

Reviewed By: ezyang

Differential Revision: D29135298

fbshipit-source-id: 3dde9c029c291467ff0a0845f0fc9c44573fc6f6
2021-06-16 22:31:18 -07:00
a344b09db2 [quant][fx][graphmode] Remove Quantizer class (#59606)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59606

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28951432

fbshipit-source-id: 3301f7200a4c7166673c27f9ac7ff559f1e6935d
2021-06-15 21:54:57 -07:00
864d129bae [quant][fx] Remove extra q-dq for weight bias in normalization ops (#59882)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59882

Currently, for normalization ops, the weight and bias arguments are treated as activation inputs which require observers.
This results in extra quant-dequant ops being added for the weight and bias inputs.

This PR adds support for skipping observation of the weight/bias inputs of norm operators, thus removing the redundant q-dq ops.

Quantized graph with F.layer_norm
Before this PR
```
def forward(self, x):
    _input_scale_0 = self._input_scale_0
    _input_zero_point_0 = self._input_zero_point_0
    quantize_per_tensor = torch.quantize_per_tensor(x, _input_scale_0, _input_zero_point_0, torch.quint8);  x = _input_scale_0 = _input_zero_point_0 = None
    scale = self.scale
    _input_scale_1 = self._input_scale_1
    _input_zero_point_1 = self._input_zero_point_1
    quantize_per_tensor_1 = torch.quantize_per_tensor(scale, _input_scale_1, _input_zero_point_1, torch.quint8);  scale = _input_scale_1 = _input_zero_point_1 = None
    bias = self.bias
    _input_scale_2 = self._input_scale_2
    _input_zero_point_2 = self._input_zero_point_2
    quantize_per_tensor_2 = torch.quantize_per_tensor(bias, _input_scale_2, _input_zero_point_2, torch.quint8);  bias = _input_scale_2 = _input_zero_point_2 = None
    _scale_0 = self._scale_0
    _zero_point_0 = self._zero_point_0
    dequantize = quantize_per_tensor_1.dequantize();  quantize_per_tensor_1 = None
    dequantize_1 = quantize_per_tensor_2.dequantize();  quantize_per_tensor_2 = None
    layer_norm = torch.ops.quantized.layer_norm(quantize_per_tensor, [2, 5, 5], weight = dequantize, bias = dequantize_1, eps = 1e-05, output_scale = _scale_0, output_zero_point = _zero_point_0);  quantize_per_tensor = dequantize = dequantize_1 = _scale_0 = _zero_point_0 = None
    dequantize_2 = layer_norm.dequantize();  layer_norm = None
    return dequantize_2
```
After
```
def forward(self, x):
    _input_scale_0 = self._input_scale_0
    _input_zero_point_0 = self._input_zero_point_0
    quantize_per_tensor = torch.quantize_per_tensor(x, _input_scale_0, _input_zero_point_0, torch.quint8);  x = _input_scale_0 = _input_zero_point_0 = None
    scale = self.scale
    bias = self.bias
    _scale_0 = self._scale_0
    _zero_point_0 = self._zero_point_0
    layer_norm = torch.ops.quantized.layer_norm(quantize_per_tensor, [2, 5, 5], weight = scale, bias = bias, eps = 1e-05, output_scale = _scale_0, output_zero_point = _zero_point_0);  quantize_per_tensor = scale = bias = _scale_0 = _zero_point_0 = None
    dequantize = layer_norm.dequantize();  layer_norm = None
    return dequantize
```

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_norm_weight_bias

Imported from OSS

Reviewed By: HDCharles, ailzhang

Differential Revision: D29068203

fbshipit-source-id: 24b5c38bbea5fd355d34522bfa654c9db18607da
2021-06-11 16:22:36 -07:00
d75e99b709 fx quant: enable qconfig_dict to target function invocations by order (#59605)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59605

Enables targeting of individual function invocations by execution order.
For example, given a module such as

```
class M1(torch.nn.Module):
  def forward(self, x):
    x = torch.add(x, x)
    x = torch.add(x, x)
    return x

class M2(torch.nn.Module):
  def __init__(self):
    super().__init__()
    self.m1 = M1()

  def forward(self, x):
    x = self.m1(x)
    return x
```

We can now target the first add of `m1` with

```
qconfig_dict = {
  "module_name_function_order": ("m1", torch.add, 0, custom_qconfig),
}
```
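
For context, a hedged sketch of how such a qconfig_dict might flow through the FX prepare/convert APIs (the value format for the new key is copied from the example above and may differ from the final API; the global qconfig entry is an assumption):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

custom_qconfig = get_default_qconfig("fbgemm")
qconfig_dict = {
    "": get_default_qconfig("fbgemm"),  # global qconfig
    "module_name_function_order": ("m1", torch.add, 0, custom_qconfig),
}

model = M2().eval()
prepared = prepare_fx(model, qconfig_dict)
# ... run calibration data through `prepared` ...
quantized = convert_fx(prepared)
```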

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_qconfig_module_name_function_order
```

Imported from OSS

Reviewed By: hx89

Differential Revision: D28951077

fbshipit-source-id: 311d423724a31193d4fa4bbf3a712b46464b5a29
2021-06-11 08:53:40 -07:00
0099c25b85 fx quant: remove some dead code in observer insertion (redo) (#59799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59799

This is a redo of #58574, easier to create a new PR than to fix rebase
conflicts, as there have been a large number of refactors to the
underlying code.

Removes some code which was incorrectly added by #57519 but never
actually used for anything.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D29031955

fbshipit-source-id: f407d181070cb283382965952821e3647c705544
2021-06-10 12:57:09 -07:00
61965abad7 Move _PartialWrapper to module scope (#59660)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59660

Context https://github.com/pytorch/pytorch/issues/57352

Test Plan: Pytorch CI tests

Reviewed By: vkuzo

Differential Revision: D28972991

fbshipit-source-id: efc9dd3e90e18e1cdf27d5ef0f168abd8169bc42
2021-06-09 11:55:04 -07:00
7dac2987ce [quant][eager][fix] Fix a typo in convert function in eager mode quantization (#59571)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59571

Test Plan:
python test/test_quantization.py TestPostTrainingStatic.test_custom_module_class

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28938355

fbshipit-source-id: 566daeb07d616ae40e52754d3d4581f75f248f04
2021-06-08 10:24:22 -07:00
cc03ea2c47 [quant] Implemented InputWeightObserver for Linear inputs
Summary: Implemented two observers (InputEqualObserver and WeightEqualObserver) which will be inserted into the graph during prepare_fx().

Test Plan: python test/test_quantization.py TestEqualizeFx

Reviewed By: supriyar

Differential Revision: D28836954

fbshipit-source-id: 25517dc82ae67698ed8b2dc334e3323286976104
2021-06-07 11:19:43 -07:00
1aa14fcb14 Fix the "tensors to be on the same device" error in HistogramObserver (#59234)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59075

This PR fixes the "tensors to be on the same device" error in `HistogramObserver`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59234

Reviewed By: jbschlosser

Differential Revision: D28837572

Pulled By: vkuzo

fbshipit-source-id: ff7c3229ced7de2cdd8f76d526f0fd33ac643216
2021-06-03 13:30:56 -07:00
18642e664a [quant][graphmode][fx][refactor] Split quantize.py to prepare.py and convert.py (#59353)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59353

Next: remove Quantizer class

Test Plan: Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D28856277

fbshipit-source-id: 25f5502be387dbe9706780f667501b46b82789a5
2021-06-02 23:52:39 -07:00
87a25e09f4 [quant][graphmode][fx][refactor] Remove _convert from Quantizer class (#59042)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59042

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724867

fbshipit-source-id: 9f87d51020caa20d5408cb2820947e23d92d5fc3
2021-06-02 08:50:56 -07:00
3218d890dd [quant][graphmode][fx][fix] Fix support for custom module (#59041)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59041

Static quantization support for custom modules was removed in a previous refactor
(https://github.com/pytorch/pytorch/pull/57519) since it was not covered by the test case.
This PR re-enables the test case and fixes the support.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724866

fbshipit-source-id: 1974675b88b56a2173daf86965d6f3fb7ebd783b
2021-06-01 22:31:15 -07:00
06af7618e7 [quant][graphmode][fx][refactor] Remove Quantizer class from convert (QuantizeHandler) (#59040)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59040

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724870

fbshipit-source-id: c0f748711b825cd46bdfcc05c054c77a41e8207a
2021-06-01 22:00:49 -07:00
50e6ee3ca2 [quant][graphmode][fx][refactor] Remove Quantizer class from quantize_node (#59039)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59039

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724874

fbshipit-source-id: bd984716b2da1d6879c3e92fa827574783a41567
2021-06-01 21:40:08 -07:00
1d37f41567 [quant][graphmode][fx][refactor] Remove _prepare from Quantizer class (#59038)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59038

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724869

fbshipit-source-id: e8501c9720b5ddb654e78bc8fa08de0466c1d52b
2021-06-01 18:01:22 -07:00
20348fb32e [quant][graphmode][fx][refactor] Remove find_matches from Quantizer class (#59037)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59037

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724865

fbshipit-source-id: 6c6824d0af7dd47d4c111d6a08e373bc65f33e08
2021-06-01 16:07:07 -07:00
7d64fc675b [quant][graphmode][fx][refactor] Remove fold_weights from Quantizer class (#59036)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59036

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724862

fbshipit-source-id: 5900420127fcc14846bc34c9ac29ff7e6a703f1e
2021-06-01 15:52:57 -07:00
cc4891804c [quant][graphmode][fx][refactor] Remove save_state and restore_state from Quantizer class (#59035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59035

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724872

fbshipit-source-id: d32752c635917c9820e5e7cc414ba9d48a258a19
2021-06-01 15:38:36 -07:00
3d521e8b40 [quant][graphmode][fx][refactor] Remove prepare_custom_config from Quantizer class (#59034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59034

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724873

fbshipit-source-id: 870e0822843ad1d035f41eaa015bdde9ccf6ec23
2021-06-01 14:52:22 -07:00
e4b2684331 [quant][graphmode][fx][refactor] Remove patterns from Quantizer class (#59033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59033

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724861

fbshipit-source-id: 97b38e851b6bf581510a24636b1d8d6f1d977f5a
2021-06-01 13:44:08 -07:00
83892c1861 [quant][graphmode][fx][refactor] Remove node_name_to_scope from Quantizer (#59032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59032

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724868

fbshipit-source-id: 6df639f20076b480812b6dcf0fc7d2c87ca29d8b
2021-06-01 13:26:09 -07:00
3826f7e8e0 [quant][graphmode][fx][refactor] Remove quantized_graph from Quantizer (#59031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59031

Trying to remove Quantizer class and split prepare and convert code

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724871

fbshipit-source-id: dad0332ba271c4cfb6ec1e8f2036443149b5bea4
2021-06-01 13:01:54 -07:00
1b4586ee20 [quant][gx][graphmode][refactor] Remove modules from Quantizer (#59030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59030

Trying to remove Quantizer class and split prepare and convert code

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724875

fbshipit-source-id: d6610c1d5eb7755331252be9e348a230abf4175c
2021-06-01 12:42:28 -07:00
7523728368 [quant][graphmode][fx] Factor out run_weight_observer (#59029)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59029

Trying to remove Quantizer class and split prepare and convert code

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724864

fbshipit-source-id: 67ac5e7eb351970fdf46532c3c2ac6ac831bc697
2021-06-01 10:01:42 -07:00
10fc42eacc [quant][graphmode][fx] Merge quant_env and env (#59028)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59028

Previously we had both an env and a quant_env in convert, which was a bit confusing.
In this PR we merge them into a single Dict[str, Tuple[Node, torch.dtype]].

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724863

fbshipit-source-id: 722a682c70d300a6ccd2b988786a1ac2d45e880e
2021-06-01 09:21:38 -07:00
a1806134a7 [QAT] Fix the runtime run cannot resize variables that require grad (#57068)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57068

When training with histogram observer on, we got this runtime error:
```
torch/quantization/observer.py", line 942, in forward
                    self.bins)

            self.histogram.resize_(combined_histogram.shape)
            ~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            self.histogram.copy_(combined_histogram)
            self.min_val.resize_(combined_min.shape)
RuntimeError: cannot resize variables that require grad
```

Since the histogram observer is only used to collect histogram information, it should not need gradients. So we turn off grad tracking with the `detach_()` method before resizing.
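
A minimal sketch of the idea, using stand-in tensors for the observer's buffers (not the actual observer.py diff):
```
import torch

histogram = torch.zeros(2048, requires_grad=True)   # stand-in for self.histogram
combined_histogram = torch.zeros(4096)

histogram.detach_()                         # drop autograd tracking so resize_ is allowed
histogram.resize_(combined_histogram.shape)
histogram.copy_(combined_histogram)
```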

Test Plan:
- arc lint
- Train with histogram observer turned on, training finished successfully

f264139727

Reviewed By: supriyar

Differential Revision: D27147212

fbshipit-source-id: abed5b9c4570ffc6bb60e58e64791cfce66856cd
2021-05-27 09:12:06 -07:00
25ac647f64 [QAT] Auto format the `torch/quantization/observer.py` (#57067)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57067

auto format the code

Test Plan: lint

Reviewed By: jerryzh168

Differential Revision: D27147213

fbshipit-source-id: 008871d276c8891b2411549e17617e5c27d16ee3
2021-05-27 09:10:34 -07:00
58d1b3639b fix nn.MHA scriptability (#58727)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58727

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D28593830

Pulled By: bhosmer

fbshipit-source-id: 37dee9efededaea9985a2bf040df1ba4b46f6580
2021-05-26 15:29:49 -07:00
09a8f22bf9 Add mish activation function (#58648)
Summary:
See issus: https://github.com/pytorch/pytorch/issues/58375

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58648

Reviewed By: gchanan

Differential Revision: D28625390

Pulled By: jbschlosser

fbshipit-source-id: 23ea2eb7d5b3dc89c6809ff6581b90ee742149f4
2021-05-25 10:36:21 -07:00
de845020a0 fix docstring for fusing functions (#58638)
Summary:
This PR fixes docstrings of fusing functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58638

Reviewed By: H-Huang

Differential Revision: D28584501

Pulled By: jerryzh168

fbshipit-source-id: 77a53a709d968df8ba8f5b613ad7cf225ba2826a
2021-05-24 18:27:22 -07:00
a5250425e0 [quant] Eager mode equalization support for ConvReLU and LinearReLU (#58792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58792

Enabling support for fused modules like ConvReLU or LinearReLU on eager mode cross-layer equalization.

Test Plan:
`python test/test_quantization.py TestEqualizeEager`

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28647242

fbshipit-source-id: 286e057ce70aa7de45d575afd6c13e55120ff18a
2021-05-24 17:25:13 -07:00
f29e75c4dc [reland][quant][fx][graphmode][refactor] Remove qconfig_map from Quantizer (#58455) (#58756)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58756

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Imported from OSS

Reviewed By: supriyar

Differential Revision: D28607564

fbshipit-source-id: 979cf165941bb3a9044d03077a170b5ea64dc36a
2021-05-24 14:57:45 -07:00
e574c2c025 [quant][fx] Validate qconfig_dict keys (#58566)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58566

Validates the keys of the qconfig_dict, prepare_custom_config_dict, convert_custom_config_dict, and
fuse_custom_config_dict. If the user passes in an invalid key or makes a typo, we will throw an error and let the user know which keys are supported.
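
A minimal sketch of this kind of validation (the helper name, allowed-key set, and error message are assumptions for illustration, not the exact implementation):
```
def check_config_dict_keys(config_dict, allowed_keys, dict_name):
    # Reject unknown keys up front so typos fail loudly instead of being silently ignored.
    for key in config_dict:
        if key not in allowed_keys:
            raise ValueError(
                f"{dict_name} contains unsupported key '{key}'; "
                f"supported keys are: {sorted(allowed_keys)}"
            )

# Example: top-level qconfig_dict keys assumed to be supported at the time.
qconfig_dict_allowed_keys = {
    "", "object_type", "module_name_regex", "module_name",
    "module_name_function_order",
}
```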

Test Plan:
Imported from OSS

python test/test_quantization.py

Reviewed By: jerryzh168

Differential Revision: D28540923

fbshipit-source-id: 5958c32017b7d16abd219aefc8e92c42543897c2
2021-05-21 15:20:05 -07:00
21a9334034 Revert D28497967: [quant][fx][graphmode][refactor] Remove qconfig_map from Quantizer
Test Plan: revert-hammer

Differential Revision:
D28497967 (1cf8f7a439)

Original commit changeset: 421ce3d86fad

fbshipit-source-id: b1b290be47d847ab0e0128e3ae89f528578550ee
2021-05-20 20:56:12 -07:00
1cf8f7a439 [quant][fx][graphmode][refactor] Remove qconfig_map from Quantizer (#58455)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58455

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28497967

fbshipit-source-id: 421ce3d86fadd3d92f4120b850b0167270509189
2021-05-20 20:34:47 -07:00
b6dcdeacc9 [quant][graphmode][fx] Move qat_swap_modules outside of Quantizer (#58454)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58454

Trying to remove Quantizer in the end

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28497966

fbshipit-source-id: 800f8e4afd99918d7330345f8ae7bcf018a5bde7
2021-05-20 17:27:49 -07:00
618be18a41 Enable the quantization on XPU devices (#54857)
Summary:
Enable quantization on XPU devices. Keep the model as-is if it is on an XPU device.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54857

Reviewed By: ailzhang

Differential Revision: D28501381

Pulled By: jerryzh168

fbshipit-source-id: 6d3e9b04075393248b30776c69881f957a1a837c
2021-05-20 17:02:13 -07:00
f879e70fc1 [quant][fx][graphmode][refactor] Factor out generate_qconfig_map to qconfig_utils.py (#58453)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58453

Move the class method generate_qconfig_map to qconfig_utils.py. More PRs will follow
to move functions out of Quantizer and eventually remove the Quantizer object.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28497965

fbshipit-source-id: 3c78cfe676965d20a8834a859ffed4d8e9ecade4
2021-05-20 16:26:24 -07:00
4668d09ca6 [quant][graphmode][fx] Quantize the output of statically quantized fp16 op in QuantizeHandler (#58445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58445

Previously the output of a statically quantized fp16 operator was not quantized in QuantizeHandler, which is inconsistent with
the behavior of static int8 operators, and it also does not work well with reference functions. This PR
changes the fp16 static QuantizeHandler to quantize the output (by calling to(torch.float16)) in the QuantizeHandler, which also
makes future support for reference functions easier.
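
As a minimal illustration of what an fp16 static "quantize" amounts to (a sketch, not the converted-graph code itself):
```
import torch

x = torch.randn(4, 4)
x_fp16 = x.to(torch.float16)        # the fp16 "quantize" step inserted after the op
x_float = x_fp16.to(torch.float32)  # the matching "dequantize" is just a cast back
```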

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28495830

fbshipit-source-id: 2140eab8ab2dd08f6570d9e305485e3029e1f47d
2021-05-20 16:03:42 -07:00
07da584dbd Fix KeyError returned by _maybe_get_last_node_only_observer (#58443)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58443

Test Plan: arc lint

Reviewed By: vkuzo

Differential Revision: D28494119

fbshipit-source-id: 05abf4e12051afc237096812fb0ee08a8b9447f9
2021-05-18 12:41:19 -07:00
821a97595b fx quant: improve performance of all_node_args_have_no_tensors (#58461)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58461

Improves the logic which calculates whether a node has any tensors
in its arguments by terminating the recursion early when possible.

In a future PR, we should probably ditch this entire approach and switch to
using dtype propagation.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28499455

fbshipit-source-id: bedd844022b90e1fcb7d7a3cb4cc65440dc9cc59
2021-05-18 07:19:59 -07:00
7b73fdf597 [FX] Fix retracing wrapped functions (#58061)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58061

Test Plan: Imported from OSS

Reviewed By: yuhc

Differential Revision: D28358801

Pulled By: jamesr66a

fbshipit-source-id: c7c9a8a80e5bfe1eb1f6d2cf858ac7e57153a860
2021-05-17 19:50:16 -07:00
c156a4ffaa fx quant: fix crash on output dicts and lists (#58416)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58416

https://github.com/pytorch/pytorch/pull/57519 had a regression not
caught by CI: it added an assertion which failed on various model
output types.

This PR removes the assertion and adds logic to observe graph
outputs in a way that supports arbitrary output formats.
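
A rough sketch of the kind of recursive traversal this implies (illustrative only; the actual pass works on FX graph nodes and may differ):
```
def map_output(obj, fn):
    # Walk arbitrarily nested lists/tuples/dicts in a model's output structure
    # and apply fn (e.g. "insert an observer for this value") to each leaf.
    if isinstance(obj, (list, tuple)):
        return type(obj)(map_output(o, fn) for o in obj)
    if isinstance(obj, dict):
        return {k: map_output(v, fn) for k, v in obj.items()}
    return fn(obj)
```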

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_output_lists_and_dicts
```

Imported from OSS

Reviewed By: z-a-f

Differential Revision: D28479946

fbshipit-source-id: bcce301f98a057b134c0cd34ab0ca96ba457863f
2021-05-17 15:02:09 -07:00