5cedc5a0ff
[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format ( #144552 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144552
Approved by: https://github.com/ezyang
2025-08-07 00:09:56 +00:00
3bf922a6ce
Apply UFMT to low traffic torch modules ( #106249 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106249
Approved by: https://github.com/Skylion007
2023-07-29 23:37:30 +00:00
aa40503954
Add Custom Module Support List ( #82606 )
...
Summary:
Add a global custom module support list so that users can specify the modules they want the equalization process to support.
To use this list, import it from the _equalize.py file and append modules to it, as in the sketch below.
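A minimal usage sketch; the list name `CUSTOM_MODULE_SUPP_LIST` and its location in torch/ao/quantization/fx/_equalize.py are assumptions based on this PR's description:
```
import torch.nn as nn

# Assumed name/path for the global support list added by this PR.
from torch.ao.quantization.fx._equalize import CUSTOM_MODULE_SUPP_LIST

class MyLinearVariant(nn.Linear):
    """A hypothetical custom module we want equalization to handle."""

# Appending the module type opts it into input-weight equalization.
CUSTOM_MODULE_SUPP_LIST.append(MyLinearVariant)
```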
Unittest passed to check global support list:
https://pxl.cl/28RKG
Test Plan: buck1 test mode/dev //on_device_ai/odai/tests/transforms:test_transforms -- --exact 'on_device_ai/odai/tests/transforms:test_transforms - test_custom_support_list (on_device_ai.odai.tests.transforms.test_input_weight_for_turing.TestInputWeight)'
Reviewed By: jerryzh168
Differential Revision: D38264244
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82606
Approved by: https://github.com/HDCharles
2022-08-03 17:48:51 +00:00
508845f2b5
[quant] AO migration of the torch/quantization/quantize_fx.py and torch/quantization/fx/* ( #65033 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033
1. Move the file:
```
hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx
hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py
```
2. Create new files
```
touch caffe2/torch/quantization/quantize_fx.py
touch caffe2/torch/quantization/fx/__init__.py
```
3. Import the migrated symbols in the new files (see the sketch below)
4. Add tests to test/quantization/ao_migration/test_quantization_fx.py;
this is needed because we have some fx imports in quantize_fx and fx/*.py
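Step 3 amounts to a thin forwarding module so the old import path keeps working. A sketch, assuming the public entry points are what get re-exported (the exact name list is in the PR):
```
# torch/quantization/quantize_fx.py (forwarding shim, sketch)
from torch.ao.quantization.quantize_fx import (  # noqa: F401
    convert_fx,
    fuse_fx,
    prepare_fx,
    prepare_qat_fx,
)
```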
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: vkuzo, z-a-f
Differential Revision: D30949749
fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3
2021-09-22 09:29:15 -07:00
1577c106dc
torch.ao migration: numeric suite, eager and fx ( #64817 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64817
This migrates `torch.quantization._numeric_suite` to `torch.ao.ns._numeric_suite`, and `torch.quantization._numeric_suite_fx` to `torch.ao.ns._numeric_suite_fx`.
1. move the files
```
HG: move eager mode
hg mv caffe2/torch/quantization/_numeric_suite.py caffe2/torch/ao/ns/
HG: move fx
hg mv caffe2/torch/quantization/_numeric_suite_fx.py caffe2/torch/ao/ns/
hg mv caffe2/torch/quantization/ns/* caffe2/torch/ao/ns/fx/
```
2. create new versions of `_numeric_suite.py` and `_numeric_suite_fx.py` with
imports
3. update all FB callsites
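As with the quantize_fx migration above, step 2's new files are thin forwarding modules. A sketch, with the re-exported names assumed:
```
# torch/quantization/_numeric_suite.py (forwarding shim, sketch)
from torch.ao.ns._numeric_suite import (  # noqa: F401
    compare_model_outputs,
    compare_model_stub,
    compare_weights,
)
```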
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: z-a-f
Differential Revision: D30867538
fbshipit-source-id: 120ee830434ca490c1183a187a518eebcbbaf22c
2021-09-12 12:00:45 -07:00
d9154b9b26
[quant] Input-Weight Equalization - allow logical evaluation ( #61603 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61603
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D29686878
fbshipit-source-id: 67ca4cab98b3d592ff2bb8db86499789b85bd582
2021-08-06 15:10:32 -07:00
836b2431dc
[quant] Input-Weight Equalization - selective equalization ( #61916 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61916
Adds functions to run selective equalization based on the SQNR obtained from the Numeric Suite. After running the Numeric Suite between the equalized and float models, we get the SQNR between the two models and construct an equalization_qconfig_dict that specifies equalizing only the layers with the highest quantization errors.
How to run:
```
layer_to_sqnr_dict = get_layer_sqnr_dict(float_model, equalized_model, input)
eq_qconfig_dict = get_equalization_qconfig_dict(layer_to_sqnr_dict, equalized_model, num_layers_to_equalize)
prepared = prepare_fx(float_model, qconfig_dict, eq_qconfig_dict)
...
```
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_selective_equalization`
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29796950
fbshipit-source-id: 91f0f8427d751beaea32d8ffc2f3b8aa8ef7ea95
2021-08-06 09:29:03 -07:00
cfd0f5ebc9
[quant] update per-channel observer min/max_val attribute names ( #62345 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62345
This PR updates the attribute names from min_vals to min_val. The motivation is to keep the attribute name consistent with per-tensor observers, so that dependencies (like FusedMovingAvgObsFakeQuantize) don't need to differentiate between the two observer types to access the attributes.
It also adds some BC tests to make sure that observers saved earlier with min_vals/max_vals can still be loaded, depending on the state_dict version.
Note: scriptability of the observers isn't fully supported yet, so we aren't testing for that in this PR.
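A small sketch of the consistency this buys: the same attribute access now works for both observer kinds (import paths follow the current torch.ao namespace; adjust for older releases):
```
import torch
from torch.ao.quantization.observer import (
    MinMaxObserver,
    PerChannelMinMaxObserver,
)

x = torch.randn(4, 3)
for obs in (MinMaxObserver(), PerChannelMinMaxObserver()):
    obs(x)  # record running min/max statistics
    # After this PR both observer types expose min_val/max_val
    # (per-channel observers previously used min_vals/max_vals).
    print(type(obs).__name__, obs.min_val, obs.max_val)
```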
Test Plan:
python test/test_quantization.py TestSerialization
Imported from OSS
Reviewed By: HDCharles
Differential Revision: D30003700
fbshipit-source-id: 20e673f1bb15e2b209551b6b9d5f8f3be3f85c0a
2021-07-29 22:28:53 -07:00
0751a41ab1
[quant] Input-Weight Equalization - ConvReLU support ( #61350 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61350
Applied changes in convert to allow for ConvReLU2d layers
Initial Model: `x -> conv1 -> relu`
After fusion: `x -> convRelu2d`
After prepare: `x -> input_quant_obs -> input_eq_obs1 -> convRelu2d -> output_quant_obs1`
After equalization functions: `x -> mul -> input_quant_obs (scaled) -> convRelu2d -> output_quant_obs`
After convert: `x -> mul -> quantize_per_tensor -> quantized::convRelu2d -> dequantize`
Test Plan:
`python test/test_quantization.py TestEqualizeFx`
Initial Model:
```
ConvReluModel(
(fc): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1))
(relu): ReLU()
)
```
After prepare:
```
GraphModule(
(x_activation_post_process_0): MinMaxObserver(min_val=5.960464477539063e-08, max_val=0.9999999403953552)
(x_activation_post_process_0_equalization_process_0): _InputEqualizationObserver(
(input_obs): PerChannelMinMaxObserver(min_val=tensor([1.1921e-07, 3.3379e-06, 5.9605e-08]), max_val=tensor([1.0000, 1.0000, 1.0000]))
)
(fc): ConvReLU2d(
(0): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1))
(1): ReLU()
)
(fc_activation_post_process_0): MinMaxObserver(min_val=0.0, max_val=1.2341605424880981)
)
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {})
%fc : [#users=1] = call_module[target=fc](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {})
%fc_activation_post_process_0 : [#users=1] = call_module[target=fc_activation_post_process_0](args = (%fc,), kwargs = {})
return fc_activation_post_process_0
```
After equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%mul,), kwargs = {})
%fc : [#users=1] = call_module[target=fc](args = (%x_activation_post_process_0,), kwargs = {})
%fc_activation_post_process_0 : [#users=1] = call_module[target=fc_activation_post_process_0](args = (%fc,), kwargs = {})
return fc_activation_post_process_0
```
After convert:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%fc_input_scale_0 : [#users=1] = get_attr[target=fc_input_scale_0]
%fc_input_zero_point_0 : [#users=1] = get_attr[target=fc_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %fc_input_scale_0, %fc_input_zero_point_0, torch.quint8), kwargs = {})
%fc : [#users=1] = call_module[target=fc](args = (%quantize_per_tensor,), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%fc,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29638275
fbshipit-source-id: 40d4666a4451e132612ea38fdfeaaec177a1defb
2021-07-13 14:00:40 -07:00
b3e4dab45a
[quant] Input-Weight Equalization - Conv convert support ( #61287 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61287
Modifications to functions during convert() to support equalization. Note that this implementation does not work for connected F.conv2d layers yet.
Initial:
```
w
|
x -> conv -> y
```
After prepare:
```
w
|
weight_quant_obs
|
weight_eq_obs
|
x -> input_quant_obs -> input_eq_obs -> conv -> out_quant_obs -> y
```
After convert:
```
scale, zero_point w (scaled)
| |
x -> mul -> quantize_per_tensor (scaled) -> quantized::conv -> dequant -> y
|
eq_scale
```
Test Plan:
`python test/test_quantization.py TestEqualizeFx`
Initial model:
```
ConvModel(
(conv): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1), bias=False)
)
```
After prepare:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {})
%conv : [#users=1] = call_module[target=conv](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {})
%conv_activation_post_process_0 : [#users=1] = call_module[target=conv_activation_post_process_0](args = (%conv,), kwargs = {})
return conv_activation_post_process_0
```
After equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%mul,), kwargs = {})
%conv : [#users=1] = call_module[target=conv](args = (%x_activation_post_process_0,), kwargs = {})
%conv_activation_post_process_0 : [#users=1] = call_module[target=conv_activation_post_process_0](args = (%conv,), kwargs = {})
return conv_activation_post_process_0
```
After convert:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%conv_input_scale_0 : [#users=1] = get_attr[target=conv_input_scale_0]
%conv_input_zero_point_0 : [#users=1] = get_attr[target=conv_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %conv_input_scale_0, %conv_input_zero_point_0, torch.quint8), kwargs = {})
%conv : [#users=1] = call_module[target=conv](args = (%quantize_per_tensor,), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%conv,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29557055
fbshipit-source-id: dc9f44182e31fa362c43ad2dfe224e6f4e4a730e
2021-07-13 14:00:38 -07:00
77d36b657a
[quant] Input-Weight Equalization - Conv prepare support ( #61286 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61286
Modifies the prepare step to support conv layers during input-weight equalization and adds tests to make sure that the results are as expected.
Initial:
```
w
|
x -> conv -> y
```
After prepare:
```
w
|
weight_quant_obs
|
weight_eq_obs
|
x -> input_quant_obs -> input_eq_obs -> conv -> out_quant_obs -> y
```
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_equalization_prepare`
Initial:
```
ConvModel(
(conv): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1), bias=False)
)
```
After prepare:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {})
%conv : [#users=1] = call_module[target=conv](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {})
%conv_activation_post_process_0 : [#users=1] = call_module[target=conv_activation_post_process_0](args = (%conv,), kwargs = {})
return conv_activation_post_process_0
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29557051
fbshipit-source-id: 25d1531645dfaf565f5c615e2ee850fcf96c7eb9
2021-07-13 14:00:36 -07:00
ce9cedd119
[quant] Input-Weight Equalization - Conv observer support ( #61285 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61285
Modifies observers to support conv layers and tests to make sure that the observers are returning the expected values for conv inputs.
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_eq_observer`
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29557041
fbshipit-source-id: 5e43329f189ba352eb8b991f38bf37752eebb6e6
2021-07-13 13:59:23 -07:00
dabadd7e20
[quant] Added reset_min_max_vals() function to observers ( #60883 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60883
As per this [comment](https://github.com/pytorch/pytorch/pull/59964#discussion_r659064270), I created a `reset_min_max_vals()` function inside the observers which will be called during input-weight equalization. This way the equalization code does not depend on the observers' implementation details.
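A minimal sketch of the call pattern (observer import path per the current namespace):
```
import torch
from torch.ao.quantization.observer import MinMaxObserver

obs = MinMaxObserver()
obs(torch.randn(8, 4))        # stats gathered from the unscaled input
obs.reset_min_max_vals()      # wipe the stale min/max (added in this PR)
obs(torch.randn(8, 4) * 0.5)  # re-observe the equalized (scaled) input
```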
Test Plan:
`python test/test_quantization.py TestEqualizeFx`
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29491848
fbshipit-source-id: 00e91959ceb3b4f3688175a1a7ba11823e929b2f
2021-06-30 14:22:08 -07:00
1a0195db49
[quant] Input-Weight Equalization - support for LinearReLU layers ( #60653 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60653
Special casing was needed to get the weight attribute in the linear layers of fused LinearReLU layers.
Initial Model: `x -> linear1 -> relu`
After fusion: `x -> linearRelu`
After prepare: `x -> input_quant_obs -> input_eq_obs1 -> linearRelu -> output_quant_obs1`
After equalization functions: `x -> mul -> input_quant_obs (scaled) -> linearRelu -> output_quant_obs`
After convert: `x -> mul -> quantize_per_tensor -> quantized::linearRelu -> dequantize`
More step-throughs here: https://fb.quip.com/A9J3AsBxkykR
Test Plan:
`python test/test_quantization.py TestEqualizeFx`
Original model:
```
LinearReluModel(
(fc): Linear(in_features=5, out_features=5, bias=True)
(relu): ReLU()
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {})
%fc : [#users=1] = call_module[target=fc](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {})
%fc_activation_post_process_0 : [#users=1] = call_module[target=fc_activation_post_process_0](args = (%fc,), kwargs = {})
return fc_activation_post_process_0
```
Graph after equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%mul,), kwargs = {})
%fc : [#users=1] = call_module[target=fc](args = (%x_activation_post_process_0,), kwargs = {})
%fc_activation_post_process_0 : [#users=1] = call_module[target=fc_activation_post_process_0](args = (%fc,), kwargs = {})
return fc_activation_post_process_0
```
Graph after `convert_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%fc_input_scale_0 : [#users=1] = get_attr[target=fc_input_scale_0]
%fc_input_zero_point_0 : [#users=1] = get_attr[target=fc_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %fc_input_scale_0, %fc_input_zero_point_0, torch.quint8), kwargs = {})
%fc : [#users=1] = call_module[target=fc](args = (%quantize_per_tensor,), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%fc,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29406999
fbshipit-source-id: add38e8e7fb84a241c3b10bfb8451b50103effd4
2021-06-30 14:22:06 -07:00
dfb9c0bae8
[quant] Input-Weight Equalization - support for connected F.linear layer ( #60272 )
...
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60272
Test Plan:
`python test/test_quantization.py TestEqualizeFx`
Original model:
```
FunctionalLinear2Module(
(linear1): Linear()
(linear2): Linear()
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {})
%linear1_w : [#users=1] = get_attr[target=linear1.w]
%linear1_w_activation_post_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_0](args = (%linear1_w,), kwargs = {})
%linear1_w_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_0_equalization_process_0](args = (%linear1_w_activation_post_process_0,), kwargs = {})
%linear1_b : [#users=1] = get_attr[target=linear1.b]
%linear : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%x_activation_post_process_0_equalization_process_0, %linear1_w_activation_post_process_0_equalization_process_0), kwargs = {bias: %linear1_b})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
%linear_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0_equalization_process_0](args = (%linear_activation_post_process_0,), kwargs = {})
%linear2_w : [#users=1] = get_attr[target=linear2.w]
%linear2_w_activation_post_process_0 : [#users=1] = call_module[target=linear2_w_activation_post_process_0](args = (%linear2_w,), kwargs = {})
%linear2_w_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=linear2_w_activation_post_process_0_equalization_process_0](args = (%linear2_w_activation_post_process_0,), kwargs = {})
%linear2_b : [#users=1] = get_attr[target=linear2.b]
%linear_1 : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%linear_activation_post_process_0_equalization_process_0, %linear2_w_activation_post_process_0_equalization_process_0), kwargs = {bias: %linear2_b})
%linear_1_activation_post_process_0 : [#users=1] = call_module[target=linear_1_activation_post_process_0](args = (%linear_1,), kwargs = {})
return linear_1_activation_post_process_0
```
Graph after equalization steps:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%mul,), kwargs = {})
%linear1_w : [#users=1] = get_attr[target=linear1.w]
%linear1_w_activation_post_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_0](args = (%linear1_w,), kwargs = {})
%linear1_b : [#users=1] = get_attr[target=linear1.b]
%linear : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%x_activation_post_process_0, %linear1_w_activation_post_process_0), kwargs = {bias: %linear1_b})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
%linear2_w : [#users=1] = get_attr[target=linear2.w]
%linear2_w_activation_post_process_0 : [#users=1] = call_module[target=linear2_w_activation_post_process_0](args = (%linear2_w,), kwargs = {})
%linear2_b : [#users=1] = get_attr[target=linear2.b]
%linear_1 : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%linear_activation_post_process_0, %linear2_w_activation_post_process_0), kwargs = {bias: %linear2_b})
%linear_1_activation_post_process_0 : [#users=1] = call_module[target=linear_1_activation_post_process_0](args = (%linear_1,), kwargs = {})
return linear_1_activation_post_process_0
```
Graph after `convert_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {})
%linear1_input_scale_0 : [#users=1] = get_attr[target=linear1_input_scale_0]
%linear1_input_zero_point_0 : [#users=1] = get_attr[target=linear1_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %linear1_input_scale_0, %linear1_input_zero_point_0, torch.quint8), kwargs = {})
%linear1_packed_weight_0 : [#users=1] = get_attr[target=linear1_packed_weight_0]
%linear1_scale_0 : [#users=1] = get_attr[target=linear1_scale_0]
%linear1_zero_point_0 : [#users=1] = get_attr[target=linear1_zero_point_0]
%linear : [#users=1] = call_function[target=torch.ops.quantized.linear](args = (%quantize_per_tensor, %linear1_packed_weight_0, %linear1_scale_0, %linear1_zero_point_0), kwargs = {})
%linear2_packed_weight_0 : [#users=1] = get_attr[target=linear2_packed_weight_0]
%linear2_scale_0 : [#users=1] = get_attr[target=linear2_scale_0]
%linear2_zero_point_0 : [#users=1] = get_attr[target=linear2_zero_point_0]
%linear_1 : [#users=1] = call_function[target=torch.ops.quantized.linear](args = (%linear, %linear2_packed_weight_0, %linear2_scale_0, %linear2_zero_point_0), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%linear_1,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29267218
fbshipit-source-id: 6b97bed1a307f1d0b1f5efcbecf41f35418242f7
2021-06-28 10:44:27 -07:00
ddf2ce03bb
[quant] Input-Weight Equalization - support for connected linear layers ( #60034 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60034
Added support for equalizing models with connected linear layers. To account for connected linear layers, we additionally multiply the previous weight values (row-wise) by the next equalization scale and remove the input equalization observer between the two linear layers. We also scale the bias by the next equalization scale; a numeric sketch of this folding step follows the flow below. The math is shown here: https://fb.quip.com/fK8rA9aRM4ca.
Original Model: `x -> linear1 -> linear2`
After `prepare_fx`: `x -> InpEqObs -> InpQuantObs -> linear1 -> OutQuantObs -> InpEqObs -> linear2`
After equalization: `x -> mul -> InpQuantObs -> linear1 -> OutQuantObs -> linear2`
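A minimal numeric sketch of the weight/bias folding, with hypothetical shapes; `next_scale` stands for the following layer's input equalization scale:
```
import torch

torch.manual_seed(0)
W1 = torch.randn(2, 2)      # linear1 weight (out_features x in_features)
b1 = torch.randn(2)         # linear1 bias
next_scale = torch.rand(2)  # equalization scale for linear2's input

# Fold the scale into linear1 instead of keeping a mul node between the
# layers: each output row of W1 and the matching bias entry get scaled.
W1_scaled = W1 * next_scale.reshape(-1, 1)  # row-wise multiply
b1_scaled = b1 * next_scale

x = torch.randn(3, 2)
# The folded layer reproduces next_scale * linear1(x) exactly.
assert torch.allclose(x @ W1_scaled.t() + b1_scaled,
                      (x @ W1.t() + b1) * next_scale)
```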
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_equalization_convert`
Original Model:
```
Linear2Module(
(linear1): Linear(in_features=2, out_features=2, bias=True)
(linear2): Linear(in_features=2, out_features=2, bias=True)
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {})
%linear1 : [#users=1] = call_module[target=linear1](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {})
%linear1_activation_post_process_0 : [#users=1] = call_module[target=linear1_activation_post_process_0](args = (%linear1,), kwargs = {})
%linear1_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=linear1_activation_post_process_0_equalization_process_0](args = (%linear1_activation_post_process_0,), kwargs = {})
%linear2 : [#users=1] = call_module[target=linear2](args = (%linear1_activation_post_process_0_equalization_process_0,), kwargs = {})
%linear2_activation_post_process_0 : [#users=1] = call_module[target=linear2_activation_post_process_0](args = (%linear2,), kwargs = {})
return linear2_activation_post_process_0
```
Graph after equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0_equalization_process_0_scale : [#users=1] = get_attr[target=x_activation_post_process_0_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_activation_post_process_0_equalization_process_0_scale), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%mul,), kwargs = {})
%linear1 : [#users=1] = call_module[target=linear1](args = (%x_activation_post_process_0,), kwargs = {})
%linear1_activation_post_process_0 : [#users=1] = call_module[target=linear1_activation_post_process_0](args = (%linear1,), kwargs = {})
%linear2 : [#users=1] = call_module[target=linear2](args = (%linear1_activation_post_process_0,), kwargs = {})
%linear2_activation_post_process_0 : [#users=1] = call_module[target=linear2_activation_post_process_0](args = (%linear2,), kwargs = {})
return linear2_activation_post_process_0
```
Graph after `convert_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_activation_post_process_0_equalization_process_0_scale : [#users=1] = get_attr[target=x_activation_post_process_0_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_activation_post_process_0_equalization_process_0_scale), kwargs = {})
%linear1_input_scale_0 : [#users=1] = get_attr[target=linear1_input_scale_0]
%linear1_input_zero_point_0 : [#users=1] = get_attr[target=linear1_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %linear1_input_scale_0, %linear1_input_zero_point_0, torch.quint8), kwargs = {})
%linear1 : [#users=1] = call_module[target=linear1](args = (%quantize_per_tensor,), kwargs = {})
%linear2 : [#users=1] = call_module[target=linear2](args = (%linear1,), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%linear2,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29204347
fbshipit-source-id: 6bb9e25e2468f50df523885ded2edc731f002ac1
2021-06-28 10:44:25 -07:00
7917318917
[quant] Input-Weight Equalization - support for F.linear layers ( #59964 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59964
Input-Weight Equalization support for functional layers
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_equalization_convert`
Original model:
```
FunctionalLinearModule(
(linear1): Linear()
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
%linear1_w : [#users=1] = get_attr[target=linear1.w]
%linear1_w_equalization_process_0 : [#users=1] = call_module[target=linear1_w_equalization_process_0](args = (%linear1_w,), kwargs = {})
%linear1_w_activation_post_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_00](args = (%linear1_w_equalization_process_0,), kwargs = {})
%linear1_b : [#users=1] = get_attr[target=linear1.b]
%linear : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%x_activation_post_process_0, %linear1_w_activation_post_process_0), kwargs = {bias: %linear1_b})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
Graph after equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0_scale : [#users=1] = get_attr[target=x_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_process_0_scale), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%mul,), kwargs = {})
%linear1_w : [#users=1] = get_attr[target=linear1.w]
%linear1_w_equalization_process_0 : [#users=1] = call_module[target=linear1_w_equalization_process_0](args = (%linear1_w,), kwargs = {})
%linear1_w_activation_post_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_00](args = (%linear1_w_equalization_process_0,), kwargs = {})
%linear1_b : [#users=1] = get_attr[target=linear1.b]
%linear : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%x_activation_post_process_0, %linear1_w_activation_post_process_0), kwargs = {bias: %linear1_b})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
Graph after `convert_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0_scale : [#users=1] = get_attr[target=x_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_process_0_scale), kwargs = {})
%linear1_input_scale_0 : [#users=1] = get_attr[target=linear1_input_scale_0]
%linear1_input_zero_point_0 : [#users=1] = get_attr[target=linear1_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %linear1_input_scale_0, %linear1_input_zero_point_0, torch.quint8), kwargs = {})
%linear1_packed_weight_0 : [#users=1] = get_attr[target=linear1_packed_weight_0]
%linear1_scale_0 : [#users=1] = get_attr[target=linear1_scale_0]
%linear1_zero_point_0 : [#users=1] = get_attr[target=linear1_zero_point_0]
%linear : [#users=1] = call_function[target=torch.ops.quantized.linear](args = (%quantize_per_tensor, %linear1_packed_weight_0, %linear1_scale_0, %linear1_zero_point_0), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%linear,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29135459
fbshipit-source-id: 1e69bfbb82a0c89538e55b64968effd0b11b2fde
2021-06-28 10:44:24 -07:00
e13a9587b4
Revert "Revert D29135358: [quant] Input-Weight Equaliaztion - convert modifications" ( #60646 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60646
This reverts commit e60f9cfc58fb2fe3e2e7f65fcdbbf350e5b55a75.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D29361191
Pulled By: angelayi
fbshipit-source-id: 275d8691d8e47da4ab80bb21b51d77ec25a0f714
2021-06-25 15:37:05 -07:00
e60f9cfc58
Revert D29135358: [quant] Input-Weight Equalization - convert modifications
...
Test Plan: revert-hammer
Differential Revision: D29135358 (3de79b7757)
Original commit changeset: 2d0005672904
fbshipit-source-id: cac30c1202ebbce4f22e50ed920340c7b4c6849f
2021-06-23 11:23:24 -07:00
3de79b7757
[quant] Input-Weight Equalization - convert modifications ( #59963 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59963
When converting, before quantizing the nodes, we call
`update_obs_for_equalization()` and `convert_eq_obs()`.
`update_obs_for_equalization`:
1. For each InputEqualizationObserver, we find the corresponding
WeightEqualizationObserver.
2. For nn.Linear layers, we create an instance of the
WeightEqualizationObserver and run it forward on the given weights.
3. Calculate the equalization scale between the
InputEqualizationObserver and WeightEqualizationObserver (see the
sketch after this summary).
`convert_eq_obs`:
For every InputEqualizationObserver, we do the following:
1. Create a node (e.g. `x0_activation_post_process_scale`) containing the
equalization scale constant.
2. Create another node containing a `mul` operator multiplying the
equalization scale and the input.
3. Remove the current InputEqualizationObserver node and replace it
with the `mul` node.
For every WeightEqualizationObserver, we do the following:
1. Get the next equalization scale (we may need this for equalizing
connected linear layers).
2. Scale the weights by multiplying them by the reciprocal of the
current equalization scale and by the next equalization scale.
Currently, this supports models with `nn.Linear` layers, but does not
yet support connected linear layers.
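A sketch of the scale computation from step 3, assuming the per-column min/max values have already been collected by the two observers (the function shape is illustrative, not the exact signature):
```
import torch

def calculate_equalization_scale(min_inputs, max_inputs,
                                 min_weights, max_weights):
    # Per input channel: balance the dynamic range of the input columns
    # against the range of the matching weight columns.
    return torch.sqrt((max_weights - min_weights) /
                      (max_inputs - min_inputs))
```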
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_equalization_convert`
Original Model:
```
LinearModule(
(linear): Linear(in_features=2, out_features=2, bias=True)
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%x_activation_post_process_0,), kwargs = {})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
Graph after equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0_scale : [#users=1] = get_attr[target=x_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_process_0_scale), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%mul,), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%x_activation_post_process_0,), kwargs = {})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
Graph after `convert_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0_scale : [#users=1] = get_attr[target=x_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_process_0_scale), kwargs = {})
%linear_input_scale_0 : [#users=1] = get_attr[target=linear_input_scale_0]
%linear_input_zero_point_0 : [#users=1] = get_attr[target=linear_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %linear_input_scale_0, %linear_input_zero_point_0, torch.quint8), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%quantize_per_tensor,), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%linear,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29135358
fbshipit-source-id: 2d00056729041318463de61841483490b6bfeee5
2021-06-22 20:43:30 -07:00
c0b7c59e55
[quant] Equalization Observer modifications ( #59953 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59953
The following modifications were made to the equalization
observers due to design changes:
- [InputEqualizationObserver] Replaced `calculate_qparams()` with
`calculate_scaled_minmax()` since we will need to return the scaled
min/max values to update the following input quantization observer
- [WeightEqualizationObserver] We no longer need a row observer since
this will be taken care of by the following weight quantization observer
- [WeightEqualizationObserver] Following the previous comment, we no
longer need to calculate the scaled qparam values. Instead, we will use
the equalization scale to later scale the weights and the qparams will
be taken care of by the weight quantization observer.
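A runnable sketch of the intended handoff; `calculate_scaled_minmax()` is the method this PR introduces, and the stand-in values below replace its real output:
```
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Stand-in values: in the real pass these come from
# _InputEqualizationObserver.calculate_scaled_minmax().
scaled_min, scaled_max = torch.tensor(-0.5), torch.tensor(2.0)

# The pass overwrites the following input quantization observer's stats
# with the scaled min/max, keeping qparam logic out of the eq observer.
input_quant_obs = MinMaxObserver()
input_quant_obs.min_val.copy_(scaled_min)
input_quant_obs.max_val.copy_(scaled_max)
scale, zero_point = input_quant_obs.calculate_qparams()
```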
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_eq_observer`
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29135332
fbshipit-source-id: be7e468273c8b62fc183b1e1ec50f6bd6d8cf831
2021-06-16 22:32:30 -07:00
45c31cabb5
[quant] Input Weight Equalization - prepare modifications ( #59747 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59747
Modifies prepare_fx for input-weight equalization. If a current
node is being equalized (i.e. there exists an EqualizationQConfig), then the
EqualizationObserver will be inserted before its quantization observer.
For a singular linear layer, the general flow looks like:
Original graph: `x0 -> linear -> x1`, `w -> linear`
After prepare: `x0 -> InpEqObs -> MinMaxObs -> linear1 -> MinMaxObs -> x1`
`w -> WeightEqObs -> MinMaxObs -> linear1`
For two connected linear layers, the general flow looks like:
Original graph: `x0 -> linear1 -> linear2 -> x1`,
`w1 -> linear1`, `w2 -> linear2`
After prepare: `x0 -> InpEqObs -> MinMaxObs -> linear1 -> MinMaxObs -> InpEqObs -> linear2 -> MinMaxObs -> x1`
`w1 -> WeightEqObs -> MinMaxObs -> linear1`, `w2 -> WeightEqObs -> MinMaxObs -> linear2`
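A minimal sketch of driving this pass, mirroring the invocation shown in the selective-equalization entry above and assuming `default_equalization_qconfig` lives in the 2021-era torch/quantization/fx/_equalize.py (it later moved under torch.ao):
```
import torch
import torch.nn as nn
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx
# Assumed location at the time of this stack; later moved under torch.ao.
from torch.quantization.fx._equalize import default_equalization_qconfig

model = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2)).eval()
qconfig_dict = {"": get_default_qconfig("fbgemm")}
equalization_qconfig_dict = {"": default_equalization_qconfig}

prepared = prepare_fx(model, qconfig_dict, equalization_qconfig_dict)
prepared(torch.randn(4, 2))  # calibration populates eq + quant observers
```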
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_equalization_prepare`
Original model with one `nn.Linear` layer
```
LinearModule(
(linear): Linear(in_features=1, out_features=1, bias=True)
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%x_activation_post_process_0,), kwargs = {})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
--------------------------------------
Original model with two connected functional linear layers
```
FunctionalLinearModule(
(linear1): Linear()
(linear2): Linear()
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
%linear1_w : [#users=1] = get_attr[target=linear1.w]
%linear1_w_equalization_process_0 : [#users=1] = call_module[target=linear1_w_equalization_process_0](args = (%linear1_w,), kwargs = {})
%linear1_w_activation_post_process_0 : [#users=1] = call_module[target=linear1_w_activation_post_process_00](args = (%linear1_w_equalization_process_0,), kwargs = {})
%linear1_b : [#users=1] = get_attr[target=linear1.b]
%linear : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%x_activation_post_process_0, %linear1_w_activation_post_process_0), kwargs = {bias: %linear1_b})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
%linear_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0_equalization_process_0](args = (%linear_activation_post_process_0,), kwargs = {})
%linear2_w : [#users=1] = get_attr[target=linear2.w]
%linear2_w_equalization_process_0 : [#users=1] = call_module[target=linear2_w_equalization_process_0](args = (%linear2_w,), kwargs = {})
%linear2_w_activation_post_process_0 : [#users=1] = call_module[target=linear2_w_activation_post_process_00](args = (%linear2_w_equalization_process_0,), kwargs = {})
%linear2_b : [#users=1] = get_attr[target=linear2.b]
%linear_1 : [#users=1] = call_function[target=torch.nn.functional.linear](args = (%linear_activation_post_process_0_equalization_process_0, %linear2_w_activation_post_process_0), kwargs = {bias: %linear2_b})
%linear_1_activation_post_process_0 : [#users=1] = call_module[target=linear_1_activation_post_process_0](args = (%linear_1,), kwargs = {})
return linear_1_activation_post_process_0
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29135316
fbshipit-source-id: 91697e805ede254dbb2a42ee4c23eb1c1c64590e
2021-06-16 22:32:28 -07:00
7ce74f3339
[quant] EqualizationQConfig to distinguish input/output activations ( #59739 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59739
Created an EqualizationQConfig specifically for equalization.
This inherits from QConfig and is used to distinguish between inserting
an input observer and an output observer. Since the output observer
field is included in the EqualizationQConfig, we no longer need an
output observer field in the _InputEqualizationObserver; a sketch of the
config's shape follows.
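A sketch of the config's shape by analogy with QConfig (itself a namedtuple subclass); the field names are assumptions based on this description:
```
from collections import namedtuple

class EqualizationQConfig(
    namedtuple("EqualizationQConfig", ["input_activation", "weight"])
):
    """Holds observer constructors for a layer's input activation and
    weight, so prepare_fx knows where to insert equalization observers
    (sketch; field names assumed)."""
    __slots__ = ()
```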
Test Plan:
compiles
Imported from OSS
Reviewed By: ezyang
Differential Revision: D29135298
fbshipit-source-id: 3dde9c029c291467ff0a0845f0fc9c44573fc6f6
2021-06-16 22:31:18 -07:00
cc03ea2c47
[quant] Implemented InputWeightObserver for Linear inputs
...
Summary: Implemented two observers (InputEqualObserver and WeightEqualObserver) which will be inserted into the graph during prepare_fx().
Test Plan: python test/test_quantization.py TestEqualizeFx
Reviewed By: supriyar
Differential Revision: D28836954
fbshipit-source-id: 25517dc82ae67698ed8b2dc334e3323286976104
2021-06-07 11:19:43 -07:00