Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64814
1. move the file
```
hg mv caffe2/torch/quantization/fake_quantize.py caffe2/torch/ao/quantization/
```
2. create a new file in the old location and copy the imports (see the shim sketch after this list)
3. fix all callsites inside `torch`
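A minimal sketch of what the compatibility shim at the old location could look like; the exact set of re-exported names is an assumption, the real file forwards whatever `fake_quantize.py` exposes:
```
# torch/quantization/fake_quantize.py (old location) -- hypothetical shim
# Re-export the public API from the new torch.ao.quantization location so that
# existing `torch.quantization.fake_quantize` imports keep working.
from torch.ao.quantization.fake_quantize import (  # noqa: F401
    FakeQuantize,
    FakeQuantizeBase,
    default_fake_quant,
    default_weight_fake_quant,
    disable_fake_quant,
    enable_fake_quant,
    disable_observer,
    enable_observer,
)
```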
Test Plan:
```
buck test mode/dev //caffe2/test:quantization
```
Reviewed By: z-a-f
Differential Revision: D30866792
fbshipit-source-id: 7a221cb46c0ab01f1c5de9be061f09ecc83ce23e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64861
1. move the file
```
hg mv caffe2/torch/quantization/stubs.py caffe2/torch/ao/quantization/
```
2. create a new file in the old location and copy the imports
3. fix all call sites inside `torch` (see the import-update sketch below)
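As an illustration of the call-site fix (the importing file is hypothetical), the stub classes are now imported from the new `ao` namespace while the old path keeps working through the shim:
```
# Before (old path, still available via the compatibility shim):
# from torch.quantization.stubs import QuantStub, DeQuantStub, QuantWrapper

# After (call sites inside torch use the new location):
from torch.ao.quantization.stubs import QuantStub, DeQuantStub, QuantWrapper
```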
ghstack-source-id: 137885365
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: jerryzh168
Differential Revision: D30879678
fbshipit-source-id: a2d24f25d01064212aca15e94e8c78240ba48953
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63828
Added reference quantized conv module for the custom backend flow, the reference quantized module will
have the following code:
```
w(float) -- quant - dequant \
x(float) ------------- F.conv2d ---
```
In the full model, we will see
```
w(float) -- quant - *dequant \
x -- quant --- *dequant -- *F.conv2d --- *quant - dequant
```
and the backend should be able to fuse the ops marked with `*` into a quantized conv2d
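A rough sketch of what the reference module computes; the class name and the scale/zero_point values below are placeholders, not the actual implementation:
```
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefQuantConv2d(nn.Conv2d):
    # Hypothetical sketch: quant-dequant the float weight, then run a float F.conv2d.
    def __init__(self, *args, weight_scale=0.02, weight_zero_point=0, **kwargs):
        super().__init__(*args, **kwargs)
        self.weight_scale = weight_scale
        self.weight_zero_point = weight_zero_point

    def forward(self, x):
        # "w(float) -- quant - dequant" branch from the diagram above
        w_dq = torch.quantize_per_tensor(
            self.weight.detach(), self.weight_scale, self.weight_zero_point, torch.qint8
        ).dequantize()
        # plain float conv on the dequantized weight
        return F.conv2d(x, w_dq, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

out = RefQuantConv2d(3, 8, 3)(torch.randn(1, 3, 16, 16))
```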
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_conv_linear_reference
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D30504749
fbshipit-source-id: e1d8c43a0e0d6d9ea2375b8ca59a9c0f455514fb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63627
Added reference quantized linear module for the custom backend flow, the reference quantized module will
have the following code:
```
w(float) -- quant - dequant \
x(float) ------------- F.linear ---
```
In the full model, we will see
```
w(float) -- quant - *dequant \
x -- quant --- *dequant -- *F.linear --- *quant - dequant
```
and the backend should be able to fuse the ops with `*` into a quantized linear
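A functional sketch of the full pattern; the scales and zero_points here are illustrative placeholders (real models get them from observers), and the ops marked with `*` above are what the backend would pattern-match and fuse:
```
import torch
import torch.nn.functional as F

def reference_linear_pattern(x, w, b,
                             x_scale=0.1, x_zp=0,
                             w_scale=0.02, w_zp=0,
                             y_scale=0.2, y_zp=0):
    x_dq = torch.quantize_per_tensor(x, x_scale, x_zp, torch.quint8).dequantize()  # x: quant -> *dequant
    w_dq = torch.quantize_per_tensor(w, w_scale, w_zp, torch.qint8).dequantize()   # w: quant -> *dequant
    y = F.linear(x_dq, w_dq, b)                                                    # *F.linear
    return torch.quantize_per_tensor(y, y_scale, y_zp, torch.quint8).dequantize()  # *quant -> dequant

out = reference_linear_pattern(torch.randn(4, 16), torch.randn(8, 16), torch.zeros(8))
```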
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_conv_linear_reference
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D30504750
fbshipit-source-id: 5729921745c2b6a0fb344efc3689f3b170e89500
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63799
Add a new module that can be used for module swap with the nni.LinearReLU module in the convert function.
Supports INT8 currently (since the FP16 op doesn't have relu fusion yet).
Fixes #55393
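A hypothetical usage sketch of the intended flow (the exact qconfig_spec handling is an assumption): fuse Linear + ReLU into `nni.LinearReLU` so that dynamic quantization can swap in the new module:
```
import torch
import torch.nn as nn
import torch.nn.intrinsic as nni
from torch.quantization import fuse_modules, quantize_dynamic

float_model = nn.Sequential(nn.Linear(16, 8), nn.ReLU()).eval()
# Fuse the Linear/ReLU pair so convert sees an nni.LinearReLU module.
fused_model = fuse_modules(float_model, [["0", "1"]])
# Dynamic INT8 quantization; FP16 is not covered since the FP16 op lacks relu fusion.
quantized_model = quantize_dynamic(fused_model, {nn.Linear, nni.LinearReLU}, dtype=torch.qint8)
```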
Test Plan:
python test/test_quantization.py test_dynamic_fusion
Imported from OSS
Reviewed By: heitorschueroff
Differential Revision: D30502812
fbshipit-source-id: 3668e4f001a0626d469e17ac323acf582ee28a51
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55837
Adds a test that checks that all of the relevant op pairs defined in
`quantization_mappings.py` are also defined as related by Numerical
Suite.
Note: this does not cover all the ops, just the ones in
`quantization_mappings.py`. A future PR will fill out the remainder.
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher.test_op_relationship_mapping
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27719979
fbshipit-source-id: 9e852ef94da5f7a653ea15ba52c68a89c8e30208
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002
The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.
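For reference, a minimal sketch of the kind of 3d fusion the new tests exercise; the Conv3d + BatchNorm3d + ReLU pattern and layer sizes are illustrative assumptions:
```
import torch.nn as nn
from torch.quantization import fuse_modules

model = nn.Sequential(nn.Conv3d(3, 8, 3), nn.BatchNorm3d(8), nn.ReLU()).eval()
# Conv3d + BatchNorm3d + ReLU fold into a single fused module for quantization.
fused = fuse_modules(model, [["0", "1", "2"]])
```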
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003
Reviewed By: mrshenli
Differential Revision: D26325953
Pulled By: jerryzh168
fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50459
Some of the custom modules cannot have observers inserted automatically. This PR factors out that list into a separate function.
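A rough sketch of the factored-out helper; the function name and the exact module set are assumptions:
```
import torch.nn as nn

def no_observer_set():
    # Hypothetical sketch: module types whose observers are managed inside the
    # module itself, so prepare() should not insert observers for them automatically.
    return {
        nn.quantizable.LSTM,
        nn.quantizable.MultiheadAttention,
    }
```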
Test is not required, as it is covered by the unittests for those modules.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26092531
fbshipit-source-id: 1f89daf3a13ef31bc4e9058c3443559c65a05812
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50297
Current implementation has a potential bug: if a user modifies the quantization mappings returned by the getters, the changes will propagate.
For example, the bug will manifest itself if the user does the following:
```
my_mapping = get_default_static_quant_module_mappings()
my_mapping[nn.Linear] = UserLinearImplementation
model_A = convert(model_A, mapping=my_mapping)
default_mapping = get_default_static_quant_module_mappings()
model_B = convert(model_B, mapping=default_mapping)
```
In that case, `model_B` will be quantized with the modified mapping.
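A minimal sketch of the fix (the mapping contents are an illustrative subset): the getter hands back a copy so user-side edits don't leak into the shared default:
```
import copy
import torch.nn as nn
import torch.nn.quantized as nnq

# Illustrative subset; the real default mapping covers many more modules.
_DEFAULT_STATIC_QUANT_MODULE_MAPPINGS = {
    nn.Linear: nnq.Linear,
    nn.Conv2d: nnq.Conv2d,
}

def get_default_static_quant_module_mappings():
    # Return a copy so `my_mapping[nn.Linear] = ...` above does not mutate the default.
    return copy.deepcopy(_DEFAULT_STATIC_QUANT_MODULE_MAPPINGS)
```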
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25855753
Pulled By: z-a-f
fbshipit-source-id: 0149a0c07a965024ba7d1084e89157a9c8fa1192
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49964
`torch.nn.modules.linear._LinearWithBias` is only used in the transformer layers, and is functionally identical to `torch.nn.Linear`.
This PR creates a mapping entry so that this module is treated the same as Linear.
Test Plan:
```
python test/test_quantization.py TestDynamicQuantizedModule TestStaticQuantizedModule
```
Differential Revision: D25731589
Reviewed By: jerryzh168
Pulled By: z-a-f
fbshipit-source-id: 1b2697014e250e97d3010cdb542f9d130b71fbc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49428
Previously, DeQuantStub would be swapped with nn.quantized.DeQuantize regardless of qconfig.
The reason is that we skipped attaching a qconfig to DeQuantStub to avoid adding a fake quantize module to it,
but the correct fix is to skip it when inserting observers. This PR fixes the issue.
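A sketch of the behavior this enables (the model and qconfig choices are illustrative): a DeQuantStub whose qconfig is explicitly None should not be swapped by convert:
```
import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub, default_qconfig, prepare, convert

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.conv = nn.Conv2d(3, 3, 1)
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

m = M().eval()
m.qconfig = default_qconfig
m.dequant.qconfig = None           # opt this stub out of quantization
prepared = prepare(m)
prepared(torch.randn(1, 3, 4, 4))  # calibrate
converted = convert(prepared)
print(type(converted.dequant))     # stays a DeQuantStub rather than nn.quantized.DeQuantize
```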
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25569991
fbshipit-source-id: d44a08c6e64c7a49509687dc389b57de1cbb878c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48038
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; the same applies to nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
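For illustration, a single nn.ReLU instance handles both dtypes (the tensor values and quantization parameters are arbitrary):
```
import torch
import torch.nn as nn

relu = nn.ReLU()
x_float = torch.randn(4)
x_quant = torch.quantize_per_tensor(x_float, scale=0.1, zero_point=128, dtype=torch.quint8)
y_float = relu(x_float)  # float path
y_quant = relu(x_quant)  # quantized path, no separate nn.quantized.ReLU needed
```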
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25000462
fbshipit-source-id: e3609a3ae4a3476a42f61276619033054194a0d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47415
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; the same applies to nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24747035
fbshipit-source-id: b8fdf13e513a0d5f0c4c6c9835635bdf9fdc2769
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46337
We plan to pass around the mappings instead of using a global registration API, to keep
the mappings local to the transformations the user is performing.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317436
fbshipit-source-id: 81569b88f05eeeaa9595447e482a12827aeb961f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45712
Eager mode will still be able to use the functional leaky relu, but it will be less accurate than
the LeakyReLU module.
FX graph mode will support both the functional and module forms of leaky relu.
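For eager mode that means preferring the module form in the model definition; a hypothetical before/after sketch:
```
import torch.nn as nn

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.leaky_relu = nn.LeakyReLU(0.01)  # module form: observed and swapped in eager mode

    def forward(self, x):
        # return F.leaky_relu(x, 0.01)        # functional form: left in fp32 in eager mode
        return self.leaky_relu(x)
```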
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24069961
fbshipit-source-id: 8d91c3c50c0bcd068ba3072378ebb4da9549be3b