Summary: same function in observer and quantize, consolidated to a
single function. Note the definitions were slightly different, I've
changed the definition to be maximally inclusive so that the name of the
function is more accurate
Test Plan: python test/test_public_bindings.py
python test/test_quantization.py
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D40709276](https://our.internmc.facebook.com/intern/diff/D40709276)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87520
Approved by: https://github.com/jcaip
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64910
This bled through from the original location. Removing it is not just refactoring, but also prevents potential recursive imports.
ghstack-source-id: 138112663
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: vkuzo
Differential Revision: D30882924
fbshipit-source-id: 8652a334a5186c635761ea5e50f978d1f1078c12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quantize.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: HDCharles
Differential Revision: D30734870
fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the `quantize.py` from torch.quantization to `torch.ao.quantization`.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/opt //caffe2/test:quantization`
Reviewed By: jerryzh168, raghuramank100
Differential Revision: D30055886
fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f
Summary:
This PR enables gpu only quantization, best used with is_reference since
there are not many gpu kernels for ops as of now.
This PR mainly changes how qconfigs and their obs constructors operate once they
on modules qconfig. The function add_module_to_qconfig_obs_ctr takes the obs constructors on the original
qconfig, and configures them so that when invoked, the created obs will
be on whatever device the module occupies. (Once observers are created,
module.to(device) is already setup so that it moves any observers). To do this,
a new method and a few small chanegs were added to the _PartialWrapper class that
our observers already use to create constructors (without changing the
existing functionality). These changes work in
concert with changes to the prepare flow such that when the qconfigs are
propagated to the moduels (in quantize.py and qconfig_utils.py) they are configured using add_module_to_qconfig_obs_ctr.
Ideally this would work on other models but the is_reference support for
a lot of modules isn't there yet, those tests should be added in a
future PR
Test Plan:
python test/test_quantization.py TestQuantizeFxModels.test_static_gpu_convert_basic
python test/test_quantization.py TestQuantizeFxModels.test_switch_device_prepare_convert
python test/test_quantization.py TestQuantizeFxModels.test_prepare_serialize_switch_device_convert
python test/test_quantization.py TestQuantizeFx.test_qconfig_precedence
Reviewed By: vkuzo
Differential Revision: D29684114
fbshipit-source-id: 19fefb8e1998eaf212723e836276ccf39467f2e7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388
temporarily revert D27314678 (c57541ce06), it appears to cause a perf regression that makes quantization of some models take too long to complete tests.
Reviewed By: houseroad
Differential Revision: D27583809
fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644
Previously we special case copy operator in normal insert observer code, this PR tries to split the
special case logic to a separate function and keep the rest of the code clean.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27314678
fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50459
Some of the custom modules cannot have the observers be inserted automatically. This PR factors out that list into a separate function.
Test is not required, as it is covered by the unittests for those modules.
(Note: this ignores all push blocking failures!)
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26092531
fbshipit-source-id: 1f89daf3a13ef31bc4e9058c3443559c65a05812
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49866
- Adds the `torch.nn.quantizable.MultiheadAttention`
The quantizable version can serve as a fully equivalent to `torch.nn.MultiheadAttention` module.
The main difference is that it allows for linear units observation after the `prepare` step in the quantization flow.
Note: The `from_observed` method (called during the `convert`) removes the `bias_k` and `bias_v` parameters, and resets them as attributes.
This is done to avoid an error of assigning a quantized tensor to the `torch.nn.Parameter`.
(Note: this ignores all push blocking failures!)
Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_custom_module_multi_head_attention
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25706179
fbshipit-source-id: e27ab641d8d1eccc64cf9e44343459331f89eea4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49813https://github.com/pytorch/pytorch/issues/49739 reports a crash
where removing forward hooks results in a
```
RuntimeError: OrderedDict mutated during iteration
```
Unfortunately I cannot repro this inside the PyTorch module, but the issue
author has a good point and and we should not mutate the dict inside
of the iteration.
Test Plan:
```
// test plan from https://github.com/pytorch/pytorch/pull/46871 which
// originally added this
python test/test_quantization.py TestEagerModeQATOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25698725
fbshipit-source-id: 13069d0d5017a84038c8f7be439a3ed537938ac6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49899
Per channel weights observer in conv transpose is not supported yet. Adding an
error message which fails instantly instead of making the user wait until after
calibration/training finishes.
Test Plan:
```
python test/test_quantization.py TestPostTrainingStatic.test_convtranspose_per_channel_fails_early
python test/test_quantization.py TestQuantizeFx.test_convtranspose_per_channel_fails_early
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25717151
fbshipit-source-id: 093e5979030ec185e3e0d56c45d7ce7338bf94b6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49671
- Introduces the `torch.nn.quantizable` namespace
- Adds the `torch.nn.quantizable.LSTM` module
The point of the `quantizable` namespace is to segregate the purely quantized modules with the modules that could be quantized through a normal quantization flow, but are not using the quantized kernels explicitly.
That means the quantizable modules are functionally and numerically equivalent to the FP ones and can be used instead of the FP ones without any loss.
The main difference between the `torch.nn.LSTM` and the `torch.nn.quantizable.LSTM` is that the former one does not support observation for the linear layers, because all the computation is internal to the `aten` namespace.
The `torch.nn.quantizable.LSTM`, however, uses explicit linear layers that can be observed for further quantization.
Test Plan: Imported from OSS
Differential Revision: D25663870
Reviewed By: vkuzo
Pulled By: z-a-f
fbshipit-source-id: 70ff5463bd759b9a7922571a5712d3409dfdfa06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49428
Previously dequantstub will be swapped with nn.quantized.DeQuantize regardless of qconfig
reason is we skipped attaching qconfig for DeQuantStub to avoid adding fake quantize module to it
but the correct fix is to skip it in insert observers, this PR fixes the issue.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25569991
fbshipit-source-id: d44a08c6e64c7a49509687dc389b57de1cbb878c
Summary:
This fix allows the calibration function to take in more than one positional argument.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48537
Reviewed By: zhangguanheng66
Differential Revision: D25255764
Pulled By: jerryzh168
fbshipit-source-id: 3ce20dbed95fd26664a186bd4a992ab406bba827
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48069
also renamed float_qparam_dynamic_qconfig to float_qparam_weight_only_qconfig
It's not used in user code yet so we only need to update the tests.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D25010175
fbshipit-source-id: caa3eaa5358a8bc5c808bf5f64e6ebff3e0b61e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48038
nn.ReLU works for both float and quantized input, we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU, similarly for nn.quantized.functional.relu
this also removes the numerical inconsistency for models quantizes nn.ReLU independently in qat mode
Test Plan:
Imported from OSS
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25000462
fbshipit-source-id: e3609a3ae4a3476a42f61276619033054194a0d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47415
nn.ReLU works for both float and quantized input, we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU, similarly for nn.quantized.functional.relu
this also removes the numerical inconsistency for models quantizes nn.ReLU independently in qat mode
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24747035
fbshipit-source-id: b8fdf13e513a0d5f0c4c6c9835635bdf9fdc2769
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46337
We plan to pass around the mappings instead of using global registration api to keep
the mappings local to the transformations user is performing
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317436
fbshipit-source-id: 81569b88f05eeeaa9595447e482a12827aeb961f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46292
since it is not needed
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D24290815
fbshipit-source-id: 5cc24a305dbdfee5de3419dc83a9c3794d949300
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46095
Adds logging on usage of public quantization APIs. This only works in FB codebase
and is a no-op in OSS.
Test Plan: The test plan is fb-only
Reviewed By: raghuramank100
Differential Revision: D24220817
fbshipit-source-id: a2cc957b5a077a70c318242f4a245426e48f75e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44835
This is for feature parity with fx graph mode quantization
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D23745086
fbshipit-source-id: ae2fc86129f9896d5a9039b73006a4da15821307
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43015
Currently activation_post_process are inserted by default in qat modules, which is not
friendly to automatic quantization tools, this PR removes them.
Test Plan:
Imported from OSS
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23105059
fbshipit-source-id: 3439ac39e718ffb0390468163bcbffd384802b57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42343
Currently activation_post_process are inserted by default in qat modules, which is not
friendly to automatic quantization tools, this PR removes them.
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D22856816
fbshipit-source-id: 988a43bce46a992b38fd0d469929f89e5b046131
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42576
Previously we have qconfig propagate list and we only attach qconfig for modules
in the list, this works when everything is quantized in the form of module.
but now we are expanding quantization for functional/torch ops, we'll need to attach qconfig
to all modules
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D22939453
fbshipit-source-id: 7d6a1f73ff9bfe461b3afc75aa266fcc8f7db517
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41930
As title
ghstack-source-id: 108517079
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D22698386
fbshipit-source-id: 4f748c9bae4a0b615aa69c7cc8d8e451e5d26863
Summary:
Added a logic so that if a prehook is passed into the prepare method during quantization, then the hook will be added as a prehook to all leaf nodes (and modules specified in the non_leaf_module_list).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41863
Test Plan:
Small demo, made simple module then called prepare with prehook parameter set to the numeric suite logger, printed the results to verify its what we wanted
{F245156246}
Reviewed By: jerryzh168
Differential Revision: D22671288
Pulled By: edmundw314
fbshipit-source-id: ce65a00830ff03360a82c0a075b3b6d8cbc4362e
Summary:
1. While do convert() preserve module's **pre and post forward** hooks
2. While do fusion preserve only module's **pre forward** hooks (because after fusion output no longer the same)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37233
Differential Revision: D22425141
Pulled By: jerryzh168
fbshipit-source-id: e69b81821d507dcd110d2ff3594ba94b9593c8da