Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64919
The AO team is migrating the existing `torch.quantization` into `torch.ao.quantization`. We are doing it one file at a time to make sure that the internal callsites are updated properly. This diff migrates the quantization utilities.
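Both import paths keep working because the file at the old location becomes a thin wrapper around the new one. A minimal sketch of that pattern (the actual shim may re-export names explicitly rather than via a star import):

```python
# torch/quantization/utils.py -- hedged sketch of the forwarding shim;
# the public names now live in torch.ao.quantization.utils.
from torch.ao.quantization.utils import *  # noqa: F401,F403
```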
ghstack-source-id: 138303325
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: jerryzh168
Differential Revision: D30899082
fbshipit-source-id: 85eb38c419e417147e71758b682cd095308dd0c9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64916
The AO team is migrating the existing `torch.quantization` into `torch.ao.quantization`. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This diff migrates `quant_type.py` from `torch.quantization` to `torch.ao.quantization`.
For now both locations are supported; eventually `torch.quantization` will be deprecated.
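A minimal sketch of the kind of check the migration test performs (simplified; the actual `TestAOMigrationQuantization` suite compares the public API of both locations):

```python
import torch.quantization.quant_type as old
import torch.ao.quantization.quant_type as new

# While both locations are supported, the old path must resolve to the
# exact same objects as the new canonical one.
assert old.QuantType is new.QuantType
```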
Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization`
Reviewed By: vkuzo
Differential Revision: D30898422
fbshipit-source-id: 3e6126b49f0565a4136d6928cea9eb25368927ff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63107
Moving this function because its functionality would be useful outside of Numeric Suite (ns).
ghstack-source-id: 135727260
Test Plan: `buck test //caffe2/test:quantization_fx mode/dev-nosan --keep-going --config client.id=nuclide --show-full-output -- suite`
Reviewed By: supriyar
Differential Revision: D30260735
fbshipit-source-id: 58deabdd0f3b03b0ee7ee92be0548a0945084d65
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62863
To make this consistent with the other observers, add a `reduce_range` option that can be used to update `quant_min`/`quant_max`.
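For context, `reduce_range` conventionally shrinks the quantization range by one bit (e.g. so fbgemm kernels can avoid overflow). A minimal sketch of that adjustment, with a hypothetical helper name:

```python
def adjust_qrange(quant_min: int, quant_max: int, reduce_range: bool):
    # Halve the representable range (drop one bit) when reduce_range is set.
    if reduce_range:
        quant_min, quant_max = quant_min // 2, quant_max // 2
    return quant_min, quant_max

# e.g. for quint8: (0, 255) -> (0, 127) when reduce_range=True
assert adjust_qrange(0, 255, True) == (0, 127)
```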
Test Plan:
python test/test_quantization.py test_fused_mod_reduce_range
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D30146602
fbshipit-source-id: a2015f095766f9c884611e9ab6942528bc9bc972
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62488
Instead of attaching a weight observer/fake_quant to the float linear and conv modules, we can
compute the quantization parameters and attach them as a dictionary to these modules, which
reduces the model size and makes the reference module clearer.
TODO: the numerics for linear and conv in the reference quantized model are still not correct,
since we do not quantize the weight; we may explore things like parameterization to implement this support.
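A minimal sketch of the idea (the attribute and key names are illustrative, not necessarily the PR's exact API): instead of keeping an observer module on the float module, we store only the computed parameters.

```python
import torch

float_linear = torch.nn.Linear(8, 8)
# Values would come from the weight observer; shown here as constants.
float_linear.weight_qparams = {
    "qscheme": torch.per_tensor_affine,
    "dtype": torch.qint8,
    "scale": 0.05,
    "zero_point": 0,
}
```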
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D30053979
fbshipit-source-id: b5f8497cf6cf65eec924df2d8fb10a9e154b8cab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59953
The following modifications were made to the equalization observers due to design changes:
- [InputEqualizationObserver] Replaced `calculate_qparams()` with `calculate_scaled_minmax()`, since we need to return the scaled min/max values to update the following input quantization observer (see the sketch below).
- [WeightEqualizationObserver] We no longer need a row observer, since this is taken care of by the following weight quantization observer.
- [WeightEqualizationObserver] Following the previous point, we no longer need to calculate the scaled qparam values. Instead, we use the equalization scale to later scale the weights; the qparams are taken care of by the weight quantization observer.
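A minimal sketch of what `calculate_scaled_minmax()` conceptually computes (simplified; the real observer handles per-channel shapes and edge cases):

```python
import torch

def calculate_scaled_minmax(min_inputs, max_inputs, equalization_scale):
    # Scale the recorded input min/max by the equalization scale; the
    # following input quantization observer is updated with these values.
    min_scaled = torch.min(min_inputs * equalization_scale)
    max_scaled = torch.max(max_inputs * equalization_scale)
    return min_scaled, max_scaled
```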
Test Plan:
`python test/test_quantization.py TestEqualizeFx.test_input_weight_eq_observer`
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29135332
fbshipit-source-id: be7e468273c8b62fc183b1e1ec50f6bd6d8cf831
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585
Previously, a fp16_static CopyNode would be marked as unquantized because of
an incorrect check of whether a Node is statically quantized.
This PR fixes that check.
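A hedged sketch of the kind of predicate involved (simplified; not the PR's exact code): statically quantized activations include fp16 as well as the int8 dtypes, so a fp16_static node must not fall through as unquantized.

```python
import torch

def activation_is_statically_quantized(dtype, is_dynamic: bool) -> bool:
    # fp16 counts as statically quantized alongside the int8 dtypes.
    return dtype in (torch.quint8, torch.qint8, torch.float16) and not is_dynamic
```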
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912677
fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently `linear_dynamic_fp16` has a signature that is tied to fbgemm/qnnpack.
We need to produce a pattern equivalent to `linear_dynamic_fp16` to support extensions
to other backends (see the sketch below).
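A minimal sketch of a backend-agnostic equivalent (illustrative; not the exact graph the pass emits): keep the weight in fp16 and upcast it right before the linear call, instead of invoking the backend-specific fused op.

```python
import torch
import torch.nn.functional as F

def linear_dynamic_fp16_pattern(x, weight, bias=None):
    w_fp16 = weight.to(torch.float16)   # weight stored/serialized in fp16
    w_fp32 = w_fp16.to(torch.float32)   # upcast dynamically at runtime
    return F.linear(x, w_fp32, bias)
```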
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50304
Does not include any functional changes; purely fixes minor typos in `fuser_method_mappings.py`.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25857248
Pulled By: z-a-f
fbshipit-source-id: 3f9b864b18bda8096e7cd52922dc21be64278887