pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-22 14:15:01 +08:00

Author	SHA1	Message	Date
leslie-fang-intel	ef4118e435	[Quant][FX] Lower QConvAdd2d for onednn backend (#91153 ) Summary Add quantization mappings for QConvAdd2d for int8 inference for onednn backend. The fusion and lowering is supported only in FX mode. Test plan ``` python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_onednn python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_by_default python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_lowering ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91153 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2023-02-01 01:14:12 +00:00
leslie-fang-intel	53c3555a6a	[Quant] Add fused ConvAdd2d module for onednn backend (#91152 ) Summary Post op fusion can reduce data movement overhead and improve inference performance. This PR adds fused `ConvAdd2d` module for onednn backend, which will be used for int8 inference with onednn backend. Cannot call this module with other quantization backends otherwise an error is thrown. Test plan ``` python -m pytest test_quantization.py -k test_conv2d_add ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91152 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2023-02-01 01:11:25 +00:00
Xia, Weiwen	a5eb564ba4	[Quant] lower fused LinearTanh for onednn backend (#89188 ) Summary Add fuser method and quantization mappings for `QLinearLeakyReLU` for int8 inference for onednn backend. The fusion and lowering are supported only in FX mode. Test plan python test_quantization.py TestFuseFx TestQuantizeFx Pull Request resolved: https://github.com/pytorch/pytorch/pull/89188 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2022-12-20 01:30:21 +00:00
Xia, Weiwen	7b0ec67e34	[Quant][FX] Add backend config for onednn backend and fuse Linear-LeakyReLU (#88665 ) Summary Add backend config for onednn backend so that it can support more post op fusion for int8 inference. First `Linear - LeakyReLU` fusion is implemented based on previous PRs. Test plan python test_quantization.py TestFuseFx Pull Request resolved: https://github.com/pytorch/pytorch/pull/88665 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2022-12-17 03:33:08 +00:00
Xia, Weiwen	97e47a52b8	[Quant] Add fused linear-leaky_relu op for onednn backend (#88478 ) Summary Post op fusion can reduce data movement overhead and improve inference performance. This PR adds fused `linear-leaky_relu` op for `onednn` backend, which will be used for int8 inference with `onednn` backend. Cannot call this op with other quantization backends otherwise an error is thrown. Test Plan python test_quantization.py TestQuantizedLinear Pull Request resolved: https://github.com/pytorch/pytorch/pull/88478 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2022-12-06 08:32:59 +00:00
dzdang	ea81138bd6	[quant][improvement][better-engineering] Refactored get_supported_device_types into common_quantization.py (#79607 ) Summary: Both test_quantized_tensor.py and test_quantize_fx.py had the same get_supported_device_types function defined. This PR refactors it into the common_quantization.py file for common usage Test Plan: ``` python test/test_quantization.py ``` Differential Revision: [D37173692](https://our.internmc.facebook.com/intern/diff/D37173692) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79607 Approved by: https://github.com/jerryzh168	2022-09-23 23:32:18 +00:00
zaf	d32a762147	[quant][ao_migration] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` (#78714 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - [Documentation](docs/source/quantization-support.rst) @vkuzo - [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10 - [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo - [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a - [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714 Approved by: https://github.com/jerryzh168	2022-08-25 16:50:34 +00:00
zaf	c92e5ac95b	[quant][ao_migration] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` (#78713 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - Documentation @vkuzo - docs/source/conf.py - docs/source/quantization.rst - [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168 - [common test routine](test/quantization/ao_migration/common.py) @HDCharles - JIT stuff @jamesr66a - torch/csrc/jit/passes/hoist_conv_packed_params.cpp - torch/csrc/jit/passes/quantization/helper.h - torch/csrc/jit/serialization/import_source.cpp Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713 Approved by: https://github.com/jerryzh168	2022-08-25 16:50:33 +00:00
Sergii Dymchenko	591222f5d9	Fix use-dict-literal lint (#83718 ) Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718 Approved by: https://github.com/albanD	2022-08-24 00:26:46 +00:00
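A minimal illustration of the rewrite this lint asks for. The variable names and values are invented for the example and do not come from the PR.

```python
# Flagged by pylint's use-dict-literal: a constructor call where a literal would do.
options_call = dict()
named_call = dict(backend="onednn", bits=8)

# Preferred form: dict literals that build equal dictionaries.
options_literal = {}
named_literal = {"backend": "onednn", "bits": 8}

# Both spellings produce equal objects, so the change is behavior-preserving here.
assert options_call == options_literal
assert named_call == named_literal
```

The commit leaves test/jit/test_list_dict.py alone because there the `dict()` call itself appears to be what is under test.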
PyTorch MergeBot	6a9c02339d	Revert "[quant][ao_migration] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` (#78713 )" This reverts commit 432f037498e3f470f1f6d2a5cc7c6ae8eb4fc870. Reverted https://github.com/pytorch/pytorch/pull/78713 on behalf of https://github.com/janeyx99 due to Reverting for breaking (trunk-only) ios build	2022-08-22 07:32:37 +00:00
PyTorch MergeBot	b1a7b67529	Revert "[quant][ao_migration] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` (#78714 )" This reverts commit e6fb97d8ae0d2a45e26c9a597426f1ded13d3aec. Reverted https://github.com/pytorch/pytorch/pull/78714 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted	2022-08-22 07:30:48 +00:00
zaf	e6fb97d8ae	[quant][ao_migration] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` (#78714 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - [Documentation](docs/source/quantization-support.rst) @vkuzo - [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10 - [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo - [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a - [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714 Approved by: https://github.com/jerryzh168	2022-08-22 05:22:00 +00:00
zaf	432f037498	[quant][ao_migration] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` (#78713 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - Documentation @vkuzo - docs/source/conf.py - docs/source/quantization.rst - [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168 - [common test routine](test/quantization/ao_migration/common.py) @HDCharles - JIT stuff @jamesr66a - torch/csrc/jit/passes/hoist_conv_packed_params.cpp - torch/csrc/jit/passes/quantization/helper.h - torch/csrc/jit/serialization/import_source.cpp Differential Revision: [D36860145](https://our.internmc.facebook.com/intern/diff/D36860145/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713 Approved by: https://github.com/jerryzh168	2022-08-22 01:38:55 +00:00
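Because these migration commits were landed, reverted, and relanded, downstream code that has to run against both old and new checkouts may want to tolerate either namespace. The following is a sketch only: `import_first_available` is a hypothetical helper, not part of PyTorch, and the module paths in the usage comment are the ones named in the commit messages above.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(candidates: Sequence[str]) -> ModuleType:
    """Return the first importable module among `candidates` (hypothetical helper)."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Intended use, assuming PyTorch is installed (not executed here):
#   nnqd = import_first_available(
#       ["torch.ao.nn.quantized.dynamic", "torch.nn.quantized.dynamic"]
#   )
```

Listing the new `torch.ao.nn` path first means the old path is only tried when the new one is missing.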
Andrew Or	782f3489c6	[Quant][fx][bc-breaking] Integrate BackendConfig with quantization flow (part 2) (#82557 ) This is part 2 of the effort to replace `backend_config_dict` with a python config object, a more formal and robust API that leads to better user experience. This commit integrates the `BackendConfig` implemented in part 1 (https://github.com/pytorch/pytorch/pull/81469) with the existing FX graph mode quantization flow. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps BC-breaking Notes: Before: ``` import torch from torch.ao.quantization import get_default_qconfig_mapping from torch.ao.quantization.backend_config import ObservationType from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx dtype_config = { "input_dtype": torch.quint8, "output_dtype": torch.quint8, "weight_dtype": torch.qint8, "bias_dtype": torch.float, } backend_config_dict = { "name": "my_backend", "configs": [{ "pattern": torch.nn.Linear, "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT, "dtype_configs": [dtype_config], "root_module": torch.nn.Linear, "reference_quantized_module": torch.nn.quantized._reference.Linear, "qat_module": torch.nn.qat.Linear, }] } m = MyModel() qconfig_mapping = get_default_qconfig_mapping() example_inputs = (torch.rand(3, 3),) m = prepare_fx( m, qconfig_mapping, example_inputs, backend_config_dict=backend_config_dict) m = convert_fx(m, backend_config_dict=backend_config_dict) ``` After: ``` import torch from torch.ao.quantization import get_default_qconfig_mapping from torch.ao.quantization.backend_config import ( BackendConfig, BackendPatternConfig, DTypeConfig, ObservationType, ) from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx dtype_config = DTypeConfig( input_dtype=torch.quint8, output_dtype=torch.quint8, weight_dtype=torch.qint8, bias_dtype=torch.float, ) backend_config = BackendConfig("my_backend").set_backend_pattern_config( 
BackendPatternConfig(torch.nn.Linear) .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .add_dtype_config(dtype_config) .set_root_module(torch.nn.Linear) .set_reference_quantized_module(torch.nn.quantized._reference.Linear) .set_qat_module(torch.nn.qat.Linear)) m = MyModel() qconfig_mapping = get_default_qconfig_mapping() example_inputs = (torch.rand(3, 3),) m = prepare_fx(m, qconfig_mapping, example_inputs, backend_config=backend_config) m = convert_fx(m, backend_config=backend_config) ``` Reviewers: jerryzh168 Subscribers: jerryzh168, supriyar Differential Revision: [D38471932](https://our.internmc.facebook.com/intern/diff/D38471932) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82557 Approved by: https://github.com/jerryzh168	2022-08-08 18:55:50 +00:00
Andrew Or	c657c3d3ab	[Quant][fx] Rename convert_to_reference to convert_to_reference_fx (#81326 ) Summary: This commit renames the convert_to_reference function to convert_to_reference_fx, which is more descriptive and matches prepare_fx and prepare_qat_fx better. Test Plan: python test/test_quantization.py TestQuantizeFx Reviewers: jerryzh168 Subscribers: jerryzh168 Differential Revision: [D37787876](https://our.internmc.facebook.com/intern/diff/D37787876) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81326 Approved by: https://github.com/jerryzh168	2022-07-13 22:18:46 +00:00
Andrew Or	17104d3d7f	[Quant][fx][bc-breaking] Replace is_reference with convert_to_reference (#80091 ) Summary: This PR removes the is_reference flag from the existing convert_fx API and replaces it with a new convert_to_reference function. This separates (1) converting the prepared model to a reference model from (2) lowering the reference model to a quantized model, enabling users to call their custom lowering function for custom backends. For the native fbgemm backend, for example, the following are equivalent: ``` from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx prepared = prepare_fx(model, ...) quantized = convert_fx(prepared, ...) ``` ``` from torch.ao.quantization.fx import lower_to_fbgemm from torch.ao.quantization.quantize_fx import ( prepare_fx, convert_to_reference ) prepared = prepare_fx(model, ...) reference = convert_to_reference(prepared, ...) quantized = lower_to_fbgemm(reference, ...) ``` Note that currently `lower_to_fbgemm` takes in two other arguments that are difficult for users to provide. A future commit will remove these arguments to make the helper function more user friendly. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps Reviewers: jerryzh168, vkuzo Subscribers: jerryzh168, vkuzo Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091 Approved by: https://github.com/jerryzh168	2022-06-29 23:01:27 +00:00
Andrew Or	78144b9f35	[Quant][fx][bc-breaking] Replace custom_config_dict with config objects Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066 Following https://github.com/pytorch/pytorch/pull/78452, this commit replaces the following config dicts with python objects: - prepare_custom_config_dict -> PrepareCustomConfig - convert_custom_config_dict -> ConvertCustomConfig - fuse_custom_config_dict -> FuseCustomConfig This leads to better type safety and better user experience in notebook settings due to improved auto completion. The new APIs are as follows: ``` from torch.ao.quantization.fx.custom_config import PrepareCustomConfig prepare_custom_config = PrepareCustomConfig() \ .set_float_to_observed_mapping(float_class, observed_class) \ .set_non_traceable_module_names(["mod1", "mod2"]) \ .set_non_traceable_module_classes([class1, class2]) \ .set_input_quantized_indexes([0, 1]) \ .set_output_quantized_indexes([0]) \ .set_preserved_attributes(["attr1", "attr2"]) convert_custom_config = ConvertCustomConfig() \ .set_observed_to_quantized_mapping(observed_class, quantized_class) \ .set_preserved_attributes(["attr1", "attr2"]) model = prepare_fx( model, qconfig_mapping, example_inputs, prepare_custom_config=prepare_custom_config) model(data) model = convert_fx(model, convert_custom_config=convert_custom_config) ``` For backwards compatibility, prepare_fx, prepare_qat_fx, and convert_fx will continue to accept Dicts, which will be converted to the relevant CustomConfig object internally. Note that this commit does not modify existing tests to use the new API; they will continue to pass in Dicts as before, which still works but triggers a deprecation warning. This will be handled in a future commit. Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/) Approved by: https://github.com/jerryzh168	2022-06-16 17:50:07 +00:00
Jerry Zhang	8225f42a8a	[quant][fx][equalization] Fix example_inputs follow ups in test_equalize_fx Summary: as a followup to https://github.com/pytorch/pytorch/pull/76496, we defined model specific example_inputs for the test models in common_quantization.py and used these in test_equalize_fx Test Plan: python test/test_quantization.py TestEqualizeFx Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/78314 Approved by: https://github.com/vkuzo	2022-05-26 01:42:24 +00:00
Jerry Zhang	416899d1a9	[quant][fx][bc-breaking] Add required example_args argument to prepare_fx and prepare_qat_fx (#249 ) (#77608 ) Summary: X-link: https://github.com/facebookresearch/d2go/pull/249 X-link: https://github.com/fairinternal/ClassyVision/pull/104 X-link: https://github.com/pytorch/benchmark/pull/916 X-link: https://github.com/facebookresearch/ClassyVision/pull/791 X-link: https://github.com/facebookresearch/mobile-vision/pull/68 FX Graph Mode Quantization needs to know whether an fx node is a floating point Tensor before it can decide whether to insert observer/fake_quantize module or not, since we only insert observer/fake_quantize module for floating point Tensors. Currently we have some hacks to support this by defining some rules like NON_OBSERVABLE_ARG_DICT (https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/fx/utils.py#L496), but this approach is fragile and we do not plan to maintain it long term in the pytorch code base. As we discussed in the design review, we'd need to ask users to provide sample args and sample keyword args so that we can infer the type in a more robust way. This PR starts by changing the prepare_fx and prepare_qat_fx API to require users to provide example arguments through example_inputs. Note that this API doesn't support kwargs. kwargs can make https://github.com/pytorch/pytorch/pull/76496#discussion_r861230047 (comment) simpler, but that will be rare, and even then we can still work around it with positional arguments. Also, torch.jit.trace (https://pytorch.org/docs/stable/generated/torch.jit.trace.html) and ShapeProp (https://github.com/pytorch/pytorch/blob/master/torch/fx/passes/shape_prop.py#L140) just take positional args, so we'll use a single example_inputs argument for now. If needed, we can extend the API with an optional example_kwargs, e.g. 
in case when there are a lot of arguments for forward and it makes more sense to pass the arguments by keyword BC-breaking Note: Before: ```python m = resnet18(...) m = prepare_fx(m, qconfig_dict) # or m = prepare_qat_fx(m, qconfig_dict) ``` After: ```python m = resnet18(...) m = prepare_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),)) # or m = prepare_qat_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),)) ``` Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps python test/test_quantization.py TestQuantizeFxModels Imported from OSS Static Docs Preview: classyvision \|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D35984526/V30/classyvision/)\| \|Modified Pages\| Reviewed By: vkuzo, andrewor14 Differential Revision: D35984526 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77608 Approved by: https://github.com/dzdang	2022-05-21 21:03:48 +00:00
mingfeima	92a9c0e3e0	add channels last (2d) support for mkldnn_convolution (#55584 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55584 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D27941368 Pulled By: VitalyFedyunin fbshipit-source-id: 7dd6f02a5787efa1995f31cdbd3244b25653840c (cherry picked from commit bb555ed0fedafd529cb552807326384e95c90df9)	2022-04-20 22:34:44 +00:00
Thiago Crepaldi	9bbe1d632e	Fix ONNX ATen fallback for non-caffe2 engines This PR introduces 3 BC changes: First, this PR propagates `BUILD_CAFFE2` flag to `libtorch` and `libtorch_python`, which is necessary for non-caffe2 ONNX runtimes when using `ONNX_ATEN_FALLBACK` operator export type. Second, as a complement of https://github.com/pytorch/pytorch/pull/68490, this PR refactors Caffe2's ATen ops symbolics to consider not only the `operator_export_type` (aka `ONNX_ATEN_FALLBACK`) to emit Caffe2 ATen ops, but also whether `BUILD_CAFFE2` (which is called `torch.onnx._CAFFE2_ATEN_FALLBACK` in python binding) is set. Lastly, it renames `onnx::ATen` to `aten::ATen` for ONNX spec consistency in a BC fashion. ONNX doesn't have an `ATen` op in its spec, but the PyTorch ONNX converter emits them. Non-Caffe2 backend engines would be misled by such an operator's name/domain. A non-ideal workaround would be to have ATen ops handled based on their name and ignore the (non-compliant) domain. Moreover, users could incorrectly file bugs to either ONNX or ONNX Runtime when they inspect the model and notice the presence of an unspecified ONNX operator. Pull Request resolved: https://github.com/pytorch/pytorch/pull/73954 Approved by: https://github.com/BowenBao, https://github.com/malfet, https://github.com/garymm, https://github.com/jiafatom	2022-04-14 23:18:45 +00:00
Digant Desai	09f32eba7a	[quant] Add default symmetric qat qconfig for qnnpack (#74507 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74507 * This is the default symmetric qat qconfigs for qnnpack. * Support for symmetric quantization is not available from other backends. * Observers are similar to symmetric PTQ qconfigs for qnnpack. Reviewed By: jerryzh168 Differential Revision: D34804808 fbshipit-source-id: 22c11b89242a98f54029ac195f7b984e42809164 (cherry picked from commit ea751ded1174ba2c2f061bafc81573faaf248a9a)	2022-03-24 16:19:28 +00:00
Jerry Zhang	7ddf212f33	[quant][fx] Fully align convert with the reference model design and simplify the implementation (#73863 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863 This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first, and then lowering the model to a quantized model that is runnable with a PyTorch native backend (fbgemm/qnnpack). This PR makes convert.py much easier to understand than the previous implementation, and we are able to remove the majority of the code in quantization_patterns.py as well (in followup PRs). Test Plan: ``` python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps python test/test_quantization.py TestFXNumericSuiteCoreAPIs python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels ``` and other internal/oss regression tests Imported from OSS Reviewed By: andrewor14 Differential Revision: D34778506 fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b (cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)	2022-03-11 17:11:30 +00:00
Andrew Or	e118d6e59f	Add lowering path for LinearReLU module (#71427 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427 This commit adds a lowering path for the LinearReLU modules in static quantization mode. This includes torch.nn.qat.Linear, torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU. Future commits will add support for dynamic quantization and functional LinearReLU. Test Plan: python test/test_quantization.py TestQuantizeFxOps.test_linear_module Imported from OSS Reviewed By: george-qi Differential Revision: D33694742 fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036 (cherry picked from commit b3f607de439f2ba7c0a03ad1ac494127685cbf4e)	2022-02-01 19:31:31 +00:00
Jerry Zhang	082ff25f37	[reland][bc-breaking][quant][be] Refactor fuser_method to include `is_qat` argument" (#71956 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71956 Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/59 Original commit changeset: f3912e210e8c Original Phabricator Diff: D33178977 (`ef501e8fed`) Test Plan: Please see original diff for test plans Static Docs Preview: classyvision \|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D33833203/V3/classyvision/)\| \|Modified Pages\| Reviewed By: andrewor14 Differential Revision: D33833203 fbshipit-source-id: 74a8f22730b00aafa6a173b208e635c1d696959e (cherry picked from commit fb88772b18b26141be11f3885af6294eb1bc8466)	2022-01-31 23:02:22 +00:00
Nikita Shulga	56511f859a	Revert D33178977: [bc-breaking][quant][be] Refactor fuser_method to include `is_qat` argument Test Plan: revert-hammer Differential Revision: D33178977 (`ef501e8fed`) Original commit changeset: 0c1499c45526 Original Phabricator Diff: D33178977 (`ef501e8fed`) fbshipit-source-id: f3912e210e8c588fdbdc9c3c5f4acf2aa8fe6678 (cherry picked from commit cd62183414e757b6012522aee01442e818b7b06d)	2022-01-27 03:29:40 +00:00
Jerry Zhang	ef501e8fed	[bc-breaking][quant][be] Refactor fuser_method to include `is_qat` argument (#70009 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009 Currently we rely on module.training to decide whether to do a QAT fusion or a PTQ fusion. This is not ideal, since the training flag has nothing to do with quantization, so this PR introduces an extra flag `is_qat` to control this. Note: currently we still have the constraint that when `is_qat` is True, the modules must be in training mode; we can relax this constraint later. Test Plan: ``` python test/test_quantization.py TestFuseFx python test/test_quantization.py TestFusion ``` Imported from OSS Static Docs Preview: classyvision \|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D33178977/V36/classyvision/)\| \|Modified Pages\| Reviewed By: mruberry Differential Revision: D33178977 fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770 (cherry picked from commit 2d51f9fb28967f1c5aab260d84b8d32d838f4f26)	2022-01-26 23:33:28 +00:00
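The idea in this commit, choosing the fused module from an explicit `is_qat` argument instead of from `module.training`, can be sketched with stand-in classes. Everything below (`ToyLinear`, `ToyReLU`, the two fused classes, and `toy_fuse_linear_relu`) is hypothetical and only mirrors the shape of the change; it is not PyTorch's implementation.

```python
from dataclasses import dataclass


@dataclass
class ToyLinear:
    """Stand-in for a float module; only the training flag matters here."""
    training: bool = False


@dataclass
class ToyReLU:
    training: bool = False


@dataclass
class ToyFusedLinearReLU:
    """Stand-in for the module produced by a PTQ fusion."""
    linear: ToyLinear
    relu: ToyReLU


@dataclass
class ToyQATFusedLinearReLU:
    """Stand-in for the module produced by a QAT fusion."""
    linear: ToyLinear
    relu: ToyReLU


def toy_fuse_linear_relu(is_qat: bool, linear: ToyLinear, relu: ToyReLU):
    """Pick the fused type from `is_qat`, not from `linear.training`."""
    if is_qat:
        # Constraint stated in the commit message: QAT fusion still
        # requires the modules to be in training mode.
        if not (linear.training and relu.training):
            raise ValueError("is_qat=True requires modules in training mode")
        return ToyQATFusedLinearReLU(linear, relu)
    return ToyFusedLinearReLU(linear, relu)


# A module left in training mode still gets a PTQ fusion when is_qat=False.
fused = toy_fuse_linear_relu(False, ToyLinear(training=True), ToyReLU(training=True))
```

Making the flag explicit separates "is this module being trained" from "which quantization flow is running".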
Terry Chen	ce3215db70	Fix nnq.dropout in vision mobilenetv3 pretrain model (#71438 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71438 Fix issue https://github.com/pytorch/vision/issues/5198 skip observer for nn.dropout to load pretrain model Test Plan: python -c "import torchvision; torchvision.models.quantization.mobilenet_v3_large(pretrained=True, quantize=True)" Imported from OSS Reviewed By: HDCharles Differential Revision: D33641707 fbshipit-source-id: 14ea26557c4ff3b942cf46bf06610db0b8f06b05 (cherry picked from commit 0b8b178d261431e604165f2419e95a32c7ecc6b2)	2022-01-22 00:02:48 +00:00
Terry Chen	e7c87e8b44	[quant] fix dropout in FX graph mode quantization (#71043 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71043 fix issue #68250 dropout break fx graph model quantization Test Plan: python test/test_quantization.py TestStaticQuantizedModule Imported from OSS Reviewed By: vkuzo Differential Revision: D33490176 fbshipit-source-id: 155546505b28ffc635ada65a1464b9d622dbc235	2022-01-13 15:59:59 -08:00
Jon Morton	123be0e5b7	[fusion] Add ConvTranspose+BN fusion support (#70022 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70022 Add support for fusing ConvTranspose{1,2,3}d with BatchNorm{1,2,3}d. This re-uses the existing fusion logic but adds a "transpose" flag to the fusing function which, when enabled, will use the appropriate reshape for ConvTranspose's transposed weights. Test Plan: `buck test mode/dev //caffe2/test:quantization -- -r quantization.eager.test_fusion.TestFusion` Reviewed By: jerryzh168 Differential Revision: D33074405 fbshipit-source-id: 5e9eff1a06d8f98d117e7d18e80da8e842e973b7	2021-12-20 18:42:48 -08:00
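To show why a "transpose" flag changes only how the per-channel BatchNorm scale is applied, here is a pure-Python sketch of folding BatchNorm into the preceding weights. It assumes 1x1 kernels and groups=1, so a weight is just a 2-D list: `[out][in]` for a regular convolution and `[in][out]` for a transposed one. The function name and signature are illustrative and are not PyTorch's fusing function.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def fold_bn_into_weights(
    weight: Matrix,
    bias: List[float],
    bn_mean: List[float],
    bn_var: List[float],
    bn_gamma: List[float],
    bn_beta: List[float],
    eps: float = 1e-5,
    transpose: bool = False,
) -> Tuple[Matrix, List[float]]:
    """Fold y -> gamma * (y - mean) / sqrt(var + eps) + beta into weight and bias."""
    # One scale per output channel.
    scale = [g / math.sqrt(v + eps) for g, v in zip(bn_gamma, bn_var)]
    if transpose:
        # Layout [in][out]: the output channel is the column index.
        fused_w = [[w * scale[o] for o, w in enumerate(row)] for row in weight]
    else:
        # Layout [out][in]: the output channel is the row index.
        fused_w = [[w * scale[o] for w in row] for o, row in enumerate(weight)]
    fused_b = [
        (b - m) * s + beta
        for b, m, s, beta in zip(bias, bn_mean, scale, bn_beta)
    ]
    return fused_w, fused_b
```

The bias update is identical in both layouts; only the axis along which the scale is broadcast differs, which is what the reshape mentioned in the commit message selects.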
Jerry Zhang	5db711f9d3	[quant][be] Replace QConfigDynamic with QConfig in code (#69864 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864 att, will have a follow up PR that removes QConfigDynamic in the api Test Plan: regression tests ``` python test/test_quantization.py TestPostTrainingStatic python test/test_quantization.py TestPostTrainingDynamic python test/test_quantization.py TestQuantizeFx ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D33073235 fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db	2021-12-17 22:30:57 -08:00
Jerry Zhang	f575179953	[quant][fx][graphmode] Move more patterns to use ModuleReLU fuse handler (#69644 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69644 This PR cleans up the init of ModuleReLUFuseHandler and moved all `module - relu` fusion pattern to use this handler also disabled additional_fuser_method argument temporarily, will enable after we bring back the simple pattern format Test Plan: ``` python test/test_quantize_fx.py TestFuseFx ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D32974906 fbshipit-source-id: 23483ea4293d569cb3cec6dadfefd4d9f30921a7	2021-12-11 22:00:06 -08:00
Ben Koopman	6c9cf5e6ea	[quant][embedding qat] eager mode QAT for Embeddings (#66429 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66429 Test Plan: Imported from OSS Reviewed By: HDCharles, supriyar Differential Revision: D31618284 Pulled By: b-koopman fbshipit-source-id: 0c0e2e86b98da9f29e9b2fc2a35c59424f94cbba	2021-11-18 05:57:11 -08:00
andrewor	4a8f27445d	[Quant] Add dynamic QAT Linear module (#67325 ) Summary: Summary: This commit adds the `torch.nn.qat.dynamic.modules.Linear` module, the dynamic counterpart to `torch.nn.qat.modules.Linear`. Functionally these are very similar, except the dynamic version expects a memoryless observer and is converted into a dynamically quantized module before inference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67325 Test Plan: `python3 test/test_quantization.py TestQuantizationAwareTraining.test_dynamic_qat_linear` Reviewers: Charles David Hernandez, Jerry Zhang Subscribers: Charles David Hernandez, Supriya Rao, Yining Lu Tasks: 99696812 Tags: pytorch Reviewed By: malfet, jerryzh168 Differential Revision: D32178739 Pulled By: andrewor14 fbshipit-source-id: 5051bdd7e06071a011e4e7d9cc7769db8d38fd73	2021-11-08 10:24:25 -08:00
Ben Koopman	0036e41143	[quant][embedding qat] Add eager QAT test for EmbeddingBag+Linear model (#66334 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66334 Test Plan: Imported from OSS Reviewed By: HDCharles Differential Revision: D31618283 Pulled By: b-koopman fbshipit-source-id: bb824a341f1aa9d7e83f8e66d320a9dfd348a1d7	2021-10-19 07:03:36 -07:00
Teng Zhang	40794dbb25	add backend_config_dict to checkGraphModeFxOp (#66499 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66499 Test Plan: Imported from OSS Reviewed By: jerryzh168 Differential Revision: D31582518 Pulled By: rahxephon89 fbshipit-source-id: b8107bb7140517f2dc32bf692c6b916536ea35c3	2021-10-12 18:35:54 -07:00
Supriya Rao	8a974a482c	[quant] Add support for quantization of Embedding{Bag} in dynamic quant APIs (#65674 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674 Before this PR, users had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules. With this PR they can use either the static or dynamic quantization APIs for Embedding quantization. The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig, which is currently enforced in the from_float method of the quantized Embedding/EmbeddingBag modules. To combine embedding quantization with Linear dynamic quantization, users can use the qconfig_dict to specify a different qconfig for each module type. The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that users need to ensure that inputs to Embedding ops are FP32. Addresses Issue #65185 ghstack-source-id: 139935419 Test Plan: python test/test_quantization.py Imported from OSS Reviewed By: gchanan Differential Revision: D31211199 fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4	2021-10-06 23:19:38 -07:00
Zafar Takhirov	c151d62f45	[quant] AO migration of the `quant_types.py` (phase 1) (#64916 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64916 AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly. This migrates the quant_type.py from torch.quantization to torch.ao.quantization. At this point both locations will be supported. Eventually the torch.quantization will be deprecated. Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization` Reviewed By: vkuzo Differential Revision: D30898422 fbshipit-source-id: 3e6126b49f0565a4136d6928cea9eb25368927ff	2021-09-15 17:30:00 -07:00
Vasiliy Kuznetsov	1577c106dc	torch.ao migration: numeric suite, eager and fx (#64817 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64817 This migrates `torch.quantization._numeric_suite` to `torch.ao.ns._numeric_suite`, and `torch.quantization._numeric_suite_fx` to `torch.ao.ns._numeric_suite_fx`. 1. move the files ``` HG: move eager mode hg mv caffe2/torch/quantization/_numeric_suite.py caffe2/torch/ao/ns/ HG: move fx hg mv caffe2/torch/quantization/_numeric_suite_fx.py caffe2/torch/ao/ns/ hg mv caffe2/torch/quantization/ns/* caffe2/torch/ao/ns/fx/ ``` 2. create new versions of `_numeric_suite.py` and `_numeric_suite_fx.py` with imports 3. update all FB callsites Test Plan: buck test mode/dev //caffe2/test:quantization Reviewed By: z-a-f Differential Revision: D30867538 fbshipit-source-id: 120ee830434ca490c1183a187a518eebcbbaf22c	2021-09-12 12:00:45 -07:00
Supriya Rao	c7027f19ef	[quant][fx] Add support for dynamic linear + relu fusion (INT8) (#63799 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63799 Add a new module that can be used for module swap with the nni.LinearReLU module in convert function. Supports INT8 currently (since FP16 op doesn't have relu fusion yet). Fixes #55393 Test Plan: python test/test_quantization.py test_dynamic_fusion Imported from OSS Reviewed By: heitorschueroff Differential Revision: D30502812 fbshipit-source-id: 3668e4f001a0626d469e17ac323acf582ee28a51	2021-08-26 21:10:46 -07:00
Philip Meier	99203580a9	Updates internal `assert_allclose` callsites in favor of `assert_close` (#61841 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61841 Redo of #60863. Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D30408145 Pulled By: mruberry fbshipit-source-id: 0b34ebc7f23ba38ecd89640b61d8aca59b7eab58	2021-08-19 12:50:41 -07:00
Jerry Zhang	ddd916c210	[quant][refactor] Return the models in checkGraphModeFxOp for further checking (#62487 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62487 checkGraphModeFxOp is our utility test function that quantizes a given model with FX Graph Mode Quantization and checks whether the resulting model contains the expected ops. Previously it only returned a result on the sample data for the quantized model; this PR changes it to return the prepared, quantized, and quantized_reference models together with the result for the quantized model. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: iramazanli Differential Revision: D30053981 fbshipit-source-id: 31fbce48d138261d0b00ba24e1427fd0c6208990	2021-08-03 10:12:16 -07:00
Jerry Zhang	7507aeded5	[reland][bc-breaking] reference option for linear produce a pattern instead of reference linear module (#61892 ) (#62277 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62277 This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of convert in the future. Test Plan: python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: ejguan Differential Revision: D29941079 fbshipit-source-id: 84bdfc0bb872c34fc345875e545c8b323e77c41e	2021-07-27 15:46:44 -07:00
Erjia Guan	8cdf16d1de	Revert D29810657: [bc-breaking] reference option for linear produce a pattern instead of reference linear module Test Plan: revert-hammer Differential Revision: D29810657 (`9df605133e`) Original commit changeset: 949615bbc017 fbshipit-source-id: 54597d1f9636b0f94ae01c66018ff2592e5c39fc	2021-07-27 10:10:13 -07:00
Jerry Zhang	9df605133e	[bc-breaking] reference option for linear produce a pattern instead of reference linear module (#61892 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61892 This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of convert in the future. Test Plan: python test/test_quantization.py TestQuantizeFxOps Imported from OSS Reviewed By: vkuzo Differential Revision: D29810657 fbshipit-source-id: 949615bbc017bc454d81c8a6b2bdec53badaab19	2021-07-27 09:49:20 -07:00
Vasiliy Kuznetsov	07c6a12008	ns for fx: fix typing issue in weight extraction (#62041 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62041 Before this PR, weights of conv and linear modules were extracted as lists, in order to match the signature of LSTM weights. After this PR, weight extraction preserves the type of the weights, so extracted weights of conv and linear have a different type from LSTM weights. The comparison util functions are updated to handle the LSTM weight type of `List[tensor]`. Test Plan: ``` python test/test_quantization.py TestFXNumericSuiteCoreAPIs python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D29853626 fbshipit-source-id: 93da5b9b0b174679c61528d02b6b902cb064444e	2021-07-23 09:31:33 -07:00
Vitaly Fedyunin	b60d1b713e	Revert D26007050: add channels last support for thnn_conv2d (non-dilated) Test Plan: revert-hammer Differential Revision: D26007050 (`8b88c24670`) Original commit changeset: 1289e0687c24 fbshipit-source-id: 88b679efbcae572fe604d50e2199861cadbc3d4a	2021-07-22 08:31:15 -07:00
mingfeima	8b88c24670	add channels last support for thnn_conv2d (non-dilated) (#49582 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49582 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D26007050 Pulled By: VitalyFedyunin fbshipit-source-id: 1289e0687c2459dd4eb8e4ba2efc8266397cfe5f	2021-07-20 12:50:24 -07:00
Angela Yi	0751a41ab1	[quant] Input-Weight Equalization - ConvReLU support (#61350 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61350 Applied changes in convert to allow for ConvReLU2d layers Initial Model: `x -> conv1 -> relu` After fusion: `x -> convRelu2d` After prepare: `x -> input_quant_obs -> input_eq_obs1 -> convRelu2d -> output_quant_obs1` After equalization functions: `x -> mul -> input_quant_obs (scaled) -> convRelu2d -> output_quant_obs` After convert: `x -> mul -> quantize_per_tensor -> quantized::convRelu2d -> dequantize` Test Plan: `python test/test_quantization.py TestEqualizeFx` Initial Model: ``` ConvReluModel( (fc): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1)) (relu): ReLU() ) ``` After prepare: ``` GraphModule( (x_activation_post_process_0): MinMaxObserver(min_val=5.960464477539063e-08, max_val=0.9999999403953552) (x_activation_post_process_0_equalization_process_0): _InputEqualizationObserver( (input_obs): PerChannelMinMaxObserver(min_val=tensor([1.1921e-07, 3.3379e-06, 5.9605e-08]), max_val=tensor([1.0000, 1.0000, 1.0000])) ) (fc): ConvReLU2d( (0): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1)) (1): ReLU() ) (fc_activation_post_process_0): MinMaxObserver(min_val=0.0, max_val=1.2341605424880981) ) graph(): %x : [#users=1] = placeholder[target=x] %x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {}) %x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {}) %fc : [#users=1] = call_module[target=fc](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {}) %fc_activation_post_process_0 : [#users=1] = call_module[target=fc_activation_post_process_0](args = (%fc,), kwargs = {}) return fc_activation_post_process_0 ``` After equalization functions: ``` graph(): %x : [#users=1] = placeholder[target=x] 
%x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0] %mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {}) %x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%mul,), kwargs = {}) %fc : [#users=1] = call_module[target=fc](args = (%x_activation_post_process_0,), kwargs = {}) %fc_activation_post_process_0 : [#users=1] = call_module[target=fc_activation_post_process_0](args = (%fc,), kwargs = {}) return fc_activation_post_process_0 ``` After convert: ``` graph(): %x : [#users=1] = placeholder[target=x] %x_equalization_scale0 : [#users=1] = get_attr[target=x_equalization_scale0] %mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_scale0), kwargs = {}) %fc_input_scale_0 : [#users=1] = get_attr[target=fc_input_scale_0] %fc_input_zero_point_0 : [#users=1] = get_attr[target=fc_input_zero_point_0] %quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %fc_input_scale_0, %fc_input_zero_point_0, torch.quint8), kwargs = {}) %fc : [#users=1] = call_module[target=fc](args = (%quantize_per_tensor,), kwargs = {}) %dequantize : [#users=1] = call_method[target=dequantize](args = (%fc,), kwargs = {}) return dequantize ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D29638275 fbshipit-source-id: 40d4666a4451e132612ea38fdfeaaec177a1defb	2021-07-13 14:00:40 -07:00
Angela Yi	77d36b657a	[quant] Input-Weight Equalization - Conv prepare support (#61286 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61286 Modifies the prepare step to support conv layers during input-weight equalization and adds tests to make sure that the results are as expected. Initial: ``` w \| x -> conv -> y ``` After prepare: ``` w \| weight_quant_obs \| weight_eq_obs \| x -> input_quant_obs -> input_eq_obs -> conv -> out_quant_obs -> y ``` Test Plan: `python test/test_quantization.py TestEqualizeFx.test_input_weight_equalization_prepare` Initial: ``` ConvModel( (conv): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1), bias=False) ) ``` After prepare: ``` graph(): %x : [#users=1] = placeholder[target=x] %x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_0](args = (%x,), kwargs = {}) %x_activation_post_process_0_equalization_process_0 : [#users=1] = call_module[target=x_activation_post_process_0_equalization_process_0](args = (%x_activation_post_process_0,), kwargs = {}) %conv : [#users=1] = call_module[target=conv](args = (%x_activation_post_process_0_equalization_process_0,), kwargs = {}) %conv_activation_post_process_0 : [#users=1] = call_module[target=conv_activation_post_process_0](args = (%conv,), kwargs = {}) return conv_activation_post_process_0 ``` Imported from OSS Reviewed By: supriyar Differential Revision: D29557051 fbshipit-source-id: 25d1531645dfaf565f5c615e2ee850fcf96c7eb9	2021-07-13 14:00:36 -07:00

188 Commits