This PR introduces 3 BC changes:
First, this PR propagates the `BUILD_CAFFE2` flag to `libtorch` and `libtorch_python`, which is necessary for non-Caffe2 ONNX runtimes when using the `ONNX_ATEN_FALLBACK` operator export type.
Second, as a complement to https://github.com/pytorch/pytorch/pull/68490, this PR refactors Caffe2's ATen op symbolics so that they consider not only the `operator_export_type` (aka `ONNX_ATEN_FALLBACK`) when emitting Caffe2 ATen ops, but also whether `BUILD_CAFFE2` (exposed in Python as `torch.onnx._CAFFE2_ATEN_FALLBACK`) is set.
Lastly, it renames `onnx::ATen` to `aten::ATen` for ONNX spec consistency, in a backward-compatible fashion.
ONNX does not have an `ATen` op in its spec, but the PyTorch ONNX converter emits one anyway. Non-Caffe2 backend engines would be misled by such an operator's name/domain. A non-ideal workaround would be to handle ATen ops based on their name and ignore the (non-compliant) domain. Moreover, users could incorrectly file bugs against ONNX or ONNX Runtime when they inspect the model and notice the presence of an unspecified ONNX operator.
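For context, a minimal export sketch that exercises this operator export type (the toy model and output file name below are placeholders, not part of this PR):
```
import torch

model = torch.nn.Linear(3, 3)
# With ONNX_ATEN_FALLBACK, any op lacking a native ONNX mapping is emitted as
# an ATen fallback node instead of failing the export.
torch.onnx.export(
    model,
    torch.randn(2, 3),
    "model_with_aten_fallback.onnx",
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
)
```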
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73954
Approved by: https://github.com/BowenBao, https://github.com/malfet, https://github.com/garymm, https://github.com/jiafatom
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74507
* These are the default symmetric QAT qconfigs for qnnpack (a sketch follows this list).
* Support for symmetric quantization is not available from other backends.
* Observers mirror the symmetric PTQ qconfigs for qnnpack.
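A rough sketch of how such a qconfig is assembled; the exact observer arguments of the real qnnpack symmetric QAT qconfigs may differ, so treat the values below as illustrative:
```
import torch
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver, QConfig

# Illustrative symmetric QAT qconfig in the spirit of the qnnpack defaults:
# unsigned 8-bit affine activations, signed 8-bit symmetric weights.
symmetric_qat_qconfig = QConfig(
    activation=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver,
        quant_min=0,
        quant_max=255,
        dtype=torch.quint8,
        qscheme=torch.per_tensor_affine,
        reduce_range=False,
    ),
    weight=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver,
        quant_min=-127,
        quant_max=127,
        dtype=torch.qint8,
        qscheme=torch.per_tensor_symmetric,
        reduce_range=False,
    ),
)
```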
Reviewed By: jerryzh168
Differential Revision: D34804808
fbshipit-source-id: 22c11b89242a98f54029ac195f7b984e42809164
(cherry picked from commit ea751ded1174ba2c2f061bafc81573faaf248a9a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of convert by always producing a reference quantized model (with reference patterns) first,
and then lowering that model to a quantized model that is runnable with the PyTorch native backend (fbgemm/qnnpack).
This makes convert.py much easier to understand than the previous implementation, and lets us remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
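For orientation, a minimal FX graph mode flow that goes through this convert path (toy model and calibration data; newer releases also require an `example_inputs` argument to `prepare_fx`):
```
import torch
from torch.ao.quantization import get_default_qconfig
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

float_model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
qconfig_dict = {"": get_default_qconfig("fbgemm")}

prepared = prepare_fx(float_model, qconfig_dict)
prepared(torch.randn(8, 4))        # calibration
quantized = convert_fx(prepared)   # reference model is produced internally, then lowered
```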
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427
This commit adds a lowering path for the LinearReLU modules
in static quantization mode. This includes torch.nn.qat.Linear,
torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU.
Future commits will add support for dynamic quantization and functional
LinearReLU.
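As an illustration of the module pair being lowered, a small eager-mode fusion sketch (the Sequential and its child module names are placeholders):
```
import torch
import torch.nn.intrinsic as nni
from torch.ao.quantization import fuse_modules

m = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
fused = fuse_modules(m, [["0", "1"]])
# the fused pair is what convert can lower to a single quantized LinearReLU
assert isinstance(fused[0], nni.LinearReLU)
```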
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: george-qi
Differential Revision: D33694742
fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036
(cherry picked from commit b3f607de439f2ba7c0a03ad1ac494127685cbf4e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009
Currently we rely on `module.training` to decide whether to do QAT fusion or PTQ fusion. This is
not ideal since the training flag has nothing to do with quantization, so this PR introduces an extra flag, `is_qat`,
to control it.
Note: we still have the constraint that when `is_qat` is True the modules must be in training mode; we
can relax this constraint later.
Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```
Imported from OSS
**Static Docs Preview: classyvision**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D33178977/V36/classyvision/)|
|**Modified Pages**|
Reviewed By: mruberry
Differential Revision: D33178977
fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28967f1c5aab260d84b8d32d838f4f26)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70022
Add support for fusing ConvTranspose{1,2,3}d with BatchNorm{1,2,3}d. This re-uses the existing fusion logic but adds a "transpose" flag to the fusing function which, when enabled, uses the appropriate reshape for ConvTranspose's transposed weights.
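A small sketch of the folding this enables, assuming the new `transpose` flag is exposed on `torch.nn.utils.fusion.fuse_conv_bn_weights` as described above:
```
import torch
from torch.nn.utils.fusion import fuse_conv_bn_weights

ct = torch.nn.ConvTranspose2d(3, 6, kernel_size=2)
bn = torch.nn.BatchNorm2d(6).eval()

# transpose=True applies the reshape needed for ConvTranspose's
# (in_channels, out_channels, ...) weight layout when folding BN statistics.
fused_w, fused_b = fuse_conv_bn_weights(
    ct.weight, ct.bias,
    bn.running_mean, bn.running_var, bn.eps, bn.weight, bn.bias,
    transpose=True,
)
```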
Test Plan: `buck test mode/dev //caffe2/test:quantization -- -r quantization.eager.test_fusion.TestFusion`
Reviewed By: jerryzh168
Differential Revision: D33074405
fbshipit-source-id: 5e9eff1a06d8f98d117e7d18e80da8e842e973b7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69864
As titled; a follow-up PR will remove QConfigDynamic from the API.
Test Plan:
regression tests
```
python test/test_quantization.py TestPostTrainingStatic
python test/test_quantization.py TestPostTrainingDynamic
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D33073235
fbshipit-source-id: 6c1a1647032453803c55cdad7c04154502f085db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69644
This PR cleans up the init of ModuleReLUFuseHandler and moves all `module - relu`
fusion patterns to use this handler.
It also temporarily disables the additional_fuser_method argument; we will re-enable it
after we bring back the simple pattern format.
Test Plan:
```
python test/test_quantize_fx.py TestFuseFx
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32974906
fbshipit-source-id: 23483ea4293d569cb3cec6dadfefd4d9f30921a7
Summary:
**Summary:** This commit adds the `torch.nn.qat.dynamic.modules.Linear`
module, the dynamic counterpart to `torch.nn.qat.modules.Linear`.
Functionally these are very similar, except the dynamic version
expects a memoryless observer and is converted into a dynamically
quantized module before inference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67325
Test Plan:
`python3 test/test_quantization.py TestQuantizationAwareTraining.test_dynamic_qat_linear`
**Reviewers:** Charles David Hernandez, Jerry Zhang
**Subscribers:** Charles David Hernandez, Supriya Rao, Yining Lu
**Tasks:** 99696812
**Tags:** pytorch
Reviewed By: malfet, jerryzh168
Differential Revision: D32178739
Pulled By: andrewor14
fbshipit-source-id: 5051bdd7e06071a011e4e7d9cc7769db8d38fd73
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65674
Before this PR, users had to use the eager mode static quantization APIs to quantize Embedding/EmbeddingBag modules.
With this PR they can use either the static or dynamic quantization APIs for Embedding quantization.
The only qconfig supported for embedding quantization is float_qparams_weight_only_qconfig, which is currently enforced in the from_float
method of the quantized Embedding/EmbeddingBag modules.
To combine embedding quantization with Linear dynamic quantization, users can use the qconfig_dict to specify a different qconfig for each module type.
The prepare/convert APIs can still be used to quantize Embeddings, with the caveat that users need to ensure the inputs to Embedding ops are FP32.
Addresses Issue #65185
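A sketch of the combined flow described above, using the eager-mode dynamic API with a per-type qconfig spec (the toy model is illustrative):
```
import torch
from torch.ao.quantization import (
    default_dynamic_qconfig,
    float_qparams_weight_only_qconfig,
    quantize_dynamic,
)

class EmbeddingWithLinear(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(1000, 16)
        self.fc = torch.nn.Linear(16, 4)

    def forward(self, indices, offsets):
        return self.fc(self.emb(indices, offsets))

# weight-only qconfig for the embedding, dynamic qconfig for the linear
qconfig_spec = {
    torch.nn.EmbeddingBag: float_qparams_weight_only_qconfig,
    torch.nn.Linear: default_dynamic_qconfig,
}
quantized = quantize_dynamic(EmbeddingWithLinear().eval(), qconfig_spec)
```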
ghstack-source-id: 139935419
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: gchanan
Differential Revision: D31211199
fbshipit-source-id: 8c747881caee5ccbf8b93c6704b08d132049dea4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64916
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quant_type.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
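During the migration window both import paths resolve to the same objects, along the lines of what the TestAOMigrationQuantization tests check:
```
from torch.ao.quantization.quant_type import QuantType as AOQuantType
from torch.quantization.quant_type import QuantType as LegacyQuantType

# the legacy location simply re-exports the torch.ao.quantization implementation
assert AOQuantType is LegacyQuantType
```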
Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization`
Reviewed By: vkuzo
Differential Revision: D30898422
fbshipit-source-id: 3e6126b49f0565a4136d6928cea9eb25368927ff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63799
Add a new module that can be used for module swap with the nni.LinearReLU module in the convert function.
Supports INT8 currently (since the FP16 op doesn't have relu fusion yet).
Fixes #55393
Test Plan:
python test/test_quantization.py test_dynamic_fusion
Imported from OSS
Reviewed By: heitorschueroff
Differential Revision: D30502812
fbshipit-source-id: 3668e4f001a0626d469e17ac323acf582ee28a51
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62487
checkGraphModeFxOp is our utility test function that quantizes a given model with FX Graph Mode Quantization
and checks whether the resulting model contains the expected ops. Previously it only returned the result of running the
quantized model on the sample data; this PR changes it to return the prepared, quantized, and quantized_reference models together with the result
for the quantized model.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: iramazanli
Differential Revision: D30053981
fbshipit-source-id: 31fbce48d138261d0b00ba24e1427fd0c6208990
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62277
This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of
convert in the future.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: ejguan
Differential Revision: D29941079
fbshipit-source-id: 84bdfc0bb872c34fc345875e545c8b323e77c41e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61892
This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of
convert in the future.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D29810657
fbshipit-source-id: 949615bbc017bc454d81c8a6b2bdec53badaab19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62041
Before this PR, weights of conv and linear modules were extracted
as lists, in order to match the signature of LSTM weights.
After this PR, weight extraction preserves the type of the weights,
so extracted weights of conv and linear have a different type
from LSTM weights. The comparison util functions are updated to
handle the LSTM weight type of `List[tensor]`.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29853626
fbshipit-source-id: 93da5b9b0b174679c61528d02b6b902cb064444e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60476
# Context
Add tests for Lite modules that are quantized using fx API
Read these posts for details about why we need a test bench for quantized Lite modules:
https://fb.workplace.com/groups/2322282031156145/permalink/4289792691071726/
https://github.com/pytorch/pytorch/pull/60226#discussion_r654615851
Moved common code to `caffe2/torch/testing/_internal/common_quantization.py`.
ghstack-source-id: 133144292
Test Plan:
```
[~/fbsource/fbcode] buck test caffe2/test:fx_quantization_lite
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss
Building: finished in 8.3 sec (100%) 11892/11892 jobs, 2 updated
Total time: 8.6 sec
More details at https://www.internalfb.com/intern/buck/build/ffb7d517-d85e-4c8f-9531-5e5d9ca1d34c
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: d79a5713-bd29-4bbf-ae76-33a413869a09
Trace available for this run at /tmp/tpx-20210630-105547.675980/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/3096224749578707
✓ ListingSuccess: caffe2/test:fx_quantization_lite - main (9.423)
✓ Pass: caffe2/test:fx_quantization_lite - test_embedding (mobile.test_quantize_fx_lite_script_module.TestFuseFx) (10.630)
✓ Pass: caffe2/test:fx_quantization_lite - test_submodule (mobile.test_quantize_fx_lite_script_module.TestFuseFx) (12.464)
✓ Pass: caffe2/test:fx_quantization_lite - test_conv2d (mobile.test_quantize_fx_lite_script_module.TestFuseFx) (12.728)
Summary
Pass: 3
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/3096224749578707
```
Reviewed By: iseeyuan
Differential Revision: D29306402
fbshipit-source-id: aa481e0f696b7e9b04b9dcc6516e8a390f7dc1be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60779
When we do fusion, we replace certain modules (such as Linear + ReLU) with fused versions (such as LinearReLU) by calling `_fuse_fx` in prepare_fx. However when we try to look up using the fused module type in qconfig_dict, we cannot find a match anymore since the qconfig dict contains the original module types. An example is here [N882873](https://fburl.com/anp/azenjx3v).
So we will now update the qconfig_dict to include the fused modules mapping to the qconfigs used for the modules that make up the fused modules. If the modules are not mapped to the same qconfig, then we will raise an error.
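For concreteness, a sketch of the qconfig_dict situation being fixed (toy model; Linear and ReLU must map to the same qconfig so the LinearReLU produced by `_fuse_fx` can still be looked up; newer releases also pass `example_inputs` to `prepare_fx`):
```
import torch
from torch.ao.quantization import get_default_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig = get_default_qconfig("fbgemm")
qconfig_dict = {
    "": None,
    "object_type": [
        (torch.nn.Linear, qconfig),
        (torch.nn.ReLU, qconfig),  # must agree with the Linear qconfig, otherwise an error is raised
    ],
}
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
prepared = prepare_fx(model, qconfig_dict)
```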
Test Plan:
`python test/test_quantization.py TestFuseFx.test_qconfig_fused_module`
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29406941
fbshipit-source-id: 74b5db89f4998aeb02b2bf7c37bf97326580c654
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60378
Created the following unit-tests to check that our equalization algorithm is as expected:
- Check the equalization scales calculated and stored in the graph are as expected
- Check the scaled weights and biases are as expected
- Check that the min/max values in the quantization observers are as expected
- Check that the graphs with equalization are structured in the same way as graphs without equalization (except that equalized graphs have additional equalization scale and mul nodes) before and after quantization
Test Plan:
`python test/test_quantization TestEqualizeFx.test_input_weight_equalization_equalization_scales`
`python test/test_quantization TestEqualizeFx.test_input_weight_equalization_weights_bias`
`python test/test_quantization TestEqualizeFx.test_input_activation_values`
`python test/test_quantization TestEqualizeFx.test_input_weight_equalization_graphs`
Imported from OSS
Reviewed By: supriyar
Differential Revision: D29406942
fbshipit-source-id: 518208546ae5835c1ebb2af217507e90af66fbe4
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45687
The fix changes the input size check for `InstanceNorm*d` to be more restrictive and correctly reject sizes with only a single spatial element, regardless of batch size, to avoid infinite variance.
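A quick illustration of the tightened check (the exact error type and message may vary across versions):
```
import torch

inorm = torch.nn.InstanceNorm2d(3)
inorm(torch.randn(4, 3, 5, 5))  # fine: 25 spatial elements per channel

try:
    inorm(torch.randn(4, 3, 1, 1))  # now rejected: one spatial element gives undefined variance
except (ValueError, RuntimeError) as err:  # error type may differ across versions
    print(err)
```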
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56659
Reviewed By: pbelevich
Differential Revision: D27948060
Pulled By: jbschlosser
fbshipit-source-id: 21cfea391a609c0774568b89fd241efea72516bb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54813
Previously we had a cat that takes a list of Tensors with different qparams, dequantizes them,
concatenates them, and requantizes with the output qparams. This adds unnecessary overhead from dequantizing
and requantizing Tensors.
This PR adds an optimization for the cat operator: we make sure the inputs and output of cat
use the same observer/fake_quant and produce a cat that does not do rescaling.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27408377
fbshipit-source-id: 6a4bdcfd15e57ea1fe0f7e72d1e1288eb3ece4db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56194
Enables the NS graph matcher to also match `call_method` nodes.
These are useful for ops such as `torch.sigmoid`.
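For reference, the kind of node this refers to: tensor-method calls trace to `call_method` nodes in FX, as opposed to the `call_function` node produced by `torch.sigmoid`:
```
import torch
import torch.fx

class M(torch.nn.Module):
    def forward(self, x):
        return x.sigmoid()  # traced as a call_method node with target "sigmoid"

traced = torch.fx.symbolic_trace(M())
print([(node.op, node.target) for node in traced.graph.nodes])
# e.g. [('placeholder', 'x'), ('call_method', 'sigmoid'), ('output', 'output')]
```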
Test Plan:
```
python test/test_quantization.py TestFXGraphMatcher.test_methods
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27805333
fbshipit-source-id: 509ae283db6b245671f11e3eb6b7fcb3a5735ef5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54253
Creates an `NSSubgraph` type for representing a subgraph instance,
and modifies the NS code to use it. This will enable us to add
more information to the subgraph instance definition without
having to change all the callsites.
Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestFXGraphMatcher
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27158198
fbshipit-source-id: 548785dd90144e2da256c23af990620c778e7cfe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53779
Moves the test case for LSTM activation matching to new NS APIs.
This requires adding the ability to log non-Tensor types.
Since we need Loggers to be scriptable and TorchScript does
not support `Union`, we collect statistics in a separate collector
if we have an RNN. Note: this can scale to a small N of
return types, but not to a large N. If the N becomes large in
the future, we will solve it then.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26967110
fbshipit-source-id: afe60b44fdec28a328813b4f342cf4fe04820baa
Summary:
This PR implements the option to log inputs for FX Numeric Suite. The user-facing API looks like
```
def prepare_model_outputs(..., should_log_inputs : bool = False)
def prepare_model_with_stubs(..., should_log_inputs : bool = False)
```
The output data now looks like
```
{
"layer1": {
"node_inputs": {
"model1": [{
"values": ...,
...,
}],
},
"node_outputs": {
...,
}
},
... // other layers
}
```
One key design decision taken here is that an input logger logs the output of previous nodes, instead of logging the input of the current node. This matters for a signature such as `cat([x1, x2, x3])`. We are inserting three input loggers here (for x1, x2, and x3), instead of a single input logger for `[x1, x2, x3]`. This was chosen in order to preserve the structure of the original graph as much as possible and keep flexibility for future optimizations.
Test Plan:
TODO: fill out
Imported from OSS
Differential Revision: D26931225
Reviewed By: hx89
Pulled By: vkuzo
fbshipit-source-id: dd692bfb5ddaaf5554f80c25e2f40b21762e4fc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52779
1. makes the return type of the weight comparison APIs match the return
type of the activation comparison APIs:
```
# before
{layer_name: {model_name: weight_tensor}}
{layer_name: {model_name: [activation_tensor]}}
# after
{layer_name: {model_name: [weight_tensor]}}
{layer_name: {model_name: [activation_tensor]}}
```
2. makes a type alias for the type, so future changes are easier
Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```
Imported from OSS
Reviewed By: hx89
Differential Revision: D26652639
fbshipit-source-id: eb1f04d6913cedf88d628f362468875ae9ced928
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179
Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26424656
fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52130
We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu). So, we need to match subgraphs.
This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear). An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c
TODO(before review) write up better, not ready for review yet
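To make the subgraph definition concrete, a toy traced graph in which (F.linear, F.relu) forms one two-node subgraph, with the linear call as start_node and the relu call as end_node:
```
import torch
import torch.fx
import torch.nn.functional as F

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(4, 4))
        self.bias = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        return F.relu(F.linear(x, self.weight, self.bias))

traced = torch.fx.symbolic_trace(M())
print(traced.graph)  # the linear node feeds directly into the relu node
```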
Test Plan:
TODO before land: better test plan
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26403092
fbshipit-source-id: e49aaad4b02b8d60589435848bee422b8f41937a