Summary:
The planned end-to-end (e2e) flow for quantization in PyTorch 2.0 export is the following:
float_model -> prepare_pt2e -> calibration -> convert_pt2e -> ...
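As a rough illustration, the flow maps to code like the sketch below; the capture API, import paths, and the quantizer argument are assumptions here and have moved around between releases:
```
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

def quantize(float_model, example_inputs, calibration_data):
    # Capture the model as a graph (the capture API is an assumption).
    exported = torch.export.export(float_model, example_inputs).module()
    quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
    prepared = prepare_pt2e(exported, quantizer)   # insert observers
    for inputs in calibration_data:
        prepared(*inputs)                          # calibration
    return convert_pt2e(prepared)                  # q/dq representation
```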
Inside convert_pt2e, we will first produce a q/dq representation of the quantized model, similar to the previous output of
convert_to_reference_fx in FX graph mode quantization:
```
torch.ops.quantized_decomposed.dequantize_per_tensor -> torch.ops.aten.add -> torch.ops.quantized_decomposed.quantize_per_tensor
torch.ops.quantized_decomposed.dequantize_per_tensor /
```
Then we'll rewrite the above into a representation that expresses the intent more precisely: we actually want to do
int8 addition rather than simulate it with fp32 operations. The representation for
quantized add is:
```
def quantized_add(x_i8, x_scale, x_zero_point, y_i8, y_scale, y_zero_point, out_scale, out_zero_point):
    # Rescale both inputs into the output quantization domain.
    x = (x_scale / out_scale) * x_i8
    y = (y_scale / out_scale) * y_i8
    out = x + y
    # Subtract the rescaled contribution of both input zero points,
    # then shift by the output zero point.
    out -= (x_zero_point * x_scale + y_zero_point * y_scale) / out_scale
    out += out_zero_point
    return out
```
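A quick numeric sanity check of the formula above (a sketch: it mirrors the plain fp32 simulation with no rounding or clamping, and the helper names are hypothetical):
```
import torch

def dq(t, scale, zp):
    # Dequantize: map quantized values back to fp32.
    return (t.to(torch.float32) - zp) * scale

def q(t, scale, zp):
    # Quantize without rounding/clamping, matching the fp32 simulation.
    return t / scale + zp

x = torch.randint(-128, 128, (4,), dtype=torch.int8)
y = torch.randint(-128, 128, (4,), dtype=torch.int8)
expected = q(dq(x, 0.5, 1) + dq(y, 0.25, 2), 0.75, 3)
actual = quantized_add(x.float(), 0.5, 1, y.float(), 0.25, 2, 0.75, 3)
torch.testing.assert_close(actual, expected)
```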
Test Plan:
```
buck2 test caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_representation_add (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)'
```
Reviewed By: kimishpatel
Differential Revision: D45628032
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104130
Approved by: https://github.com/kimishpatel
Summary:
Previously we assumed asymmetric quantization for dynamic quantization; this diff adds support for symmetric quantization
of the input in dynamic quantization.
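For context, a minimal sketch of how per-tensor int8 qparams differ between the two schemes (the actual qparams computation in this diff may differ in details):
```
import torch

def choose_qparams_asymmetric(x, qmin=-128, qmax=127):
    # Affine/asymmetric: map [min, max] (extended to include 0) onto [qmin, qmax].
    min_val = min(x.min().item(), 0.0)
    max_val = max(x.max().item(), 0.0)
    scale = max((max_val - min_val) / (qmax - qmin), 1e-9)
    zero_point = qmin - round(min_val / scale)
    return scale, int(zero_point)

def choose_qparams_symmetric(x, qmax=127):
    # Symmetric: zero_point is fixed at 0; scale comes from the absolute max.
    scale = max(x.abs().max().item() / qmax, 1e-9)
    return scale, 0
```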
Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"
Reviewed By: digantdesai
Differential Revision: D43134794
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94854
Approved by: https://github.com/digantdesai
Summary:
This PR tries to decompose the operators in the torch.ops.quantized_decomposed namespace into more
primitive aten operators. This frees us from maintaining the semantics of the quantize/dequantize
operators ourselves, since they can be expressed more precisely in terms of the underlying aten operators.
Note: this PR just adds them to the decomposition table; we haven't enabled this by default yet.
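For illustration, a hedged sketch of what such decompositions look like in terms of primitive aten ops (the registered decompositions may differ in details such as rounding mode and dtype handling):
```
import torch

def quantize_per_tensor_decomp(x, scale, zero_point, quant_min, quant_max, dtype):
    # round(x / scale) + zero_point, clamped to the quantized range.
    inv_scale = 1.0 / scale
    return torch.clamp(
        torch.round(x * inv_scale) + zero_point, quant_min, quant_max
    ).to(dtype)

def dequantize_per_tensor_decomp(x, scale, zero_point, quant_min, quant_max, dtype):
    # (x - zero_point) * scale, computed in fp32.
    return (x.to(torch.float32) - zero_point) * scale
```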
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_q_dq_decomposition
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93312
Approved by: https://github.com/vkuzo, https://github.com/SherlockNoMad
This reverts commit 59071ab1e71891d480ab77af0d619bc5e01094c2.
It breaks `quantization.jit.test_ondevice_quantization.TestOnDeviceDynamicPTQFinalize`, which is not run in OSS, but is mandatory for internal CI.
Summary: Only the pattern part; the delegation example will be left to Chen.
Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"
Reviewed By: cccclai
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90640
Approved by: https://github.com/cccclai
When you are writing a meta function, you cannot call item() on a tensor, because a meta tensor carries no real data and the call will fail. The error message was not very good in this case; see also https://github.com/pytorch/pytorch/issues/89959
This PR takes a brute force approach to resolving the problem: just manually define meta implementations for the naughty functions that are calling item(). However, this results in a lot of code duplication. The easiest way to avoid this situation is to rewrite the decomps so they don't call item(). It should not be that difficult to use tensors directly in these operations, since scalar tensors can broadcast too.
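A hedged illustration of that rewrite (function and argument names are hypothetical): keep scale/zero_point as 0-dim tensors instead of extracting Python scalars, so the same decomp also runs under meta tensors.
```
import torch

def dequantize_bad(x_i8, scale, zero_point):
    # Fails for meta tensors: item() requires real data.
    return (x_i8.to(torch.float32) - zero_point.item()) * scale.item()

def dequantize_meta_safe(x_i8, scale, zero_point):
    # 0-dim tensors broadcast, so no item() call is needed.
    return (x_i8.to(torch.float32) - zero_point) * scale
```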
I could only test this with `buck test @mode/opt -c python.package_style=inplace //executorch/backends/test:test_backends` in internal with D41555454. Test coverage needs to be improved, otherwise don't blame us when we break you.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89958
Approved by: https://github.com/jerryzh168
Summary: The op exposed should be one that computes the qparams; and since there are concerns about prims not being supported, make the q and dq ops take the quantization parameters in as Tensors.
Test Plan: unit test
Differential Revision: D41382580
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89236
Approved by: https://github.com/jerryzh168
Summary:
Added q/dq implementation for an out-of-core (decomposed) quantized Tensor representation, meaning that
instead of storing quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we store
the quantization parameters as arguments of the operators:
```
quantize(float32_tensor, scale, zero_point, dtype) -> int8_tensor
dequantize(int8_tensor, scale, zero_point, dtype) -> float32_tensor
```
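A hedged usage sketch: the quantization parameters travel as explicit operator arguments rather than living on a quantized Tensor subclass (the exact overload names and argument order are assumptions):
```
import torch

x = torch.randn(4)
scale, zero_point = 0.1, 0
x_i8 = torch.ops.quantized_decomposed.quantize_per_tensor(
    x, scale, zero_point, -128, 127, torch.int8
)
x_fp = torch.ops.quantized_decomposed.dequantize_per_tensor(
    x_i8, scale, zero_point, -128, 127, torch.int8
)
```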
Test Plan:
python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize
python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87093
Approved by: https://github.com/dzdang, https://github.com/z-a-f