Summary:
During the convert step, observers are first replaced by a Q-DQ pair. In some
scenarios, such as the following, the output DQ node has a fan-out:
                ---> OP2 -> Q -> DQ
               /
OP -> Q -> DQ -
               \
                ---> OP3 -> Q -> DQ
If either OP2 or OP3 is configured to be quantized, then its input is
expected to be quantized. In that case, the quantized equivalent of a
pattern that the quantizer asked to be quantized should look like
[DQ -> {pattern} -> Q]. However, in a scenario like the one above, where the
DQ node is shared between multiple "quantized" patterns, the boundary of a
"quantized" pattern is not clear, because the DQ node now belongs to multiple
quantized patterns.
This poses a challenge for:
- Porting metadata: it is ambiguous which "quantized" partition the DQ node belongs to.
- The quantized representation: it likewise needs to identify a self-contained
quantized pattern that can be replaced by an equivalent pattern performing the
compute in the quantized precision.
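As a rough illustration of the fix exercised by test_duplicate_dq_pass below, here is a minimal torch.fx sketch that duplicates any shared dequantize node so each consumer pattern gets its own DQ. The helper name and the dequantize check are simplifications for illustration, not the actual pass:
```
from torch.fx import GraphModule, Node


def _is_dequantize(node: Node) -> bool:
    # Simplified check for this sketch; a real pass would match the
    # torch.ops.quantized_decomposed dequantize overloads explicitly.
    return node.op == "call_function" and "dequantize" in str(node.target)


def duplicate_shared_dq(gm: GraphModule) -> GraphModule:
    """Hypothetical helper: give every user of a shared DQ node its own copy."""
    graph = gm.graph
    for node in list(graph.nodes):
        if not _is_dequantize(node) or len(node.users) <= 1:
            continue
        users = list(node.users)
        # Keep the original DQ for the first user; each remaining user gets a
        # clone, so every quantized pattern starts at a dedicated DQ node.
        for user in users[1:]:
            with graph.inserting_after(node):
                dq_copy = graph.node_copy(node)
            user.replace_input_with(node, dq_copy)
    graph.lint()
    gm.recompile()
    return gm
```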
Test Plan:
test_duplicate_dq_pass
Differential Revision: [D48663147](https://our.internmc.facebook.com/intern/diff/D48663147)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107900
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14, https://github.com/leslie-fang-intel
ghstack dependencies: #107105, #107106, #107899
**Summary**
Add linear and linear-unary post-op quantization recipes to the X86 Inductor quantizer, for PT2E with Inductor. With this, the quantization path will insert the `quant-dequant` pattern for linear and linear-unary post ops.
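For reference, a minimal sketch of how the new recipe would be exercised end to end. It assumes the `capture_pre_autograd_graph` entry point from `torch._export` (the capture API differs across releases) and a made-up linear + ReLU model:
```
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
import torch.ao.quantization.quantizer.x86_inductor_quantizer as xiq


class LinearReLU(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 8)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        return self.relu(self.linear(x))  # linear + unary post op


m = LinearReLU().eval()
example_inputs = (torch.randn(2, 16),)

# Capture, annotate with the x86 Inductor recipe, calibrate, convert.
m = capture_pre_autograd_graph(m, example_inputs)
quantizer = xiq.X86InductorQuantizer()
quantizer.set_global(xiq.get_default_x86_inductor_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)   # calibration: observers record activation statistics
m = convert_pt2e(m)  # rewrites to the quant-dequant pattern around linear
```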
**Test Plan**
```
python test/test_quantization.py -k test_linear_with_quantizer_api
python test/test_quantization.py -k test_linear_unary_with_quantizer_api
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106781
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #105818
Summary:
Currently, in quantizer/quantize_pt2e we import things from specific quantizers (XNNPACKQuantizer, QuantizationConfig, etc.).
This PR removes those imports so it is clearer that they are not part of the core quantization code base.
This PR also removes get_supported_operators from the main Quantizer class, since we haven't seen a clear need for this API.
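As an illustration of the resulting usage (module paths below assume the torch.ao.quantization.quantizer layout), backend-specific pieces are imported from their own modules rather than through quantize_pt2e:
```
# Core PT2E entry points stay in quantize_pt2e ...
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

# ... while backend-specific quantizers and configs come from their own module.
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
```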
Test Plan:
CIs
Imported from OSS
Differential Revision: D48340367
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107259
Approved by: https://github.com/kimishpatel
**Summary**
Re-enable the test cases `test_conv2d_binary_with_quantizer_api` and `test_conv2d_binary_unary_with_quantizer_api` for X86InductorQuantizer. We previously disabled these two test cases due to a timeout issue in internal CI.
**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary_with_quantizer_api
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary_unary_with_quantizer_api
```
Differential Revision: [D47745372](https://our.internmc.facebook.com/intern/diff/D47745372)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105638
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
Summary: Move the quantizer to torch.ao.quantization to make it a public API, since pt2e is a folder for implementations.
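For illustration, a minimal sketch of the now-public extension point (the class and method bodies are hypothetical; only the annotate/validate interface comes from the Quantizer base class):
```
from torch.fx import GraphModule
from torch.ao.quantization.quantizer import Quantizer


class MyBackendQuantizer(Quantizer):
    """Hypothetical backend quantizer built on the public Quantizer API."""

    def annotate(self, model: GraphModule) -> GraphModule:
        # A real quantizer walks the graph here and attaches quantization
        # annotations to the nodes it wants quantized.
        return model

    def validate(self, model: GraphModule) -> None:
        # Sanity-check the annotated model.
        pass
```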
Test Plan:
CIs
sanity check: "buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18"
Differential Revision: D47727838
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105885
Approved by: https://github.com/andrewor14
**Summary**
Reduce the test time of `test_conv2d_binary_with_quantizer_api` and `test_conv2d_binary_unary_with_quantizer_api`.
* For `test_conv2d_binary_with_quantizer_api`, reduce the number of test configs from 12 to 2.
* For `test_conv2d_binary_unary_with_quantizer_api`, reduce the number of test configs from 24 to 2.
**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary_with_quantizer_api
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary_unary_with_quantizer_api
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104686
Approved by: https://github.com/jerryzh168