Summary:
tl;dr: rewrites the FX graph mode quantization observer insertion to be easier to understand and extend.
The key conceptual difference from before is:
* before: for each node, observers are always inserted at the output of the current node, even if they are only needed by the next node. This is hard to reason about.
* after: for each node, observers are inserted at the inputs (if needed, as determined by the dtype of the argument and the dtype of the current node) and at the output (if needed for the type of pattern and qconfig). No knowledge of future nodes is needed to insert observers for the current node (see the sketch below).
This allows us to significantly simplify various things:
* all new observers needed for a node are inserted together. This makes it easier to understand and debug things. We add an invariant that node X will never change any observers inserted by any preceding or subsequent node, so to debug an issue the user can just understand what is happening for node X, without having to understand what happens before or after it.
* all of the state tracking in activation_post_process_map and activation_post_process_indices is removed; instead, observers are looked up by graph traversal
* since there is no longer a need for overlapping graph passes which mutate each other's intermediate state, it is easier to understand what the rules are for inserting observers, and to create new rules in the future.
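A minimal sketch of the per-node rule described above (not the actual implementation; `needs_input_observer` and `needs_output_observer` are hypothetical placeholders for the dtype/pattern/qconfig checks the prepare pass performs):
```
import torch
import torch.fx as fx

def needs_input_observer(node: fx.Node, arg: fx.Node) -> bool:
    # hypothetical placeholder: compare the dtype produced by `arg`
    # with the dtype expected by `node`
    return node.op == "call_function"

def needs_output_observer(node: fx.Node) -> bool:
    # hypothetical placeholder: decided by the matched pattern and its qconfig
    return node.op == "call_function"

def plan_observers(gm: fx.GraphModule):
    plan = []
    for node in gm.graph.nodes:
        # all decisions are local to `node`: look at its inputs, then its output
        for arg in node.all_input_nodes:
            if needs_input_observer(node, arg):
                plan.append(("input_observer", node.name, arg.name))
        if needs_output_observer(node):
            plan.append(("output_observer", node.name))
    return plan

class M(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x + 1)

print(plan_observers(fx.symbolic_trace(M())))
```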
Test Plan:
```
# all OSS tests pass
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Differential Revision: D28241864
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 950d58972d26362808564cc0a2dfb30413a3734d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57470
Removes the earlier hack where patterns originally matched
to BinaryOpQuantizeHandler were switched to CopyHandler. After this PR,
each pattern can only be matched to one type of QuantizeHandler, or
to nothing.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28152909
fbshipit-source-id: afc285e770bd7eb0518c90e3ee4874c421e78bbc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57393
Moves the decision about whether the output should be marked as
quantized based on the inputs to live
on the qhandler object. This allows us to remove
FixedQParamsOpQuantizeHandler from quantize.py, further reducing
the coupling between handler objects and the quantization pass.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: astaff
Differential Revision: D28132414
fbshipit-source-id: 5c28524b47c00f618d3a38657376abae9e6ffe7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57388
It's a bit confusing to have this be a decorator. It's simpler to
just expose it as a function on qhandler.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28129411
fbshipit-source-id: f7316f285e8546c67e8d8cf753462b2c2abb2636
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57377
Moves the logic which determines
1. whether a pattern instance's output should be observed
2. whether a pattern instance's output should be marked as observed based on its inputs
3. whether to override the activation specified in the qconfig
from `quantize.py` to `quantization_patterns.py`. This makes
the code easier to read and reduces the coupling between `Quantizer`
and `QuantizeHandler` instances.
Note: there are some further cleanups which would be good after this one
- leaving those for future PRs.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28126896
fbshipit-source-id: 94c80a9c7307452783348d65b402acc84983e3f6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54924
Previously we were producing torch.ops.quantize.cat, which takes the inputs, dequantizes them,
and requantizes them with new qparams. This PR changes that to produce torch.cat directly; torch.cat
will assume all inputs share the same qparams, and it will produce a quantized Tensor with
the same qparams as its inputs (because the previous PR makes sure all inputs and the output of cat share
the same observer/fakequant instance).
Using torch.cat is expected to be more efficient since it does not introduce extra quant/dequant ops.
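An illustrative sketch of the end-to-end flow this changes, using the FX quantization API of this era (signatures have since changed); the exact contents of the converted graph may differ by version:
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class CatModel(torch.nn.Module):
    def forward(self, x, y):
        return torch.cat([x, y], dim=1)

m = CatModel().eval()
prepared = prepare_fx(m, {"": get_default_qconfig("fbgemm")})
# calibration: the inputs to cat share one observer, so they end up with the same qparams
prepared(torch.randn(1, 2, 4, 4), torch.randn(1, 2, 4, 4))
quantized = convert_fx(prepared)
# after this PR the converted graph is expected to call torch.cat directly
print(quantized.graph)
```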
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27416528
fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56004
Added reference pattern support for GELU, softmax, and bmm for int dtypes. For GELU and softmax, this consisted of adding reference patterns to the default node handler for int dtypes. Note that the GELU and softmax patterns are not registered since they do not have a proper quantized kernel, which means they would either add unnecessary dequant and quant ops to the network or simply error. This can be circumvented with custom qconfig usage, as in test_gelu_reference.
bmm was added within binary ops, along with some significant changes to how that code is structured. Theoretically, the reference pattern used for bmm could be applied to other dtypes. This was not enabled because of issues relating to Line 1323 in quantize.py. In essence, the prepare step does not know whether an op will use a reference pattern or not, so for ops that are supported with one dtype via the reference pattern and another dtype normally, this has the potential to cause issues. This is difficult to get around without the is_reference flag being available in the prepare step or the discussed changes around separating
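For context, qconfig_dict supports assigning qconfigs per op type via the `object_type` key; a minimal sketch of that mechanism is below (the specific custom qconfig that test_gelu_reference uses to enable the GELU reference pattern is not reproduced here, and the API signatures are those of this era):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)
        self.gelu = torch.nn.GELU()

    def forward(self, x):
        return self.gelu(self.linear(x))

m = M().eval()
qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    # per-op-type control: keep GELU in fp32 since there is no quantized kernel
    "object_type": [(torch.nn.GELU, None)],
}
prepared = prepare_fx(m, qconfig_dict)
prepared(torch.randn(2, 8))  # calibrate
quantized = convert_fx(prepared)
```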
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_gelu_reference
python test/test_quantization.py TestQuantizeFxOps.test_gelu_normal
python test/test_quantization.py TestQuantizeFxOps.test_softmax_reference
python test/test_quantization.py TestQuantizeFxOps.test_softmax_normal
python test/test_quantization.py TestQuantizeFxOps.test_silu_reference
python test/test_quantization.py TestQuantizeFxOps.test_bmm_int_reference
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxModels
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D27818340
fbshipit-source-id: de65be0797035463cd2d1b0e4677d1a87f69143c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55311
Before this PR, `F.conv1d` was matched by FX graph mode quant patterns
but the prepacking was happening inline. There was also a bug with
argument type mismatch.
This PR fixes both issues and adds a test. Thanks jerryzh168 for the
code tip.
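A minimal sketch of the functional conv1d flow this PR fixes, using the FX quantization API of this era (module and tensor shapes are illustrative):
```
import torch
import torch.nn.functional as F
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class FunctionalConv1d(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # weight shape: (out_channels, in_channels, kernel_size)
        self.weight = torch.nn.Parameter(torch.randn(4, 3, 3))
        self.bias = torch.nn.Parameter(torch.zeros(4))

    def forward(self, x):
        return F.conv1d(x, self.weight, self.bias)

m = FunctionalConv1d().eval()
prepared = prepare_fx(m, {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(1, 3, 8))  # calibrate
quantized = convert_fx(prepared)  # prepacking now handled by the conv pattern, not inline
```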
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_functional_not_reference
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27575422
fbshipit-source-id: 42301e23cb101a9e64e46800813bc771317e233e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55727
The number of dequantize ops for the fp16 reference pattern was incorrect before; this
PR fixes the problem.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27713390
fbshipit-source-id: 72b8d4cda0bdcea74abe27a76f918d1b47819b01
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429
Previously we special-cased the copy operator in the normal observer insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27609972
fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388
Temporarily revert D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.
Reviewed By: houseroad
Differential Revision: D27583809
fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644
Previously we special-cased the copy operator in the normal observer insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27314678
fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586
Previously one value could only be quantized to one dtype. This PR adds support for quantizing one value
in the fx graph with multiple dtypes, e.g. first quantizing to int8 and then to float16.
We might do some follow-up PRs to clean up the hacks and refactor the code.
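A hedged sketch in the same spirit, using the `module_name` key of qconfig_dict so the same value feeds an int8 consumer and an fp16 consumer (the canonical example is test_multiple_qconfigs_single_value; the exact supported combinations and API signatures are those of this era):
```
import torch
from torch.quantization import default_qconfig, float16_dynamic_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class TwoBranches(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = torch.nn.Linear(4, 4)
        self.linear2 = torch.nn.Linear(4, 4)

    def forward(self, x):
        # the same value `x` is consumed under two different qconfigs
        return self.linear1(x) + self.linear2(x)

m = TwoBranches().eval()
qconfig_dict = {
    "module_name": [
        ("linear1", default_qconfig),          # int8 static
        ("linear2", float16_dynamic_qconfig),  # fp16 dynamic
    ],
}
prepared = prepare_fx(m, qconfig_dict)
prepared(torch.randn(2, 4))  # calibrate
quantized = convert_fx(prepared)
```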
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912676
fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614
Ensures that every subclass of `QuantizeHandler` has a clear name. This
prevents ambiguous names like `Cat`, which looks like a module but is
really a quantize handler.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26914784
fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196
Before this PR, code patterns like this did not work:
```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```
The reason this did not work is that `torch.sum` is treated as
"quantized" because of the newly added fp16 support, even though it is
not actually "quantized" for models where fp16 is not used. We may
need to adjust the concept of "quantized vs non-quantized" into a
"dtype" for the longer term fix.
The current PR is a hacky fix to unblock. We need to clean things
up before this is landable
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26783960
fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120
Currently there is a pattern which is not handled correctly by
FX graph mode quantization:
```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc
    x = torch.sub(x, ndim)
    return x
```
The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor
For now, we hack in a band-aid to unblock some teams; none of this is meant
to land as-is. We will have to think of a better fix which is landable (TBD).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26756180
fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585
Previously fp16_static CopyNode would be marked as unquantized because of
an incorrect condition check of whether a Node is statically quantized or not.
This PR fixes that.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912677
fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002
The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003
Reviewed By: mrshenli
Differential Revision: D26325953
Pulled By: jerryzh168
fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651
Merging them for easier extensions to fp16 and more binary ops
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26600118
fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends
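For reference, this op is what FX graph mode produces when a Linear is quantized with the fp16 dynamic qconfig; a minimal sketch of that flow, using the API signatures of this era (the exact lowered form in the converted model is backend-dependent):
```
import torch
from torch.quantization import float16_dynamic_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)

m = M().eval()
prepared = prepare_fx(m, {"": float16_dynamic_qconfig})
quantized = convert_fx(prepared)
# the converted model currently uses the fbgemm/qnnpack-backed dynamic fp16 linear
print(quantized.graph)
```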
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179
Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.
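Illustrative usage after the rename, assuming the flag is exposed as `is_reference` on `convert_fx` (before this change it was the `debug` flag); API signatures are those of this era:
```
import copy
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

m = M().eval()
prepared = prepare_fx(m, {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(2, 4))  # calibrate

# backend-specific quantized model
quantized = convert_fx(copy.deepcopy(prepared))
# reference quantized model: the common interface between PyTorch and backends
reference = convert_fx(copy.deepcopy(prepared), is_reference=True)
```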
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26424656
fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52413
TODO: We'll need to add this guard for other ops as well
(Note: this ignores all push blocking failures!)
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_mul_add_fp16_config
Imported from OSS
Reviewed By: supriyar
Differential Revision: D26503348
fbshipit-source-id: 5aaba518742a516cc3521fd5f23f1a264d2973e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51259
Store the FQN of the module that is using the packed weights (the quantized op).
In the case of fusion, we update the scope mapping to store the module path of the fused node.
Test Plan:
python test/test_quantization.py test_packed_weight_fused_op
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26117964
fbshipit-source-id: 9d929997baafb1c91063dd9786a451b0040ae461
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171
Following up on the previous PR, this PR registers the scale and zero_point for quantize_per_tensor
as buffers in the module.
Currently the dtype is still stored as an attribute (not registered as a buffer) since we can only register tensor types.
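A minimal sketch of the mechanism (not the generated code): registering the qparams as buffers makes them part of state_dict and updatable, while the dtype stays a plain attribute:
```
import torch

class QuantizeInput(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # buffers: show up in state_dict and can be updated after tracing
        self.register_buffer("scale_0", torch.tensor(0.05))
        self.register_buffer("zero_point_0", torch.tensor(128, dtype=torch.int64))
        # dtype is not a Tensor, so it stays a plain attribute
        self.dtype_0 = torch.quint8

    def forward(self, x):
        return torch.quantize_per_tensor(
            x, float(self.scale_0), int(self.zero_point_0), self.dtype_0
        )

m = QuantizeInput()
print(list(m.state_dict().keys()))  # ['scale_0', 'zero_point_0']
print(m(torch.randn(2, 2)))
```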
Test Plan:
python test/test_quantization.py test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092964
fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166
Currently scale and zero_point values are stored as constant values in the graph.
This prevents these values from being updated in the graph and also does not enable saving
these values to state_dict.
After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.
We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
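A hedged FX sketch of the mechanism described above (hand-built graph, not the actual pass; the buffer names follow the FQN-based naming idea, and the `.item()` calls are only there to keep the sketch runnable across versions):
```
import torch
import torch.fx as fx

root = torch.nn.Module()
# qparams live as buffers on the root module, named after the owning module's FQN
root.register_buffer("layer1_scale_0", torch.tensor(0.1))
root.register_buffer("layer1_zero_point_0", torch.tensor(0, dtype=torch.int64))

graph = fx.Graph()
x = graph.placeholder("x")
scale = graph.get_attr("layer1_scale_0")       # get_attr instead of a baked-in constant
zp = graph.get_attr("layer1_zero_point_0")
scale_val = graph.call_method("item", (scale,))
zp_val = graph.call_method("item", (zp,))
q = graph.call_function(torch.quantize_per_tensor, (x, scale_val, zp_val, torch.quint8))
graph.output(graph.call_method("dequantize", (q,)))

gm = fx.GraphModule(root, graph)
print(gm.code)
print(gm(torch.randn(2, 2)))
```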
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092965
fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058
This PR adds the support for {input/output}_quantized_idxs for standalone module.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module will expect float
input and produce float output, and it will quantize the input and dequantize the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module will expect quantized
input and produce quantized output; the input will be quantized in the parent module, and the output will be dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.
For more details, please see the test case
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25768910
fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49717
Quantization of `ConvTranspose{n}d` is supported in Eager mode. This PR
adds the support for FX graph mode.
Note: this currently only works in `qnnpack` because per-channel weights
are not supported by quantized conv transpose. In a future PR we should throw
an error when someone tries to quantize a ConvTranspose model with per-channel
weight observers until this is fixed.
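An illustrative sketch, using the API signatures of this era (the `qnnpack` engine and its per-tensor weight observer are required, per the note above):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

# quantized conv transpose does not support per-channel weights,
# so use the qnnpack engine / qconfig (per-tensor weight observer)
torch.backends.quantized.engine = "qnnpack"

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.deconv = torch.nn.ConvTranspose2d(3, 3, kernel_size=3)

    def forward(self, x):
        return self.deconv(x)

m = M().eval()
prepared = prepare_fx(m, {"": get_default_qconfig("qnnpack")})
prepared(torch.randn(1, 3, 8, 8))  # calibrate
quantized = convert_fx(prepared)
```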
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_1d
python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_2d
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25674636
fbshipit-source-id: b6948156123ed55db77e6337bea10db956215ae6