Commit Graph

90 Commits

07da584dbd Fix KeyError returned by _maybe_get_last_node_only_observer (#58443)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58443

Test Plan: arc lint

Reviewed By: vkuzo

Differential Revision: D28494119

fbshipit-source-id: 05abf4e12051afc237096812fb0ee08a8b9447f9
2021-05-18 12:41:19 -07:00
4f50fdc2a3 fx quant: refactor observer insertion
Summary:
tl;dr: rewrites the FX graph mode quantization observer insertion to be easier to understand and extend.
The key conceptual difference from before is:
* before: for each node, observers are always inserted at the output of the current node, even if they are needed for the next node. This is hard to reason about.
* after: for each node, observers are inserted at the inputs (if needed, as determined by the dtype of the argument and the dtype of the current node) and at the output (if needed for the type of pattern and qconfig).  No knowledge of future nodes is needed to insert observers for the current node.

This allows us to significantly simplify various things:
* all new observers needed for a node are inserted together.  This makes it easier to understand and debug things.  We add an invariant that node X will never change any observers inserted by any preceding or subsequent node, so to debug an issue the user can just understand what is happening for node X, without having to understand what happens before or after it.
* all the state tracking of activation_post_process_map and activation_post_process_indices is removed; instead, observers are looked up by graph traversals
* since there is no longer a need for overlapping graph passes which mutate each other's intermediate state, it is easier to understand the rules for inserting observers, and to create new rules in the future.
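The per-node scheme described above can be sketched as a toy (the function name, node representation, and dtype metadata here are hypothetical illustrations, not the PyTorch implementation):

```python
# Toy sketch of the "after" scheme: each node inserts its own input and
# output observers locally, with no knowledge of future nodes.

def insert_observers_for_node(node, node_dtypes, observed):
    """node_dtypes maps value name -> dtype it carries.
    Returns the list of observer names inserted for this node alone."""
    inserted = []
    # 1. Observe each input whose dtype differs from what this node expects.
    for arg in node["args"]:
        if node_dtypes[arg] != node["input_dtype"] and arg not in observed:
            inserted.append(f"obs_{arg}_for_{node['name']}")
            observed.add(arg)
    # 2. Observe the output if this node's pattern/qconfig requires it.
    if node["output_needs_obs"]:
        inserted.append(f"obs_{node['name']}_output")
    return inserted

node_dtypes = {"x": "float", "conv": "quint8"}
conv = {"name": "conv", "args": ["x"], "input_dtype": "quint8",
        "output_needs_obs": True}
print(insert_observers_for_node(conv, node_dtypes, set()))
# -> ['obs_x_for_conv', 'obs_conv_output']
```

Because all observers for `conv` are inserted in one place, debugging node X never requires reasoning about what earlier or later nodes did — the invariant the commit describes.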

Test Plan:
```
# all OSS tests pass
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Differential Revision: D28241864

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 950d58972d26362808564cc0a2dfb30413a3734d
2021-05-15 09:51:33 -07:00
7c3a30fd79 fx quant: remove matching hack for binary qhandler (#57470)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57470

Removes the earlier hack of switching patterns originally matched
to BinaryOpQuantizeHandler over to CopyHandler. After this PR,
each pattern can be matched to only one type of QuantizeHandler,
or to nothing.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28152909

fbshipit-source-id: afc285e770bd7eb0518c90e3ee4874c421e78bbc
2021-05-04 16:38:56 -07:00
643f41be61 fx quant: remove FixedQParamsOpQuantizeHandler from quantize.py (#57393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57393

Moves the logic which determines whether the output is quantized
based on the inputs to live on the qhandler object.  This allows
us to remove FixedQParamsOpQuantizeHandler from quantize.py,
further reducing the coupling between handler objects and the
quantization pass.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: astaff

Differential Revision: D28132414

fbshipit-source-id: 5c28524b47c00f618d3a38657376abae9e6ffe7c
2021-05-02 20:13:10 -07:00
2bd158386a fx quant: move input_output_observed to qhandler (#57388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57388

It's a bit confusing to have this be a decorator. It's simpler to
just expose it as a function on qhandler.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28129411

fbshipit-source-id: f7316f285e8546c67e8d8cf753462b2c2abb2636
2021-05-02 20:13:08 -07:00
1b20eeb138 fx quant: move output obs logic to QuantizeHandler (#57377)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57377

Moves the logic which determines
1. whether a pattern instance's output should be observed
2. whether a pattern instance's output should be marked as observed based on its inputs
3. whether to override the activation specified in the qconfig

from `quantize.py` to `quantization_patterns.py`.  This makes
the code easier to read and reduces the coupling between `Quantizer`
and `QuantizeHandler` instances.

Note: there are some further cleanups which would be good after this one
- leaving those for future PRs.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28126896

fbshipit-source-id: 94c80a9c7307452783348d65b402acc84983e3f6
2021-05-02 20:13:07 -07:00
096089abcb [quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat (#54924)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54924

Previously we were producing torch.ops.quantized.cat, which takes inputs, dequantizes them
and requantizes with new qparams. This PR changes that to produce torch.cat directly; torch.cat
will assume all inputs share the same qparams, and it will produce a quantized Tensor with
the same qparams as the inputs (because the previous PR makes sure all inputs and the output of cat share
the same observer/fakequant instance).

Using torch.cat is expected to be more efficient since it does not introduce extra quant/dequant.
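A toy numeric sketch (plain Python, not the PyTorch kernel) of why shared qparams let cat skip the quant/dequant round trip — when all inputs share (scale, zero_point), their integer representations are directly comparable, so concatenating the raw ints is already correct:

```python
# Affine quantization of a float list, simplified (no clamping).
def quantize(x, scale, zp):
    return [round(v / scale) + zp for v in x]

def cat_shared_qparams(int_tensors, scale, zp):
    # No dequant/requant: just concatenate the integer data and carry
    # the shared qparams forward, mirroring the behavior the commit
    # describes for torch.cat on quantized inputs.
    out = []
    for t in int_tensors:
        out.extend(t)
    return out, scale, zp

scale, zp = 0.1, 0
a = quantize([0.1, 0.2], scale, zp)
b = quantize([0.3], scale, zp)
out, out_scale, out_zp = cat_shared_qparams([a, b], scale, zp)
print(out, out_scale, out_zp)  # [1, 2, 3] 0.1 0
```

If the inputs had different qparams, the integer values would not be comparable and a requantization step (what torch.ops.quantized.cat did) would be unavoidable.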

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27416528

fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd
2021-04-21 10:58:09 -07:00
75024e228c Add lint for unqualified type: ignore (#56290)
Summary:
The other half of https://github.com/pytorch/pytorch/issues/56272.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56290

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI runs (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2384511062
- https://github.com/pytorch/pytorch/actions/runs/765036024

Reviewed By: seemethere

Differential Revision: D27867219

Pulled By: samestep

fbshipit-source-id: e648f07b6822867e70833e23ddafe7fb7eaca235
2021-04-21 08:07:23 -07:00
6e1fc5cef8 [quant] added dq->op->q quantization patterns for GELU and softmax ops (#56004)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56004

Added reference pattern support for GELU, softmax and bmm for int dtypes. For GELU and softmax, this consisted of adding reference patterns to the default node handler for int dtypes. Note that the GELU and softmax patterns are not registered, since they do not have a proper quantized kernel, which means they would either add unnecessary dequant and quant ops to the network or simply error. This can be circumvented with custom qconfig usage, as in test_gelu_reference.

bmm was added within binary ops, along with some significant changes to how that code is structured. Theoretically the reference pattern used for bmm could be applied to other dtypes. This was not enabled because of issues relating to Line 1323 in quantize.py. In essence, the prepare step does not know whether an op will use a reference pattern or not, so for ops that are supported with one dtype in reference mode and another dtype normally, this has the potential to cause issues. This is difficult to get around without the is_reference flag being available in the prepare step, or the discussed changes around separating

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_gelu_reference
python test/test_quantization.py TestQuantizeFxOps.test_gelu_normal
python test/test_quantization.py TestQuantizeFxOps.test_softmax_reference
python test/test_quantization.py TestQuantizeFxOps.test_softmax_normal
python test/test_quantization.py TestQuantizeFxOps.test_silu_reference
python test/test_quantization.py TestQuantizeFxOps.test_bmm_int_reference
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxModels

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D27818340

fbshipit-source-id: de65be0797035463cd2d1b0e4677d1a87f69143c
2021-04-20 13:26:15 -07:00
8fc1ca0d22 fx quant: fix prepacking for F.conv1d (#55311)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55311

Before this PR, `F.conv1d` was matched by FX graph mode quant patterns
but the prepacking was happening inline.  There was also a bug with
an argument type mismatch.

This PR fixes both issues and adds a test. Thanks jerryzh168 for the
code tip.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_functional_not_reference
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D27575422

fbshipit-source-id: 42301e23cb101a9e64e46800813bc771317e233e
2021-04-14 09:04:28 -07:00
c96b5b2a20 [quant][graphmode][fx][fix] Fix fp16 reference patterns for linear (#55727)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55727

The number of dequantize ops for the fp16 reference pattern was incorrect
before; this PR fixes the problem.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27713390

fbshipit-source-id: 72b8d4cda0bdcea74abe27a76f918d1b47819b01
2021-04-13 23:19:45 -07:00
4d449f915f [quant][graphmode][fx] Separate handling Copy operator to a helper function (#54644) (#55429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429

Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.

Test Plan:
Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27609972

fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
2021-04-08 22:12:24 -07:00
8eaa4a97b7 Back out "[quant][graphmode][fx] Separate handling Copy operator to a helper function" (#55388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388

temporarily revert D27314678 (c57541ce06), it appears to cause a perf regression that makes quantization of some models take too long to complete tests.

Reviewed By: houseroad

Differential Revision: D27583809

fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
2021-04-06 14:20:36 -07:00
c57541ce06 [quant][graphmode][fx] Separate handling Copy operator to a helper function (#54644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644

Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27314678

fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
2021-03-31 17:50:32 -07:00
c0d6dbdce4 [quant][fx][graphmode][refactor] Change activation_post_process_map to track the observer name instead (#54643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54643

A refactor needed for future changes.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27314677

fbshipit-source-id: 972fbfb506f86da13f8817b3eaa5e6d0ad16ffe1
2021-03-31 17:50:30 -07:00
55544cb13a [quant][graphmode][fx] Add support for one value being quantized with different qconfigs (#53586)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586

Previously one value could only be quantized to one dtype. This PR adds support for quantizing one value
in the fx graph with multiple dtypes, e.g. first to int8 and then to float16.

We might do some followup PRs to clean up the hacks and refactor the code.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26912676

fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
2021-03-31 17:48:50 -07:00
4884a6ab51 fx quant: clean up names of quantize handlers (#53614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614

Ensures that every subclass of `QuantizeHandler` has a clear name.  This
prevents ambiguous names like `Cat`, which look like a module but are
really a quantize handler.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26914784

fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
2021-03-12 07:43:53 -08:00
279b5372ab [not for land] fix fx quant for quant_layer -> stack -> sum (#53196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196

Before this PR, code patterns like this did not work:

```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```

The reason this did not work is that `torch.sum` is treated as
"quantized" because of the newly added fp16 support, even though it is
not actually "quantized" for models where fp16 is not used.  We may
need to adjust the concept of "quantized vs non-quantized" into a
"dtype" for the longer term fix.

The current PR is a hacky fix to unblock.  We will need to clean things
up before this is landable.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26783960

fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
2021-03-12 07:43:50 -08:00
ccab6680d5 [not for land yet] hacky fix for x.ndim followed by sub (#53120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120

Currently there is a pattern which is not handled correctly by
FX graph mode quantization:

```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc
    x = torch.sub(x, ndim)
    return x
```

The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor

For now, we apply a bandaid hack to unblock some teams; none of this is intended to
land.  We will have to think of a better fix which is landable (TBD).
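A minimal sketch (purely illustrative, not the actual PyTorch fix; the node dicts and `produces_tensor` flag are hypothetical) of the underlying idea — observer insertion must skip values that are not Tensors, such as the int produced by `x.ndim`:

```python
# Toy graph: a getattr node producing an int, and a call_function node
# producing a Tensor. Only Tensor-producing nodes may receive observers,
# since observers record statistics of Tensor values.

def should_observe(node):
    return node["produces_tensor"]

graph = [
    {"name": "ndim", "op": "getattr", "produces_tensor": False},
    {"name": "sub", "op": "call_function", "produces_tensor": True},
]
observed = [n["name"] for n in graph if should_observe(n)]
print(observed)  # -> ['sub']
```

In the real graph this type information is not readily available (step 2 of the list above), which is exactly why the framework wrongly tried to observe `ndim` and failed.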

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26756180

fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
2021-03-12 07:42:12 -08:00
7484c56fa3 [quant][graphmode][fx] Fix a condition check for CopyNode (#53585)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585

Previously a fp16_static CopyNode would be marked as unquantized because of
an incorrect condition check for whether a Node is statically quantized.
This PR fixes that.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26912677

fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
2021-03-11 09:32:20 -08:00
f9185973d1 [quantization] Add some support for 3d operations (#50003)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002

The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003

Reviewed By: mrshenli

Differential Revision: D26325953

Pulled By: jerryzh168

fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
2021-03-10 16:40:35 -08:00
46bd76fdec [quant][graphmode][fx][fp16] Add fp16 support for silu (#52865)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52865

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_silu

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672270

fbshipit-source-id: a6a6ab58c347a56f0ded612b2e0a3e2230a91d9e
2021-03-02 02:11:29 -08:00
d40b501cfc [quant][graphmode][fx][fp16] Add fp16 support for sigmoid (#52863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52863

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672273

fbshipit-source-id: 30d5befe2a24081ac12ac773df4d2bd26d2d0192
2021-03-02 02:11:21 -08:00
3fb324f05b [quant][graphmode][fx][fp16] Add fp16 support for layer_norm (#52862)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52862

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_layer_norm

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672272

fbshipit-source-id: 4cfdce986efa98db7dc58bf2a62b650e45a69ed0
2021-03-02 02:11:17 -08:00
fc6fdade9f [quant][graphmode][fx][fp16] Add fp16 support for torch.sum (#52811)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52811

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_sum

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26655619

fbshipit-source-id: 642e0de47d0da7bd1abe1e981819de33e84c32f3
2021-03-02 02:11:13 -08:00
97c51d5d5d [quant][graphmode][fx][fp16] Add fp16 support for div (#52810)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52810

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_div

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26655620

fbshipit-source-id: e46cb895ba456e99e4433bd6037229b8248a1b28
2021-03-02 02:11:08 -08:00
a6af93e921 [quant][graphmode][fx][fp16] Add fp16 support for sub (#52809)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52809

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_sub

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26655618

fbshipit-source-id: b47966ee1b75a2f814b9019d8d16b2da2212f5da
2021-03-02 02:09:07 -08:00
991160ebd9 [quant][graphmode][fx] Add support for fp16 bmm pattern (#52808) (#53021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53021

Add support for producing fp16 bmm pattern

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_bmm

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26725349

fbshipit-source-id: debee718fc33e562aff3f5664757bb52ee85f651
2021-03-01 14:45:55 -08:00
096bea5251 [reland][quant][graphmode][fx][fp16] Add fp16 support for {add|mul}{_relu} (#52714) (#53019)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53019

Test Plan:
python test/test_quantization.py TestQuantizedOps.test_add
python test/test_quantization.py TestQuantizedOps.test_mul
python test/test_quantization.py TestQuantizedOps.test_add_relu
python test/test_quantization.py TestQuantizedOps.test_mul_relu

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26725350

fbshipit-source-id: 2a89f5da6a21908f454f870521d2a4549fdd291e
2021-03-01 13:19:42 -08:00
312b297b82 Revert D26626092: [quant][graphmode][fx][fp16] Add fp16 support for {add|mul}{_relu}
Test Plan: revert-hammer

Differential Revision:
D26626092 (2962fbb03c)

Original commit changeset: 91d040efa51e

fbshipit-source-id: cc6bcc0f451d6adcd7bf7572451e6e3cd6ad59d1
2021-03-01 04:52:47 -08:00
3a024a7ae2 Revert D26655616: [quant][graphmode][fx] Add support for fp16 bmm pattern
Test Plan: revert-hammer

Differential Revision:
D26655616 (2c44b256d8)

Original commit changeset: 1d0639303e5c

fbshipit-source-id: 403429c706c8a9e6a657669daf8aadf282025f83
2021-03-01 04:50:35 -08:00
2c44b256d8 [quant][graphmode][fx] Add support for fp16 bmm pattern (#52808)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52808

Add support for producing fp16 bmm pattern

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_bmm

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26655616

fbshipit-source-id: 1d0639303e5ca2ca4ceae08d03ebc3b25256de57
2021-02-28 16:48:41 -08:00
2962fbb03c [quant][graphmode][fx][fp16] Add fp16 support for {add|mul}{_relu} (#52714)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52714

Test Plan:
python test/test_quantization.py TestQuantizedOps.test_add
python test/test_quantization.py TestQuantizedOps.test_mul
python test/test_quantization.py TestQuantizedOps.test_add_relu
python test/test_quantization.py TestQuantizedOps.test_mul_relu

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26626092

fbshipit-source-id: 91d040efa51e9c955eb688ec16a30f0c12233958
2021-02-27 22:12:10 -08:00
0818dbf49d [quant][refactor] Merge add and mul handler (#52651)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651

Merging them for easier extensions to fp16 and more binary ops

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26600118

fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
2021-02-27 19:56:32 -08:00
b685864f50 [quant][graphmode][fx] Add reference option support for linear_static_fp16 (#52650)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52650

linear_dynamic_fp16 has the following dtypes for (activation, weight, bias, output):
(fp32, fp16, fp32, fp32)

linear_static_fp16 has the following dtypes:
(fp16, fp16, fp16, fp16)
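The two configs can be written out as dtype tuples for comparison (the dict and helper below are illustrative, not the PyTorch qconfig API):

```python
# (activation, weight, bias, output) dtypes for the two linear fp16 configs
# described in this commit.
LINEAR_FP16_CONFIGS = {
    "linear_dynamic_fp16": ("fp32", "fp16", "fp32", "fp32"),
    "linear_static_fp16":  ("fp16", "fp16", "fp16", "fp16"),
}

def is_static(config_name):
    # A config is "static" when activations arrive already in fp16;
    # dynamic quantization converts fp32 activations at runtime instead.
    act, _weight, _bias, _out = LINEAR_FP16_CONFIGS[config_name]
    return act == "fp16"

print(is_static("linear_dynamic_fp16"), is_static("linear_static_fp16"))
# -> False True
```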

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26599803

fbshipit-source-id: b4a8345d355125070be718a227288cc848cc8bbc
2021-02-27 08:25:44 -08:00
177694681e [quant][graphmode][fx] Add reference option support for linear_dynamic_fp16 (#52534)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534

Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26557726

fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
2021-02-26 21:12:22 -08:00
626756ac39 [quant][graphmode][api] debug --> reference (#52179)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179

Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26424656

fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
2021-02-19 14:20:01 -08:00
0c0de542be [quant][graphmode][fx] Guard the supported quantization type for add/mul (#52413)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52413

TODO: We'll need to add this guard for other ops as well

(Note: this ignores all push blocking failures!)

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_mul_add_fp16_config

Imported from OSS

Reviewed By: supriyar

Differential Revision: D26503348

fbshipit-source-id: 5aaba518742a516cc3521fd5f23f1a264d2973e2
2021-02-19 12:56:22 -08:00
fb9f89507a [quant][graphmode][fx] Fix fp16 dynamic quant for functional linear (#52369)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52369

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D26491425

fbshipit-source-id: d2c2a70bf1bc43ac2b63ac4cf9ae9c07887f12e9
2021-02-18 23:05:30 -08:00
916af892b3 [quant][fx] Update name of packed weight attributes (#51259)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51259

Store the FQN of the module that is using the packed weights (the quantized op).

In the case of fusion, we update the scope mapping to store the module path of the fused node.

Test Plan:
python test/test_quantization.py test_packed_weight_fused_op

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26117964

fbshipit-source-id: 9d929997baafb1c91063dd9786a451b0040ae461
2021-01-28 20:31:08 -08:00
7097c0d4f3 [quant][graphmode][fx] Add support for functional conv1d and conv3d (#51155) (#51254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51254

This PR added support for quantizing functional conv1d, conv3d,  conv1d_relu and conv3d_relu

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_functional_conv

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26116172

fbshipit-source-id: 56e7d799c11963fe59ee3a1b6eb23f52007b91dc
2021-01-28 14:32:32 -08:00
288b94a8ee [quant][fx] Make scale, zero_point buffers in the model, use FQN (for quantize_per_tensor ops) (#51171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171

Following up on the previous PR, this PR makes the scale and zero_point for quantize_per_tensor
registered as buffers in the module.
Currently the dtype is still stored as an attr (not registered as a buffer) since we can only register tensor types.

Test Plan:
python test/test_quantization.py test_qparams_buffers

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26092964

fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
2021-01-28 08:35:46 -08:00
4c3f59b70e [quant][fx] Make scale, zero_point buffers in the model and use FQN (for quantized ops) (#51166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166

Currently scale and zero_point values are stored as constant values in the graph.
This prevents these values from being updated in the graph and also does not enable saving
them to state_dict.

After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.

We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
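A toy sketch of the FQN-based naming scheme described above (the `Root` class and `register_qparams` helper are hypothetical stand-ins, not the PyTorch implementation, which uses `register_buffer` on the root `nn.Module`):

```python
# Qparams are stored as uniquely named attributes on the root module so
# each quantized op can find its own scale/zero_point and so the values
# land in state_dict.

class Root:
    def __init__(self):
        self._buffers = {}

    def register_qparams(self, module_fqn, scale, zero_point):
        # Prefix with the owning module's FQN so attribute names are
        # unique and map back to the op that owns them.
        prefix = module_fqn.replace(".", "_")
        self._buffers[f"{prefix}_scale"] = scale
        self._buffers[f"{prefix}_zero_point"] = zero_point

root = Root()
root.register_qparams("sub.linear1", 0.05, 128)
root.register_qparams("sub.linear2", 0.02, 0)
print(sorted(root._buffers))
```

Because the names are derived from the FQN, two quantized ops in different submodules can never collide, and each buffer can be updated or serialized independently of the graph.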

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26092965

fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
2021-01-28 08:35:42 -08:00
f7e90cf311 Revert D26089965: [quant][graphmode][fx] Add support for functional conv1d and conv3d
Test Plan: revert-hammer

Differential Revision:
D26089965 (dd1a97b3ae)

Original commit changeset: 4aea507d05b7

fbshipit-source-id: f54184cafb9dd07858683489d8bd147474e7e4b3
2021-01-27 13:27:10 -08:00
dd1a97b3ae [quant][graphmode][fx] Add support for functional conv1d and conv3d (#51155)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51155

This PR added support for quantizing functional conv1d, conv3d,  conv1d_relu and conv3d_relu

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_functional_conv

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26089965

fbshipit-source-id: 4aea507d05b744807e993f6d3711ab308fb7591b
2021-01-27 12:00:35 -08:00
d3ec204ef2 [quant][graphmode][fx] Add functional conv2d + relu (#51079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51079

Added support for functional conv2d + relu, will add conv1d and conv3d in future PR

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_functional_conv

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26089964

fbshipit-source-id: 8703de17de1469f7076651c386c83fb5922a56eb
2021-01-27 11:20:55 -08:00
28869d5a80 [quant][graphmode][fx] Add support for quantizing functional linear + {functional relu/module relu} (#50975)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50975

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D26032532

fbshipit-source-id: a084fb4fd711ad52b2da1c6378cbcc2b352976c6
2021-01-25 12:49:58 -08:00
f6f0fde841 [reland][quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs (#49754) (#50058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058

This PR adds the support for {input/output}_quantized_idxs for standalone module.

if input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module will expect float
input and produce float output, and will quantize the input and dequantize the output internally

if input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module will expect quantized
input and produce quantized output; the input will be quantized in the parent module, and the output will be dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d

For more details, please see the test case
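The boundary semantics of the two cases can be sketched as follows (the `boundary_plan` helper and its string labels are illustrative only, not the PyTorch API):

```python
# For each input/output index: if it appears in the *_quantized_idxs list,
# the parent module handles the quant/dequant at the boundary; otherwise
# the standalone module converts internally and floats cross the boundary.

def boundary_plan(num_inputs, input_quantized_idxs, output_quantized_idxs):
    plan = {}
    for i in range(num_inputs):
        plan[f"input_{i}"] = ("parent quantizes" if i in input_quantized_idxs
                              else "module quantizes internally")
    plan["output_0"] = ("parent dequantizes" if 0 in output_quantized_idxs
                        else "module dequantizes internally")
    return plan

print(boundary_plan(1, [], []))    # float in/out, conversions inside
print(boundary_plan(1, [0], [0]))  # quantized in/out, conversions in parent
```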

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25768910

fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
2021-01-05 20:27:46 -08:00
ea558b2135 fx quant: hook up ConvTranspose{n}d (#49717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49717

Quantization of `ConvTranspose{n}d` is supported in Eager mode. This PR
adds the support for FX graph mode.

Note: this currently only works with `qnnpack` because per-channel weights
are not supported by quantized conv transpose. In a future PR we should throw
an error when someone tries to quantize a ConvTranspose model with per-channel
weight observers until this is fixed.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_1d
python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_2d
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25674636

fbshipit-source-id: b6948156123ed55db77e6337bea10db956215ae6
2020-12-28 14:27:07 -08:00
46cf6d332f Revert D25684692: [quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs
Test Plan: revert-hammer

Differential Revision:
D25684692 (89b4899ea5)

Original commit changeset: 900360e01c0e

fbshipit-source-id: 8b65fa8fbc7b364fbddb5f23cc696cd9b7db98cd
2020-12-24 15:50:52 -08:00