Commit Graph

23 Commits

c98896b76f [quant][pt2e] Add more precise representation for quantized add (#104130)
Summary:
The planned end-to-end flow for quantization in PyTorch 2.0 export is the following:

float_model -> prepare_pt2e -> calibration -> convert_pt2e -> ...

inside convert_pt2e, we will first produce a q/dq representation of the quantized model, similar to the previous output of
convert_to_reference_fx in FX graph mode quantization:

```
torch.ops.quantized_decomposed.dequantize_per_tensor -> torch.ops.aten.add -> torch.ops.quantized_decomposed.quantize_per_tensor
torch.ops.quantized_decomposed.dequantize_per_tensor   /
```

Then we'll rewrite the above into a representation that expresses the intent more precisely: here we actually want to do
int8 addition, rather than simulate it with fp32 operations. The representation for quantized add is:

```
def quantized_add(x_i8, x_scale, x_zero_point, y_i8, y_scale, y_zero_point, out_scale, out_zero_point):
    # rescale both int8 inputs into the output scale
    x = (x_scale / out_scale) * x_i8
    y = (y_scale / out_scale) * y_i8
    out = x + y
    # remove both inputs' zero-point contributions (note the sum: each
    # dequantized input contributes -zero_point * scale to the fp32 result)
    out -= (x_zero_point * x_scale + y_zero_point * y_scale) / out_scale
    out += out_zero_point
    return out
```
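
For illustration, a quick numerical sanity check of this representation (a sketch, not part of the PR; the `q`/`dq` helpers below are hypothetical) against the naive dequantize -> fp32 add -> quantize path:

```
import torch

def dq(x_i8, scale, zp):
    # dequantize: int8 -> fp32
    return (x_i8.to(torch.float32) - zp) * scale

def q(x_fp32, scale, zp):
    # quantize: fp32 -> int8, with rounding and clamping to the int8 range
    return torch.clamp(torch.round(x_fp32 / scale) + zp, -128, 127).to(torch.int8)

x_i8 = torch.randint(-128, 128, (16,), dtype=torch.int8)
y_i8 = torch.randint(-128, 128, (16,), dtype=torch.int8)
x_scale, x_zp = 0.5, 1
y_scale, y_zp = 0.25, -2
out_scale, out_zp = 0.75, 3

# reference path: dequantize both inputs, add in fp32, requantize
expected = q(dq(x_i8, x_scale, x_zp) + dq(y_i8, y_scale, y_zp), out_scale, out_zp)

# the rewritten representation above, followed by rounding/clamping
out = (x_scale / out_scale) * x_i8.to(torch.float32) \
    + (y_scale / out_scale) * y_i8.to(torch.float32)
out -= (x_zp * x_scale + y_zp * y_scale) / out_scale
out += out_zp
actual = torch.clamp(torch.round(out), -128, 127).to(torch.int8)

assert torch.equal(expected, actual)
```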

Test Plan:
```
buck2 test caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_representation_add (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)'
```

Reviewed By: kimishpatel

Differential Revision: D45628032

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104130
Approved by: https://github.com/kimishpatel
2023-06-27 20:11:30 +00:00
ce8d31551b [quant][be] Change return type for zero_point to be int32 Tensor (#102234)
Summary: The previous return type was probably a typo; zero_point should be an int32 Tensor

Test Plan: CI

Reviewed By: salilsdesai

Differential Revision: D46172706

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102234
Approved by: https://github.com/salilsdesai
2023-06-01 18:30:44 +00:00
a13a63ae9a Fix typos under torch/ao directory (#97679)
This PR fixes typos in comments and messages of `.py` files under the `torch/ao` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97679
Approved by: https://github.com/janeyx99, https://github.com/kit1980
2023-04-10 22:25:15 +00:00
fc324d3485 [quant][pt2e] Add support for dynamic quantization with symmetric quant for input (#94854)
Summary:
Previously we assumed asymmetric quantization for dynamic quantization; this diff adds support for symmetric quantization
of the input in dynamic quantization
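
For intuition, a minimal sketch (not from this diff) of how symmetric input qparams are computed: the zero_point is pinned to 0 and the scale is derived from the absolute maximum:

```
import torch

def symmetric_input_qparams(x, quant_min=-128, quant_max=127, eps=1e-7):
    # zero_point is fixed at 0; the scale covers the largest absolute value,
    # so the quantized range stays centered on zero
    scale = max(x.abs().max().item() / ((quant_max - quant_min) / 2.0), eps)
    return scale, 0

x = torch.randn(2, 8)
scale, zero_point = symmetric_input_qparams(x)
x_i8 = torch.clamp(torch.round(x / scale) + zero_point, -128, 127).to(torch.int8)
```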

Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"

Reviewed By: digantdesai

Differential Revision: D43134794

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94854
Approved by: https://github.com/digantdesai
2023-02-28 19:39:31 +00:00
641dc0b844 Revert "[quant] Add quantize and dequantize operators to decomposition table (#93312)"
This reverts commit 782e4f5c02abaf5b9cdba4eaa827bc70a310bca8.

Reverted https://github.com/pytorch/pytorch/pull/93312 on behalf of https://github.com/jeanschmidt due to this commit breaking internal builds: https://fburl.com/sandcastle/dw0rqcbv
2023-02-13 09:20:37 +00:00
2628901033 [Executorch][Quant] Add Choose_qparams_symmetric (#94685)
Summary: needed for symmetric dynamic quant flow

Test Plan: todo

Reviewed By: jerryzh168

Differential Revision: D43134117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94685
Approved by: https://github.com/larryliu0820
2023-02-13 07:27:48 +00:00
782e4f5c02 [quant] Add quantize and dequantize operators to decomposition table (#93312)
Summary:
This PR decomposes the operators in the torch.ops.quantized_decomposed namespace into more
primitive aten operators. This frees us from maintaining the semantics of the quantize/dequantize
operators ourselves, since they can be expressed more precisely in terms of the underlying aten operators

Note: this PR just adds them to the decomposition table; we haven't enabled this by default yet
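
For reference, a sketch of roughly what such decompositions look like (assumed shapes, not the exact registered decompositions): each op reduces to a handful of primitive aten calls:

```
import torch

def quantize_per_tensor_decomp(x, scale, zero_point, quant_min, quant_max, dtype):
    # divide by scale, round, shift by zero_point, clamp to the quantized range, cast
    return torch.clamp(
        torch.round(x / scale) + zero_point, quant_min, quant_max
    ).to(dtype)

def dequantize_per_tensor_decomp(x, scale, zero_point, quant_min, quant_max, dtype):
    # cast up to fp32, remove the zero_point, rescale
    return (x.to(torch.float32) - zero_point) * scale
```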

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_q_dq_decomposition

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93312
Approved by: https://github.com/vkuzo, https://github.com/SherlockNoMad
2023-02-10 01:40:12 +00:00
bb48d90b00 [Executorch][Quant][BE] Refactor Choose_Qparams (#94338)
Summary: Refactor so that it can be decomposed

Test Plan: ci

Differential Revision: D42681268

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94338
Approved by: https://github.com/jerryzh168
2023-02-09 01:20:17 +00:00
3a5a762443 Revert "[quant] Add quantize and dequantize operators to decomposition table (#93312)"
This reverts commit 3fd46a2f9c56c692b242727cb146cfd464210c6a.

Reverted https://github.com/pytorch/pytorch/pull/93312 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it breaks trunk due to a landrace 3fd46a2f9c.  Please rebase and re-land it
2023-02-08 18:29:10 +00:00
3fd46a2f9c [quant] Add quantize and dequantize operators to decomposition table (#93312)
Summary:
This PR decomposes the operators in the torch.ops.quantized_decomposed namespace into more
primitive aten operators. This frees us from maintaining the semantics of the quantize/dequantize
operators ourselves, since they can be expressed more precisely in terms of the underlying aten operators

Note: this PR just adds them to the decomposition table; we haven't enabled this by default yet

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_q_dq_decomposition

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93312
Approved by: https://github.com/vkuzo, https://github.com/SherlockNoMad
2023-02-08 17:26:01 +00:00
c0dd9b3b67 Revert "[Executorch][Quantization][BE] Refactor Choose Qparams (#92592)"
This reverts commit 59071ab1e71891d480ab77af0d619bc5e01094c2.

It breaks `quantization.jit.test_ondevice_quantization.TestOnDeviceDynamicPTQFinalize`, which is not run in OSS, but is mandatory for internal CI.
2023-01-23 09:13:02 -08:00
59071ab1e7 [Executorch][Quantization][BE] Refactor Choose Qparams (#92592)
Summary: Should hopefully be a little faster. Definitely cleaner to not create an observer inside the op
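
A sketch of the idea (not this diff's code): derive scale/zero_point directly from the tensor's min/max instead of constructing an observer module inside the op:

```
import torch

def choose_qparams_sketch(x, quant_min=-128, quant_max=127, eps=1e-7):
    # extend the observed range to include 0 so that 0.0 is exactly representable
    min_val = min(x.min().item(), 0.0)
    max_val = max(x.max().item(), 0.0)
    scale = max((max_val - min_val) / (quant_max - quant_min), eps)
    zero_point = int(round(quant_min - min_val / scale))
    zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point
```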

Test Plan: ci

Differential Revision: D42154677

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92592
Approved by: https://github.com/jerryzh168
2023-01-20 01:36:47 +00:00
2a23dfe8ed [quant] Support lowering for quantized embedding byte operator (#91159)
Summary: This PR adds lowering for the quantized embedding byte operator in the executorch flow

Test Plan: buck run executorch/exir/tests:quant_fusion_pass -- "executorch.exir.tests.test_quant_fusion_pass.TestQuantFusionPass.test_embedding_byte"

Reviewed By: qihqi

Differential Revision: D41673139

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91159
Approved by: https://github.com/vkuzo
2022-12-21 22:52:24 +00:00
bd94ee66ea [quantized] [executorch] typo (#89960)
Summary: Fixes a typo that resulted in an inefficient implementation in Python

Test Plan: buck2 test mode/dev //caffe2/test/quantization:test_quantization -- --exact 'caffe2/test/quantization:test_quantization - test_quantized_embedding_byte (caffe2.test.quantization.core.test_quantized_tensor.TestQuantizedTensor)'

Differential Revision: D41627744

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89960
Approved by: https://github.com/jerryzh168
2022-12-16 19:49:09 +00:00
94b9bb324f [quant] Add example for lowering quantized dynamic linear pattern through delegation (#90640)
Summary: Only the pattern part; the delegation example is left to Chen

Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"

Reviewed By: cccclai

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90640
Approved by: https://github.com/cccclai
2022-12-13 00:57:33 +00:00
a747326423 Add manual meta implementations to quantize_per_tensor.tensor and co (#89958)
When you are writing a meta function, you cannot call item() on the tensor, because there is no real data on the tensor and the call will fail. The error message was not very good in this case; see also https://github.com/pytorch/pytorch/issues/89959

This PR takes a brute force approach to resolving the problem: just manually define meta implementations for the naughty functions that are calling item(). However, this results in a lot of code duplication. The easiest way to avoid this situation is to rewrite the decomps so they don't call item. It should not be that difficult to use direct tensors on your operations, as scalar tensors can broadcast too.

I could only test this with `buck test @mode/opt -c python.package_style=inplace //executorch/backends/test:test_backends` internally with D41555454. Test coverage needs to be improved, otherwise don't blame us when we break you.
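
A hypothetical sketch of such a manual meta kernel (the overload name and registration pattern are assumptions): it only propagates shape/dtype metadata and never reads data, so there is no item() call:

```
import torch
from torch.library import Library, impl

# attach an implementation for the Meta dispatch key to an existing op
# in the quantized_decomposed namespace ("IMPL" extends an already-defined op)
lib = Library("quantized_decomposed", "IMPL")

@impl(lib, "quantize_per_tensor.tensor", "Meta")
def quantize_per_tensor_tensor_meta(input, scale, zero_point, quant_min, quant_max, dtype):
    # meta kernels compute only metadata: same shape, quantized dtype,
    # and crucially no scale.item() / zero_point.item()
    return torch.empty_like(input, dtype=dtype)
```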

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89958
Approved by: https://github.com/jerryzh168
2022-12-01 06:04:37 +00:00
9e4a25c731 [quant][decomposed] Add support for int32 for decomposed q/dq ops (#89881)
Summary:
As titled: add int32 support for the decomposed quantize/dequantize ops

Test Plan:
python test/test_quantization.py -k test_decomposed_quantize_per_tensor
python test/test_quantization.py -k test_decomposed_dequantize_per_tensor

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89881
Approved by: https://github.com/cccclai
2022-11-30 21:24:00 +00:00
b7483be06a [quant][docs] Add docstrings for operators defined in torch.ops.quantized_decomposed namespace (#89547)
Summary:
no functionality changes

Test Plan:
NA

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89547
Approved by: https://github.com/vkuzo
2022-11-23 20:40:53 +00:00
29742786f3 [quant] Add dequantize_per_channel in quantized_decomposed op library (#89269)
Summary:
As titled: add dequantize_per_channel to the quantized_decomposed op library

Test Plan:
python test/test_quantization.py -k test_decomposed_dequantize_per_channel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89269
Approved by: https://github.com/vkuzo
2022-11-23 04:25:25 +00:00
391b593ca2 [quant] Add quantize_per_channel in quantized_decomposed op library (#89268)
Summary:
As titled: add quantize_per_channel to the quantized_decomposed op library
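
For reference, a sketch (with assumed argument order) of what per-channel quantization computes: one scale/zero_point per slice along `axis`, broadcast against the input:

```
import torch

def quantize_per_channel_sketch(x, scales, zero_points, axis, quant_min, quant_max, dtype):
    # reshape the qparams so they broadcast along the chosen axis
    shape = [1] * x.dim()
    shape[axis] = -1
    s = scales.reshape(shape)
    zp = zero_points.reshape(shape)
    return torch.clamp(torch.round(x / s) + zp, quant_min, quant_max).to(dtype)

w = torch.randn(8, 4)
scales = w.abs().amax(dim=1) / 127.0     # one scale per row (axis=0)
zps = torch.zeros(8, dtype=torch.int32)  # symmetric: zero_point = 0
w_i8 = quantize_per_channel_sketch(w, scales, zps, 0, -128, 127, torch.int8)
```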

Test Plan:
python test/test_quantization.py -k test_decomposed_quantize_per_channel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89268
Approved by: https://github.com/vkuzo
2022-11-22 22:40:11 +00:00
6f4f69f54d [Executorch] [Quantization] New pattern for dynamic dequant (#89236)
Summary: The op exposed should be choose_qparams, and since we have concerns about prims not being supported, make q and dq ops that take the qparams in as tensors
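
A hedged sketch of the resulting pattern (the `.tensor` overload names and argument layout are assumptions): the qparams are produced at runtime as 0-dim tensors and fed to q/dq variants that accept tensors rather than Python scalars:

```
import torch

x = torch.randn(2, 8)

# compute asymmetric qparams at runtime as 0-dim tensors (dynamic quantization)
min_val = torch.clamp(x.min(), max=0.0)
max_val = torch.clamp(x.max(), min=0.0)
scale = (max_val - min_val) / 255.0
zero_point = (-128 - torch.round(min_val / scale)).to(torch.int64)

# assumed overloads: the ".tensor" q/dq variants take scale/zero_point as Tensors
x_i8 = torch.ops.quantized_decomposed.quantize_per_tensor.tensor(
    x, scale, zero_point, -128, 127, torch.int8
)
x_fp = torch.ops.quantized_decomposed.dequantize_per_tensor.tensor(
    x_i8, scale, zero_point, -128, 127, torch.int8
)
```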

Test Plan: unit test

Differential Revision: D41382580

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89236
Approved by: https://github.com/jerryzh168
2022-11-18 04:13:05 +00:00
7f55db4fb0 add quantize_decomposed_dynamic to op lib (#88855)
Summary: Needed for dynamic quant reference pattern graphs.

Test Plan: added unittest

Differential Revision: D41205030

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88855
Approved by: https://github.com/jerryzh168
2022-11-16 16:59:36 +00:00
7ab6f56ca7 [quant][core] Add quantize/dequantize ops for decomposed quantized Tensor representation (#87093)
Summary:
Added q/dq implementations for the out-of-core (decomposed) quantized Tensor representation: instead of storing
quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we store them in the
arguments of the operators.
```
quantize(float32_tensor, scale, zero_point, dtype) -> int8_tensor
dequantize(int8_tensor, scale, zero_point, dtype) -> float32_tensor
```
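
A hedged usage sketch (the quant_min/quant_max arguments are an assumption about the exact schema): the qparams travel as plain operator arguments, and a quantize/dequantize roundtrip recovers the input up to half a scale of error:

```
import torch

x = torch.randn(4)
scale, zero_point = 0.05, 10

x_i8 = torch.ops.quantized_decomposed.quantize_per_tensor(
    x, scale, zero_point, -128, 127, torch.int8
)
x_fp = torch.ops.quantized_decomposed.dequantize_per_tensor(
    x_i8, scale, zero_point, -128, 127, torch.int8
)
# the roundtrip error is bounded by half the scale (ignoring clamping)
assert torch.all((x - x_fp).abs() <= scale / 2 + 1e-6)
```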

Test Plan:
python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize
python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87093
Approved by: https://github.com/dzdang, https://github.com/z-a-f
2022-10-25 23:50:41 +00:00