Summary:
This is a follow up to https://github.com/pytorch/pytorch/pull/118605 to remove `fold_quantize` flag from
`convert_pt2e`
Test Plan: CI
Differential Revision: D53247301
BC Breaking Note:
The `fold_quantize` flag of `convert_pt2e` now defaults to True, so the quantize op is folded into the weight by default and users will see a model size reduction after pt2e quantization.
2.2
```
folded_model = convert_pt2e(model, fold_quantize=True)
non_folded_model = convert_pt2e(model)
```
2.3
```
folded_model = convert_pt2e(model)
non_folded_model = convert_pt2e(model, fold_quantize=False)
```
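As a rough illustration of why folding shrinks the model: once the quantize op is folded, the weight can be stored as int8 values instead of fp32 values. The sketch below is a toy in pure Python, not PyTorch's implementation; `quantize_per_tensor` here, the scale, and the sample weights are all hypothetical.

```python
from array import array


def quantize_per_tensor(values, scale, zero_point=0, qmin=-128, qmax=127):
    """Toy affine quantization: q = clamp(round(x / scale) + zero_point)."""
    return array(
        "b",
        (max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values),
    )


# Hypothetical fp32 weight and scale, chosen only for illustration.
fp32_weight = array("f", [0.5, -1.0, 0.25, 1.27])
scale = 0.01

# "Folding" here means storing the already-quantized weight.
int8_weight = quantize_per_tensor(fp32_weight, scale)

fp32_bytes = fp32_weight.itemsize * len(fp32_weight)
int8_bytes = int8_weight.itemsize * len(int8_weight)
print(f"fp32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
```

The stored weight payload is about four times smaller in this toy case; the actual saving in a real model depends on which weights are quantized.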
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118701
Approved by: https://github.com/andrewor14, https://github.com/leslie-fang-intel
This is a lot of files changed! Don't panic! Here's how it works:
* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, this means that whenever we import a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in .lintrunner.toml, minus the files matched by its exclude list.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.
In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.
The codemod was done with this script authored by GPT-4:
```
import glob
exclude_patterns = [
...
]
for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)
```
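A possible variant of the script above, sketched under the assumption that re-running the codemod should be safe: it skips files that already start with the directive. The helper names and the temporary demo file are hypothetical and were not part of the original codemod.

```python
import glob
import os
import tempfile

DIRECTIVE = "# mypy: ignore-errors"


def add_ignore_directive(filepath):
    """Prepend the directive unless the file already starts with it."""
    with open(filepath, "r+") as f:
        content = f.read()
        if content.startswith(DIRECTIVE):
            return False
        f.seek(0)
        f.write(DIRECTIVE + "\n\n" + content)
    return True


def apply_codemod(patterns):
    """Return the list of files that were actually modified."""
    changed = []
    for pattern in patterns:
        for filepath in sorted(glob.glob(pattern, recursive=True)):
            if filepath.endswith(".py") and add_ignore_directive(filepath):
                changed.append(filepath)
    return changed


# Demo on a throwaway directory: the second run changes nothing.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "pkg", "mod.py")
    os.makedirs(os.path.dirname(sample))
    with open(sample, "w") as f:
        f.write("x = 1\n")
    pattern = os.path.join(tmp, "**", "*.py")
    first_run = apply_codemod([pattern])
    second_run = apply_codemod([pattern])
    with open(sample) as f:
        result = f.read()
```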
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
Summary: fixed an import problem for test_xnnpack_quantizer so that it can run in CI
Test Plan:
internal CI
sanity check: buck2 test 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- --exact 'caffe2/test/quantization:test_quantization - test_conv2d (caffe2.test.quantization.pt2e.test_xnnpack_quantizer.TestXNNPACKQuantizer)'
Differential Revision: D52576449
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116911
Approved by: https://github.com/mcr229
Summary:
This is a util for numeric suite in pt2 export so that we can build
a more streamlined UX for numerical debugging in quant + executorch stack
Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
Summary:
For a Node `node1` and an edge `(node1, node2)`, both observe the same
Tensor, so we may want them to share an observer implicitly. This flag allows people to
turn off this behavior for the output of the node.
See the test_allow_implicit_sharing test for a use case.
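The idea can be sketched with a toy model in pure Python. This is not the PyTorch implementation: `ToyObserver`, `assign_observers`, and the keyword argument are hypothetical names used only to show what "sharing" versus "not sharing" means for a node output and its outgoing edge.

```python
class ToyObserver:
    """Stand-in for an observer; hypothetical, it only carries a label."""

    def __init__(self, label):
        self.label = label


def assign_observers(node, users, allow_implicit_sharing=True):
    """Return (output_observer, {edge: observer}) for one node."""
    output_observer = ToyObserver(f"{node}:output")
    edge_observers = {}
    for user in users:
        edge = (node, user)
        if allow_implicit_sharing:
            # Same tensor, so the edge reuses the node's output observer.
            edge_observers[edge] = output_observer
        else:
            # Sharing turned off: the edge gets its own observer.
            edge_observers[edge] = ToyObserver(f"{node}->{user}")
    return output_observer, edge_observers


shared_out, shared_edges = assign_observers("node1", ["node2"])
separate_out, separate_edges = assign_observers(
    "node1", ["node2"], allow_implicit_sharing=False
)
```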
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_implicit_sharing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112929
Approved by: https://github.com/kimishpatel
Summary:
Previously, transitivity for shared quantization specs (see `test_shared_qspec_transitivity`) was not actually supported; this PR adds that support.
Next:
* clean up insert observer logic
* add allow_transitive_sharing boolean flag to allow people to turn this off for certain edges
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec_transitivity
Differential Revision: [D50250789](https://our.internmc.facebook.com/intern/diff/D50250789)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111172
Approved by: https://github.com/kimishpatel
Summary:
Also added annotation support for conv1d_relu and conv1d in XNNPACKQuantizer. The quantized results still
match the fx quant path (didn't quantize conv1d), so tests are not disabled
Test Plan: with-proxy buck2 run executorch/examples/quantization:example -- -m=w2l --verify
Differential Revision: D49479546
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109830
Approved by: https://github.com/kimishpatel
Summary:
Add a test to make sure we can call the quantize API multiple times
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_reentrant
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110125
Approved by: https://github.com/kimishpatel
ghstack dependencies: #110097
**Summary**
Enable quantization and lowering of `ConvTranspose3d`.
Add test cases for `ConvTranspose1d`, `ConvTranspose2d` and `ConvTranspose3d` since there were no such test cases.
**Test plan**
python test/test_quantization.py -k test_conv_transpose_not_reference
python test/test_quantization.py -k test_conv_transpose_reference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97125
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary:
Changes the PNP test cases to use QNNPACK. The only reason is that
I'm switching to Mac M1 as my primary machine, which supports QNNPACK
but not fbgemm, and it's convenient for me to be able to run these
locally.
PNP itself is not backend specific, so it does not matter which backend
the functionality is tested on.
Test plan:
```
python test/test_quantization.py -k NShadows
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91421
Approved by: https://github.com/jerryzh168
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused `ConvAdd2d` module for the onednn backend, to be used for int8 inference with that backend. Calling this module with any other quantization backend throws an error.
**Test plan**
```
python -m pytest test_quantization.py -k test_conv2d_add
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91152
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused `linear-leaky_relu` op for the `onednn` backend, to be used for int8 inference with that backend. Calling this op with any other quantization backend throws an error.
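A minimal sketch of the fusion idea, in plain Python and ignoring quantization and onednn entirely: the unfused path materializes the full linear output before applying leaky ReLU, while the fused path applies the activation to each accumulator as it is produced. All function names and sample values below are hypothetical.

```python
def linear(x, weight, bias):
    """Unfused step 1: materialize the full linear output."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def leaky_relu(values, negative_slope=0.01):
    """Unfused step 2: a second pass over the intermediate result."""
    return [v if v >= 0 else negative_slope * v for v in values]


def fused_linear_leaky_relu(x, weight, bias, negative_slope=0.01):
    """Fused: apply the activation on each accumulator, no intermediate list."""
    out = []
    for row, b in zip(weight, bias):
        acc = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(acc if acc >= 0 else negative_slope * acc)
    return out


x = [1.0, -2.0]
weight = [[0.5, 0.25], [-1.0, 1.0]]
bias = [0.0, 0.5]

unfused = leaky_relu(linear(x, weight, bias))
fused = fused_linear_leaky_relu(x, weight, bias)
```

Both paths compute the same values; the fused one avoids writing and re-reading the intermediate output, which is the data movement the summary refers to.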
**Test Plan**
python test_quantization.py TestQuantizedLinear
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88478
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Context: In order to avoid cluttering the `torch.nn` namespace,
the quantized modules namespace is moved to `torch.ao.nn`.
The list of the `nn.quantized` files that are being migrated:
- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
- [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
- [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
- [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
- [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
- [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
- [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
- [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
- [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
- [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
- [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
- [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`
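The renames in the list above all follow one pattern, which can be sketched as a small path-rewriting helper. This is only an illustration of the mapping, not a tool shipped with PyTorch; `migrate_module_path` and `MIGRATED_PREFIXES` are hypothetical names, and the helper describes the target of the migration rather than which entries have already landed.

```python
# Top-level namespaces named in the migration list above.
MIGRATED_PREFIXES = (
    "torch.nn.quantized",
    "torch.nn.quantizable",
    "torch.nn.qat",
    "torch.nn.intrinsic",
)


def migrate_module_path(path):
    """Map an old `torch.nn.*` quantization path to its `torch.ao.nn.*` target."""
    for prefix in MIGRATED_PREFIXES:
        # Match the namespace itself or any submodule, but not look-alike names.
        if path == prefix or path.startswith(prefix + "."):
            return "torch.ao.nn" + path[len("torch.nn"):]
    return path


print(migrate_module_path("torch.nn.quantized.dynamic"))
```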
The majority of the files are just moved to the new location.
However, specific files need to be double checked:
- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a
Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
Context: In order to avoid cluttering the `torch.nn` namespace,
the quantized modules namespace is moved to `torch.ao.nn`.
The list of the `nn.quantized` files that are being migrated:
- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
- [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
- [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
- [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
- [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
- [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
- [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
- [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
- [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
- [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
- [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
- [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`
The majority of the files are just moved to the new location.
However, specific files need to be double checked:
- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a
Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
Summary: This PR removes the is_reference flag from the existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:
```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```
```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    convert_to_reference,
)
prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```
Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066
Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:
- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig
This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:
```
from torch.ao.quantization.fx.custom_config import (
    ConvertCustomConfig,
    PrepareCustomConfig,
)
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

# float_class, observed_class, quantized_class, class1, class2, model,
# qconfig_mapping, example_inputs, and data are placeholders supplied by the user.
prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
model(data)  # calibrate
model = convert_fx(model, convert_custom_config=convert_custom_config)
```
For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.
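The backwards-compatibility behavior described above can be sketched as follows. This is a toy in pure Python, not the actual PyTorch code: `ToyCustomConfig`, `normalize_custom_config`, and the `"preserved_attributes"` dict key are hypothetical stand-ins that only illustrate the pattern of chainable setters plus a dict-to-object conversion that warns.

```python
import warnings


class ToyCustomConfig:
    """Hypothetical config object with a chainable setter."""

    def __init__(self):
        self.preserved_attributes = []

    def set_preserved_attributes(self, attributes):
        self.preserved_attributes = list(attributes)
        return self  # returning self is what enables method chaining

    @classmethod
    def from_dict(cls, config_dict):
        return cls().set_preserved_attributes(
            config_dict.get("preserved_attributes", [])
        )


def normalize_custom_config(config):
    """Accept None, a dict (deprecated), or a ToyCustomConfig."""
    if config is None:
        return ToyCustomConfig()
    if isinstance(config, dict):
        warnings.warn(
            "Passing a dict is deprecated; pass a ToyCustomConfig instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return ToyCustomConfig.from_dict(config)
    return config


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from_dict_config = normalize_custom_config(
        {"preserved_attributes": ["attr1", "attr2"]}
    )

from_object_config = normalize_custom_config(
    ToyCustomConfig().set_preserved_attributes(["attr1", "attr2"])
)
```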
Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)
Approved by: https://github.com/jerryzh168