Summary:
This is a util for numeric suite in pt2 export so that we can build
a more streamlined UX for numerical debugging in quant + executorch stack
Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
Summary: Previously the PT2 QAT code only supported conv2d-bn.
This commit extends all existing QAT fusion support to conv1d-bn,
including support for all variants like relu, no bias, literal
args, cuda etc. This commit also refactors the code such that
we can support conv3d-bn easily in the future.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT_ConvBn1d
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [](https://our.internmc.facebook.com/intern/diff/)
Differential Revision: [D51428979](https://our.internmc.facebook.com/intern/diff/D51428979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113714
Approved by: https://github.com/jerryzh168
Summary: Currently the QAT tests are very specific to conv-bn-2d.
This makes it difficult to test new patterns like conv-bn-1d if
we want to add them. This commit refactors these tests so we can
add and test future patterns easily.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT_ConvBn2d
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113658
Approved by: https://github.com/jerryzh168
Summary:
Ensures that creating tensors, copying, filling with zeroes, checking for nan works on cuda for the `float8` dtypes. This should be enough for float8 emulation on cuda.
Note that I skipped the mul test - it's less trivial to add (need a new c++ macro), and there is no use case for it. We can follow up on that in the future.
Test Plan:
```
python test/test_quantization.py TestFloat8Dtype
```
Reviewers:
Subscribers:
Tasks:
Tags:
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105807
Approved by: https://github.com/ezyang, https://github.com/jerryzh168, https://github.com/albanD
Summary:
att, we use module partition API to identify the GRU submodule and annotate all necessary patterns
Test Plan: buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
Differential Revision: D46689428
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103526
Approved by: https://github.com/andrewor14
Summary: att, we use module partition API to identify the GRU submodule and annotate all necessary patterns
Test Plan: buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
Reviewed By: kimishpatel
Differential Revision: D46384329
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103358
Approved by: https://github.com/HDCharles
Summary:
We are currently silently skipping all PT2 quantization
tests due to a recent typo. This commit fixes this and also adds
warnings so it'll be easier to debug similar issues in the future.
Test Plan: python test/test_quantization.py
Differential Revision: D46383546
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102819
Approved by: https://github.com/jerryzh168
Summary:
We are currently silently skipping all PT2 quantization
tests due to a recent typo. This commit fixes this and also adds
warnings so it'll be easier to debug similar issues in the future.
Test Plan: python test/test_quantization.py
Differential Revision: D46329480
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102644
Approved by: https://github.com/jerryzh168
This diff introduces utility `find_sequential_partitions`.
This utility allows one to specify sequential pattern of
nn.Module/nn.functional and returns a list. Each item in the list contains a
List[SourcePartition] that represents sequentially connected partitions that
are of the pattern requested.
For example `find_sequential_partitions(model, [nn.Conv2d, nn.ReLU])` will find
all nn.Conv2d and nn.ReLU partitions that are sequentially connected.
Furthmore, move to using `find_sequential_partitions` for conv_bn/conv_bn_relu
for QAT.
Differential Revision: [D45948057](https://our.internmc.facebook.com/intern/diff/D45948057/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D45948057/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102394
Approved by: https://github.com/jerryzh168
**Summary**
After https://github.com/pytorch/pytorch/pull/99064 and https://github.com/pytorch/pytorch/pull/99065 merged, the pt2e UT path has changed, also need to change the module path in `test/test_quantization.py`. Then we can run these tests in top level's test directory.
**Test Plan**
```
cd test && python -u -m pytest test_quantization.py -k TestQuantizePT2E
cd test && python -u -m pytest test_quantization.py -k TestQuantizePT2EModels
cd test && python -u -m pytest test_quantization.py -k TestQuantizePT2EFX
cd test && python -u -m pytest test_quantization.py -k TestQuantizePT2EFXX86Inductor
cd test && python -u -m pytest test_quantization.py -k TestQuantizePT2EFXModels
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99402
Approved by: https://github.com/jerryzh168
Summary:
This is a retry of https://github.com/pytorch/pytorch/pull/94992 which was reverted due to CI issues.
This PR adds a set of unintrepreted data types on PyTorch which can be used to implement experimental functionality out of core (think fp8, int4, int16 quant, etc).
@bypass-github-export-checks
Test Plan:
```
python test/test_quantization.py -k TestBits
```
Reviewers:
Subscribers:
Tasks:
Tags:
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95860
Approved by: https://github.com/atalman
Summary:
This PR adds a set of unintrepreted data types on PyTorch which can be used to implement experimental functionality out of core (think fp8, int4, int16 quant, etc).
Note: this is a copy-pasta of https://github.com/pytorch/pytorch/pull/89990 with a bug fix for clang9, easier to just to put up another PR since I'm not sure how comandeering works with Meta-only changes.
@bypass-github-export-checks
Test Plan:
```
python test/test_quantization.py -k TestBits
```
Reviewers:
Subscribers:
Tasks:
Tags:
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94992
Approved by: https://github.com/angelayi
Summary:
This PR introduces the top level APIs for quantization support in PyTorch 2.0 Export stack
* torch.ao.quantization.quantize_pt2e.prepare_pt2e
Takes a model that is captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares the model for calibration
for post training quantization
* torch.ao.quantization.quantize_pt2e.convert_pt2e
Takes a calibrated model and converts that to a reference quantized model that can be lowered later to quantized operator libraries or delegation modules
Also added a backend config for the qnnpack_pt2e backend:
* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config
Note: everything related to quantize_pt2e are experimental (prototype), and we don't have any bc guarantees
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91035
Approved by: https://github.com/HDCharles
Summary:
This PR introduces the top level APIs for quantization support in PyTorch 2.0 Export stack
* torch.ao.quantization.quantize_pt2e.prepare_pt2e
Takes a model that is captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares the model for calibration
for post training quantization
* torch.ao.quantization.quantize_pt2e.convert_pt2e
Takes a calibrated model and converts that to a reference quantized model that can be lowered later to quantized operator libraries or delegation modules
Also added a backend config for the qnnpack_pt2e backend:
* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config
Note: everything related to quantize_pt2e are experimental (prototype), and we don't have any bc guarantees
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90971
Approved by: https://github.com/HDCharles
Summary:
This PR introduces the top level APIs for quantization support in PyTorch 2.0 Export stack
* torch.ao.quantization.quantize_pt2e.prepare_pt2e
Takes a model that is captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares the model for calibration
for post training quantization
* torch.ao.quantization.quantize_pt2e.convert_pt2e
Takes a calibrated model and converts that to a reference quantized model that can be lowered later to quantized operator libraries or delegation modules
Also added a backend config for the qnnpack_pt2e backend:
* torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config
Note: everything related to quantize_pt2e are experimental (prototype), and we don't have any bc guarantees
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90802
Approved by: https://github.com/qihqi
Summary:
This PR is an early prototype of a tool to quantize each layer of a model
N times, with N qconfigs each. We follow the design agreed upon in
https://fburl.com/gdoc/e1gaq3ih .
Current API:
```
m = M().eval()
example_input = (torch.randn(2, 2),)
qconfig_mappings = [
QConfigMapping().set_global(torch.quantization.default_qconfig),
QConfigMapping().set_global(torch.quantization.default_dynamic_qconfig),
]
backend_config = get_native_backend_config()
msp = prepare_n_shadows_model(
m, example_input, qconfig_mappings, backend_config)
for _ in range(2):
msp(*example_input)
msq = convert_n_shadows_model(msp)
msq(*example_input)
results = extract_results_n_shadows_model(msq)
print_comparisons_n_shadows_model(results)
// example output
subgraph_idx ref_node_name best_idx 1 2
-------------- --------------- ---------- ------- -------
subgraph_0 fc1 2 42.0834 42.6279
subgraph_1 fc2 2 43.7259 50.0593
```
Test plan:
```
python test/test_quantization.py -k test_n_shadows
```
Differential Revision: [D37650332](https://our.internmc.facebook.com/intern/diff/D37650332)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80521
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.
The list of the `nn.quantized` files that are being migrated:
- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
- [X] [Current PR] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
- [ ] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
- [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
- [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
- [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
- [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
- [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
- [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
- [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
- [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
- [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`
Majority of the files are just moved to the new location.
However, specific files need to be double checked:
- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
Differential Revision: [D36792967](https://our.internmc.facebook.com/intern/diff/D36792967/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36792967/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78712
Approved by: https://github.com/jerryzh168
Summary: Following https://github.com/pytorch/pytorch/pull/78452
and https://github.com/pytorch/pytorch/pull/79066, this commit
is part 1 of the broader effort to replace `backend_config_dict`
with a python config object, a more formal and robust API that
leads to better user experience. Note that there is no change in
behavior in this commit by itself. A future commit (part 2) will
replace all existing usages of `backend_config_dict` with the
`BackendConfig` object added in this commit.
Test Plan:
python test/test_quantization.py TestBackendConfig
Reviewers: jerryzh168
Subscribers: jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81469
Approved by: https://github.com/jerryzh168
Summary: This introduces the skeleton for the ModelReportVisualizer
class. This class helps visualize the information generated by the
ModelReport class `generate_report()` output. This class aims to provide
visualizations in a table, plot (line graph) and histogram view.
This also introduces an empty test class for testing visualizations. As
implementations start occuring for this class, tests will also be
approrpriately added.
This includes the high level descriptions for each of the methods as
well. Expected use cases will be added to the class description in a
future commit as that gets finalized.
Test Plan: python test/test_quantization.py TestFxModelReportVisualizer
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81523
Approved by: https://github.com/andrewor14
Summary: This adds the class framework for the ModelReport
OutlierDetector. This detector will be in charge of looking at
activation data and figuring out whether there are significant oultiers
present in them. It will average this data across batches to make a
recommendation / warning if significant outliers are found.
This commit contains just the class framework and a base test class.
Implementations will follow in following commits.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80743
Approved by: https://github.com/HDCharles
Summary: This adds the framework (method signatures and descriptors) for
the InputWeightEqualization Detector. There is no code implemenation yet
so the test suite for this is a simple pass. This Detector will be used
to determine whether input weight equalization should be recommended.
Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79916
Approved by: https://github.com/HDCharles
Summary: per https://github.com/pytorch/pytorch/issues/79135 the code
snippets in the docs don't run. This is a recurring problem since
previously there was no unit test to check that these code snippets
actually ran. This PR adds support for such a test, importing the
snippet as a string and evaluating it to make sure that it actually runs
if the code snippet has user defined code, you can pass in dummy
versions using global_inputs. Sometimes the imports of the code snippets
behave oddly but you can pass them in as in test_quantization_doc_custom
where nnq is passed in.
Test Plan: python test/test_quantization.py TestQuantizationDocs
also see https://github.com/pytorch/pytorch/pull/79994 to see what shows up in CI when the docs get broken
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79923
Approved by: https://github.com/z-a-f, https://github.com/vspenubarthi
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers, then after the user callibrates their
model, the callibrated model can then be used by the ModelReport class
to generate reports based on what the user wished to gather information
on.
This contains the init method and the signatures and docs for each
of the proposed helper functions.
This also address and fixes a revert issue.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80052
Approved by: https://github.com/HDCharles