Mirror of https://github.com/pytorch/pytorch.git, synced 2025-10-21 05:34:18 +08:00
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75126

Quantization has a high volume of configurations for how to quantize an op for a reference model representation, which is useful for a lowering step for a backend. An example of this is

```
{'dtype_configs': [{'input_dtype': torch.quint8, 'output_dtype': torch.quint8}],
 'observation_type': <ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT: 0>,
 'pattern': <class 'torch.nn.modules.conv.ConvTranspose1d'>},
```

These configs are checked into master, and they are created with Python functions. Therefore, there is no easy way for the user to see what the configs actually are without running some Python code. This PR is one approach to documenting these configs. Here is what it does:

1. during the documentation build, write a text file of the configs
2. render that text file on a quantization page, with some additional context

In the future, this could be extended to autogenerate better-looking tables, such as: op support per backend and dtype, op support per valid quantization settings per backend, etc.

Test Plan:
```
cd docs
make html
cd html
python -m http.server 8000
// render http://[::]:8000/quantization-backend-configuration.html
// it renders correctly
```

Reviewed By: ejguan

Differential Revision: D35365461

Pulled By: vkuzo

fbshipit-source-id: d60f776ccb57da9db3d09550e4b27bd5e725635a
(cherry picked from commit 14865c0e23bc080120342c8f9278f0fae8eb8fbd)
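For illustration, an entry like the `ConvTranspose1d` example above can also be expressed with the `BackendPatternConfig` / `DTypeConfig` classes that ship in more recent PyTorch releases (at the time of this PR the configs were plain Python dicts, as shown). This is a sketch, not the code from the PR:

```python
import torch
from torch.ao.quantization.backend_config import (
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)

# One backend pattern config analogous to the ConvTranspose1d entry above,
# using the newer torch.ao.quantization.backend_config API.
conv_transpose_1d_config = (
    BackendPatternConfig(torch.nn.ConvTranspose1d)
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
    .add_dtype_config(DTypeConfig(input_dtype=torch.quint8, output_dtype=torch.quint8))
)

# to_dict() shows roughly the dict form that gets dumped into the docs page.
print(conv_transpose_1d_config.to_dict())
```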
21 lines
610 B
ReStructuredText
Quantization Backend Configuration
----------------------------------

FX Graph Mode Quantization allows the user to configure various
quantization behaviors of an op in order to match the expectation
of their backend.
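For example, in recent PyTorch releases a backend configuration can be passed to the
FX Graph Mode Quantization entry points. The snippet below is a minimal sketch
(argument names have changed across releases, and the model here is only a placeholder):

.. code-block:: python

    import torch
    from torch.ao.quantization import get_default_qconfig_mapping
    from torch.ao.quantization.backend_config import get_native_backend_config
    from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

    # Placeholder model and example input for the sketch.
    model = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 1), torch.nn.ReLU()).eval()
    example_inputs = (torch.randn(1, 3, 8, 8),)

    # The backend config controls how each op may be quantized (dtypes,
    # observation type, fusion patterns) for the chosen backend.
    backend_config = get_native_backend_config()
    qconfig_mapping = get_default_qconfig_mapping("fbgemm")

    prepared = prepare_fx(
        model, qconfig_mapping, example_inputs, backend_config=backend_config
    )
    # ... calibration would happen here ...
    quantized = convert_fx(prepared, backend_config=backend_config)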
In the future, this document will contain a detailed spec of
these configurations.

Default values for native configurations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Below is the output of the configuration for quantization of ops
in fbgemm and qnnpack (PyTorch's default quantized backends).

Results:

.. literalinclude:: scripts/quantization_backend_configs/default_backend_config.txt
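The included text file is produced during the documentation build. A minimal sketch of
how such a dump can be generated (the output path is illustrative and this is not the
actual build script; ``get_native_backend_config_dict`` is available in recent PyTorch
releases):

.. code-block:: python

    import pprint

    from torch.ao.quantization.backend_config import get_native_backend_config_dict

    # The native backend config describes fbgemm and qnnpack, PyTorch's
    # default quantized backends.
    config_dict = get_native_backend_config_dict()

    # Illustrative output location; the real docs build writes the file
    # that is included via literalinclude above.
    with open("default_backend_config.txt", "w") as f:
        f.write(pprint.pformat(config_dict))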