Summary:
These dtypes (uint1 through uint7) are added because demand for sub-byte dtypes is growing, especially with
the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks)
Note that these are just placeholders; operator support for these dtypes will be implemented with tensor subclasses.
e.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1 that supports operations such as bitwise ops, add, mul, etc. (to be added later)
Also note that these are not quantized data types; quantization logic will likewise be implemented with tensor subclasses backed by these dtypes.
e.g. `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 Tensors (see https://github.com/pytorch-labs/ao/pull/13 as an example)
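To illustrate why a sub-byte dtype needs dedicated handling, here is a minimal pure-Python sketch of storing two 4-bit values per byte. It is not the PyTorch implementation; the helper names `pack_uint4` and `unpack_uint4` and the low-nibble-first layout are assumptions made for this example only.

```python
def pack_uint4(values):
    """Pack 4-bit unsigned ints (0..15) into bytes, two values per byte."""
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("uint4 values must be in [0, 15]")
    # Pad with a zero so the values pair up evenly.
    padded = list(values) + [0] * (len(values) % 2)
    # Assumed layout: even-indexed value in the low nibble, odd-indexed in the high nibble.
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))


def unpack_uint4(packed, count):
    """Recover `count` 4-bit values from bytes produced by pack_uint4."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)  # low nibble
        out.append(byte >> 4)    # high nibble
    return out[:count]


packed = pack_uint4([1, 2, 3])
restored = unpack_uint4(packed, 3)
```

Three uint4 values occupy two bytes here, and the element count has to be tracked separately from the storage size, which is the kind of bookkeeping a tensor subclass would own.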
Test Plan:
CIs
python test/test_quantization.py -k test_uint1_7_dtype
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117208
Approved by: https://github.com/ezyang
Summary: Until quant_{min, max} args are added to `torch.quantize_per_{channel, tensor}`, this patch ensures that the observer's restrictions on quantized values are honored.
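As a rough illustration of what honoring the observer's range means, the sketch below quantizes a scalar and then clamps the result to `[quant_min, quant_max]`. This is one possible way to apply such a restriction, not necessarily what the patch does; the function name `quantize_with_range` and the scalar-only signature are assumptions for this example.

```python
def quantize_with_range(x, scale, zero_point, quant_min, quant_max):
    """Affine-quantize a scalar, then clamp to the observer's allowed range."""
    q = round(x / scale) + zero_point
    # Clamp so the result never leaves [quant_min, quant_max],
    # even when the underlying dtype could represent a wider range.
    return max(quant_min, min(quant_max, q))


# An observer restricted to [0, 127] on a dtype that could hold up to 255.
in_range = quantize_with_range(10.0, 0.5, 0, 0, 127)
too_large = quantize_with_range(100.0, 0.5, 0, 0, 127)
too_small = quantize_with_range(-3.0, 0.5, 0, 0, 127)
```

Without the clamp, `too_large` would come out as 200, which fits the dtype but violates the observer's restriction.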
Test Plan: Added new tests; run with `buck run caffe2/test:quantization -- quantization.core.test_utils`
Differential Revision: D38624119
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83438
Approved by: https://github.com/andrewor14
Summary:
After https://github.com/pytorch/pytorch/pull/77608, `example_inputs` is a required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so this utility function was added to build a dictionary mapping each submodule's fully qualified name (fqn) to its example_inputs.
Example Call:
```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```
Example output:
```
{
"linear1": (tensor1,),
"linear2": (tensor2,),
"sub": (tensor3,),
"sub.linear1": (tensor4,),
...
}
```
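The idea behind such a mapping can be sketched without PyTorch: run the parent once on its example inputs and record the positional arguments each submodule receives. The code below is a toy model of that technique, not the actual implementation; `ToyModule`, its `pre_hook` attribute, and `record_example_inputs` are hypothetical names introduced for this example.

```python
class ToyModule:
    """Hypothetical stand-in for a module: an optional function plus named children."""

    def __init__(self, fn=None, **children):
        self.fn = fn
        self.children = children
        self.pre_hook = None  # called with the positional args before running

    def named_modules(self, prefix=""):
        # Yield (fqn, module) for every descendant, depth first.
        for name, child in self.children.items():
            fqn = f"{prefix}.{name}" if prefix else name
            yield fqn, child
            yield from child.named_modules(fqn)

    def __call__(self, *args):
        if self.pre_hook is not None:
            self.pre_hook(args)
        if self.fn is not None:
            return self.fn(*args)
        # Container behaviour: feed the input through the children in order.
        (x,) = args
        for child in self.children.values():
            x = child(x)
        return x


def record_example_inputs(root, example_inputs):
    """Run `root` once and return {fqn: args seen by that submodule}."""
    recorded = {}
    modules = list(root.named_modules())
    for fqn, mod in modules:
        # Bind fqn per iteration; keep only the first call's inputs.
        mod.pre_hook = lambda args, fqn=fqn: recorded.setdefault(fqn, args)
    try:
        root(*example_inputs)
    finally:
        for _, mod in modules:
            mod.pre_hook = None  # always remove the hooks again
    return recorded


sub = ToyModule(linear1=ToyModule(lambda x: x * 10))
m = ToyModule(
    linear1=ToyModule(lambda x: x + 1),
    linear2=ToyModule(lambda x: x * 2),
    sub=sub,
)
fqn_to_inputs = record_example_inputs(m, (1,))
```

Each recorded tuple can then serve as the `example_inputs` for quantizing that submodule on its own.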
Test Plan:
python test/test_quantization.py TestUtils
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78286
Approved by: https://github.com/dzdang
Summary:
After https://github.com/pytorch/pytorch/pull/77608, `example_inputs` is a required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so this utility function was added to build a dictionary mapping each submodule's fully qualified name (fqn) to its example_inputs.
Example Call:
```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```
Example output:
```
{
"linear1": (tensor1,),
"linear2": (tensor2,),
"sub": (tensor3,),
"sub.linear1": (tensor4,),
...
}
```
Test Plan:
python test/test_quantization.py TestUtils
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146
Approved by: https://github.com/vkuzo