Commit Graph

10 Commits

Author SHA1 Message Date
6d8c9be54b [reland] Add int1 to int7 dtypes (#137928)
Summary:
Similar to https://github.com/pytorch/pytorch/pull/117208, we want to add int1 to int7 for edge use cases
for weight quantization

Test Plan:
python test/test_quantization.py -k test_uint4_int4_dtype

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D64344944](https://our.internmc.facebook.com/intern/diff/D64344944)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137928
Approved by: https://github.com/malfet
2024-10-18 02:02:08 +00:00
2ef1454189 Revert "Add int1 to int7 dtypes (#136301)"
This reverts commit bfa16a161d5089a9ba008f5e665f29b58dc16526.

Reverted https://github.com/pytorch/pytorch/pull/136301 on behalf of https://github.com/PaliC due to causing internal failures ([comment](https://github.com/pytorch/pytorch/pull/136301#issuecomment-2384119600))
2024-09-30 20:50:49 +00:00
bfa16a161d Add int1 to int7 dtypes (#136301)
Summary:
Similar to https://github.com/pytorch/pytorch/pull/117208, we want to add int1 to int7 for edge use cases
for weight quantization (https://www.internalfb.com/diff/D62464487)

Test Plan:
python test/test_quantization.py -k test_uint4_int4_dtype

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136301
Approved by: https://github.com/ezyang
2024-09-28 02:08:33 +00:00
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
3e397cefc5 Add uint1 to uint7 dtypes (#117208)
Summary:
These dtypes are added since we see more demand for these sub byte dtypes, especially with
the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks)

Note these are just placeholders, the operator support for these dtypes will be implemented with tensor subclass.
e.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1, that supports different operations like bitwsise ops, add, mul etc. (will be added later)

Also Note that these are not quantized data types, we'll implement quantization logic with tensor subclass backed up by these dtypes as well.
e.g `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 Tensors (see https://github.com/pytorch-labs/ao/pull/13 as an example)

Test Plan:
CIs
python test/test_quantization.py -k test_uint1_7_dtype

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117208
Approved by: https://github.com/ezyang
2024-01-13 01:09:23 +00:00
f15ab8a7f2 AO migration: replace torch internal callsites (#94170)
Summary:

Do the following renames:
`torch.quantization` -> `torch.ao.quantization`
`torch.nn.quantized` -> `torch.ao.nn.quantized`
`torch.nn.quantizable` -> `torch.ao.nn.quantizable`
`torch.nn.qat` -> `torch.ao.nn.qat`
`torch.nn.intrinsic` -> `torch.ao.nn.intrinsic`

And then, do
`torch.ao.nn.quantized._reference` -> `torch.ao.nn.quantized.reference` to clean up the aftermath of https://github.com/pytorch/pytorch/pull/84974

Then, manually update `test/test_module_init.py` to fix hanging whitespace due to the replace.

Run this script to do the replacements: https://gist.github.com/vkuzo/7f7afebf8c31b9ba48306223e68a1c82

This is for https://github.com/pytorch/pytorch/issues/81667

Test plan: CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94170
Approved by: https://github.com/jerryzh168
2023-02-07 02:32:23 +00:00
1f7153bee8 [quant] Optionally clamp weights post quantization (#83438)
Summary: Until we add quant_{min, max} args to `torch.quantize_per_{channel, tensor}`, this patch will make sure we will honor observer's restrictions on quantized values.

Test Plan: Added new tests, run with - `buck run caffe2/test:quantization -- quantization.core.test_utils`

Differential Revision: D38624119

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83438
Approved by: https://github.com/andrewor14
2022-08-17 16:31:14 +00:00
7ea5fa3dd4 [reland][quant] Add utility function get_fqn_to_example_inputs
Summary:
After https://github.com/pytorch/pytorch/pull/77608 `example_inputs` is required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn to submodule example_inputs

Example Call:

```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```

Example output:
```
{
   "linear1": (tensor1,),
   "linear2": (tensor2,),
   "sub": (tensor3,),
   "sub.linear1": (tensor4,),
   ...
}
```

Test Plan:
python test/test_quantization.py TestUtils

Reviewers:

Subscribers:

Tasks:

Tags:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78286

Approved by: https://github.com/dzdang
2022-05-25 23:31:51 +00:00
87148f2b59 Revert "[quant] Add utility function get_fqn_to_example_inputs"
This reverts commit 50a44fe461d5026e0aa69b95d7dc6e87d07cf3c7.

Reverted https://github.com/pytorch/pytorch/pull/78146 on behalf of https://github.com/suo due to as it broke master
2022-05-25 06:37:32 +00:00
50a44fe461 [quant] Add utility function get_fqn_to_example_inputs
Summary:
After https://github.com/pytorch/pytorch/pull/77608 `example_inputs` is required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn to submodule example_inputs

Example Call:

```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```

Example output:
```
{
   "linear1": (tensor1,),
   "linear2": (tensor2,),
   "sub": (tensor3,),
   "sub.linear1": (tensor4,),
   ...
}
```

Test Plan:
python test/test_quantization.py TestUtils

Reviewers:

Subscribers:

Tasks:

Tags:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146

Approved by: https://github.com/vkuzo
2022-05-25 03:07:16 +00:00