This is a follow-up to #165037. It is generally recommended to use `is`/`is not` to compare types. This series of changes applies that suggestion across the code base, with the aim of finally enabling the related linter checks.
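As a minimal illustration of the pattern being applied (the variable here is hypothetical, not taken from the changed code):
```python
x = 3.0

# Discouraged: comparing types with equality
if type(x) == float:
    print("matched via ==")

# Preferred: identity comparison, which is what the linter check enforces
if type(x) is float:
    print("matched via is")

# When subclasses should also match, isinstance remains the idiomatic choice
if isinstance(x, float):
    print("matched via isinstance")
```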
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165142
Approved by: https://github.com/albanD
It is generally recommended to use `is`/`is not` to compare types. This series of changes applies that suggestion across the code base, with the aim of finally enabling the related linter checks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165037
Approved by: https://github.com/mlazos
Summary:
Continuing the work from https://github.com/pytorch/pytorch/pull/146427
Adds the `torch.float8_e8m0fnu` dtype to PyTorch, as detailed in
https://github.com/pytorch/pytorch/issues/146414. Please see the issue for a detailed definition of the format. Example of basic functionality:
```python
import torch
# round trip
x0 = torch.randn(4, 4, dtype=torch.float32)
x1 = x0.to(torch.float8_e8m0fnu) # RNE rounding
x2 = x1.to(torch.float32) # 2 ** exponent
# creation with empty
x0 = torch.empty(4, 4, dtype=torch.float8_e8m0fnu)
# printing
print(x0)
```
Done in this PR:
* numerical correctness
* op coverage (except for `torch._scaled_mm`): create tensor, cast to/from float32
* printing a tensor works
For future PRs:
* performance optimizations for casting
* torch._scaled_mm
* PT2
* various cleanups (detailed in comments with issue numbers)
Test Plan:
```
pytest test/quantization/core/experimental/test_float8.py -s
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147466
Approved by: https://github.com/drisspg
Based on https://github.com/pytorch/pytorch/pull/126376, this PR updates all PyTorch callers (e.g., `Tensor.is_pinned()`, `Tensor.pin_memory()`) so that they no longer pass the `device` argument.
For `storage/untyped_storage.is_pinned()/pin_memory()`, we keep the `device` argument, but passing it is discouraged. If it is not given, the default `device` is still 'cuda' for backward compatibility.
Additionally, with device-agnostic pin_memory, the `pin_memory_device` argument of `torch.utils.data.DataLoader` is now discouraged. For backward compatibility, explicitly passing this argument is still effective; if it is not given, the default `device` will be the current accelerator.
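A short sketch of the device-agnostic usage this change encourages (assuming an accelerator build is available; the dataset below is purely illustrative):
```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Pin memory without naming a device; the current accelerator backend is used.
x = torch.randn(1024, 16).pin_memory()
print(x.is_pinned())

# DataLoader: pin_memory=True alone is enough; pin_memory_device is no longer needed.
dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
loader = DataLoader(dataset, batch_size=32, pin_memory=True)
```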
Fixes #124908
Relates to https://github.com/pytorch/pytorch/pull/126376
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131858
Approved by: https://github.com/albanD
Co-authored-by: albanD <desmaison.alban@gmail.com>
## Semantic
The semantics are:
(1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint).
```python
import torch
import torch.nn as nn
sd = nn.Linear(3, 5).state_dict()
with torch.serialization.skip_data():
    torch.save(sd, 'foo.pt')
print(torch.load('foo.pt', weights_only=True))
```
(2) With `torch.serialization.skip_data(materialize_fake_tensors=True)`, if a FakeTensor is passed to `torch.save`, the pickler will treat these FakeTensors as being "materialized": space will be reserved in the checkpoint for the associated storage bytes, and when loading, the type will be Tensor instead of FakeTensor.
```python
import torch
import torch.nn as nn
from torch._subclasses.fake_tensor import FakeTensorMode
with FakeTensorMode():
    m = nn.Linear(3, 5, dtype=torch.float16, device='cuda')
    sd = m.state_dict()
with torch.serialization.skip_data(materialize_fake_tensors=True):
    torch.save(sd, 'bla.pt')
print(torch.load('bla.pt', weights_only=True))
# OrderedDict([('weight', tensor([[0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))])
```
## Follow Ups
- [ ] `torch.load` semantic for skip_data context manager
- [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass)
Differential Revision: [D62238610](https://our.internmc.facebook.com/intern/diff/D62238610)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504
Approved by: https://github.com/albanD
This PR fixes a bug in `test_correct_module_names` introduced in #130497. It also addresses the public API test failures that surfaced after that fix, in:
* `torch/ao/quantization/__init__.py` - set the correct `__module__` for several public API helpers (see the sketch after this list)
* `torch/library.py` - add `register_vmap` to `__all__`
* `torch/nn/attention/flex_attention.py` - make `round_up_to_multiple` private by prepending an underscore
* `torch/storage.py` - introduce `__all__` to avoid `Self` being re-exported as a public API
* `torch/distributed/pipelining/schedules.py` - add `ZeroBubbleAlgorithm` to `__all__`
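A minimal sketch of the `__module__` / `__all__` pattern these fixes follow (the module and helper names are hypothetical, not the ones touched by this PR):
```python
# Suppose this is a public package module whose helper was defined elsewhere;
# "mypkg.quantization" and "fuse_modules" are made-up names for illustration.
def fuse_modules():
    """Example public helper."""

# By default __module__ points at wherever the function was defined (often a
# private submodule); public-API checks expect it to match the public path.
fuse_modules.__module__ = "mypkg.quantization"

# Restricting __all__ keeps incidental imports (e.g. typing helpers such as
# Self) from being treated as part of the public API.
__all__ = ["fuse_modules"]

print(fuse_modules.__module__)  # mypkg.quantization
```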
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131386
Approved by: https://github.com/albanD
Adds the following to the allowed globals for the `weights_only` unpickler:
- [x] `torch._utils._rebuild_qtensor` and qtensor related types
- [x] `torch._utils._rebuild_parameter_with_state` (used when deserializing a parameter that has user-defined attributes like `Param.foo`; see the example after this list)
The remaining rebuild functions that have not been allowlisted are:
- [x] `torch._utils._rebuild_wrapper_subclass` (allowlisted in above PR)
- [ ] `torch._utils._rebuild_device_tensor_from_numpy`
- [ ] `torch._utils._rebuild_xla_tensor` (legacy)
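A minimal sketch of the parameter-with-state case (assuming a build that includes this allowlisting; the attribute name is made up):
```python
import torch
import torch.nn as nn

# A Parameter carrying a user-defined attribute is serialized via
# torch._utils._rebuild_parameter_with_state rather than the plain rebuild path.
p = nn.Parameter(torch.randn(3))
p.foo = "metadata"  # hypothetical user-defined attribute

torch.save({"p": p}, "param.pt")

# With the rebuild function allowlisted, weights_only loading succeeds and the
# custom attribute is restored on the loaded Parameter.
loaded = torch.load("param.pt", weights_only=True)
print(loaded["p"].foo)
```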
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124330
Approved by: https://github.com/albanD
Fixes #112589
Fixed errors relating to pydocstyle in the following files. The remaining errors are related to docstrings at the module level and at methods within each module (see details below).
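As an illustration of the kind of fix applied (a hypothetical docstring, not one from these files), a D205/D400 violation and its corrected form look like this:
```python
def debug_dump_bad(path):
    """Dump debug information about the captured graph to path:
    useful when diagnosing capture failures."""  # D205/D400: no blank line after summary, no period


def debug_dump_good(path):
    """Dump debug information about the captured graph.

    Writes a human-readable description of the capture to ``path``; useful
    when diagnosing capture failures.
    """
```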
`pydocstyle torch/cuda/_utils.py --count`
before: 3
after: 0
`pydocstyle torch/cuda/jiterator.py --count`
before: 3
after: 1
**remaining errors:**
```
torch/cuda/jiterator.py:1 at module level:
D100: Missing docstring in public module
```
`pydocstyle torch/cuda/graphs.py --count`
before: 25
after: 7
**remaining errors:**
```
torch/cuda/graphs.py:1 at module level:
D100: Missing docstring in public module
torch/cuda/graphs.py:54 in public method `__new__`:
D102: Missing docstring in public method
torch/cuda/graphs.py:108 in public method `debug_dump`:
D205: 1 blank line required between summary line and description (found 0)
torch/cuda/graphs.py:108 in public method `debug_dump`:
D400: First line should end with a period (not ':')
torch/cuda/graphs.py:150 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/graphs.py:172 in public method `__enter__`:
D105: Missing docstring in magic method
torch/cuda/graphs.py:186 in public method `__exit__`:
D105: Missing docstring in magic method
```
`pydocstyle torch/cuda/_sanitizer.py --count`
before: 35
after: 31
**remaining errors:**
```
torch/cuda/_sanitizer.py:43 in public class `AccessType`:
D101: Missing docstring in public class
torch/cuda/_sanitizer.py:47 in public method `__str__`:
D105: Missing docstring in magic method
torch/cuda/_sanitizer.py:84 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:96 in public method `__str__`:
D105: Missing docstring in magic method
torch/cuda/_sanitizer.py:139 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:142 in public method `__str__`:
D105: Missing docstring in magic method
torch/cuda/_sanitizer.py:218 in public class `StreamSynchronizations`:
D101: Missing docstring in public class
torch/cuda/_sanitizer.py:219 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:256 in public method `create_stream`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:268 in public method `create_event`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:272 in public method `delete_event`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:276 in public method `update_seq_num`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:280 in public method `record_state`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:291 in public method `stream_wait_for_event`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:298 in public method `all_streams_wait_for_event`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:307 in public method `all_streams_wait_for_stream`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:316 in public method `sync_all_streams`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:323 in public method `is_ordered_after`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:339 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:460 in public function `zip_by_key`:
D103: Missing docstring in public function
torch/cuda/_sanitizer.py:466 in public function `zip_arguments`:
D103: Missing docstring in public function
torch/cuda/_sanitizer.py:478 in public class `ArgumentHandler`:
D101: Missing docstring in public class
torch/cuda/_sanitizer.py:479 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:505 in public method `parse_inputs`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:520 in public method `parse_outputs`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:527 in public class `CUDASanitizerDispatchMode`:
D101: Missing docstring in public class
torch/cuda/_sanitizer.py:528 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:562 in public method `__torch_dispatch__`:
D105: Missing docstring in magic method
torch/cuda/_sanitizer.py:597 in public method `__init__`:
D107: Missing docstring in __init__
torch/cuda/_sanitizer.py:601 in public method `enable`:
D102: Missing docstring in public method
torch/cuda/_sanitizer.py:605 in public method `__del__`:
D105: Missing docstring in magic method
```
`pydocstyle torch/storage.py --count`
before: 90
after: 37
**remaining errors:**
```
torch/storage.py:1 at module level:
D100: Missing docstring in public module
torch/storage.py:310 in public class `UntypedStorage`:
D101: Missing docstring in public class
torch/storage.py:311 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/storage.py:317 in public method `is_cuda`:
D102: Missing docstring in public method
torch/storage.py:321 in public method `is_hpu`:
D102: Missing docstring in public method
torch/storage.py:325 in public method `share_memory_`:
D102: Missing docstring in public method
torch/storage.py:444 in public class `TypedStorage`:
D101: Missing docstring in public class
torch/storage.py:453 in public method `fill_`:
D102: Missing docstring in public method
torch/storage.py:458 in public method `__new__`:
D102: Missing docstring in public method
torch/storage.py:530 in public method `__init__`:
D107: Missing docstring in __init__
torch/storage.py:599 in public method `is_cuda`:
D102: Missing docstring in public method
torch/storage.py:604 in public method `is_hpu`:
D102: Missing docstring in public method
torch/storage.py:624 in public method `__len__`:
D105: Missing docstring in magic method
torch/storage.py:653 in public method `__setitem__`:
D105: Missing docstring in magic method
torch/storage.py:681 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/storage.py:715 in public method `copy_`:
D102: Missing docstring in public method
torch/storage.py:723 in public method `nbytes`:
D102: Missing docstring in public method
torch/storage.py:731 in public method `type`:
D102: Missing docstring in public method
torch/storage.py:744 in public method `cuda`:
D102: Missing docstring in public method
torch/storage.py:751 in public method `hpu`:
D102: Missing docstring in public method
torch/storage.py:758 in public method `element_size`:
D102: Missing docstring in public method
torch/storage.py:766 in public method `get_device`:
D102: Missing docstring in public method
torch/storage.py:770 in public method `__str__`:
D105: Missing docstring in magic method
torch/storage.py:781 in public method `__repr__`:
D105: Missing docstring in magic method
torch/storage.py:785 in public method `__iter__`:
D105: Missing docstring in magic method
torch/storage.py:789 in public method `__copy__`:
D105: Missing docstring in magic method
torch/storage.py:793 in public method `__deepcopy__`:
D105: Missing docstring in magic method
torch/storage.py:801 in public method `__sizeof__`:
D105: Missing docstring in magic method
torch/storage.py:877 in public method `device`:
D102: Missing docstring in public method
torch/storage.py:881 in public method `size`:
D102: Missing docstring in public method
torch/storage.py:891 in public method `pickle_storage_type`:
D102: Missing docstring in public method
torch/storage.py:902 in public method `__reduce__`:
D105: Missing docstring in magic method
torch/storage.py:907 in public method `data_ptr`:
D102: Missing docstring in public method
torch/storage.py:915 in public method `resize_`:
D102: Missing docstring in public method
torch/storage.py:931 in public method `from_buffer`:
D102: Missing docstring in public method
torch/storage.py:1032 in public method `from_file`:
D402: First line should not be the function's "signature"
torch/storage.py:1075 in public method `is_shared`:
D102: Missing docstring in public method
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113227
Approved by: https://github.com/kit1980
There is an issue with float8 type promotion, because `_promoteTypesLookup` doesn't contain records for a few types between bfloat16 and float8.
I have simply moved the float8 types to just after bfloat16; however, I'm not sure whether this breaks serialization.
Please decide whether it can stay like this, or whether I should instead insert the missing records filled with "ud" into `_promoteTypesLookup` rather than moving the types.
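For context, a small sketch of how the promotion table surfaces in Python (the float8 call is left commented out because its behavior depends on how this issue is resolved):
```python
import torch

# torch.promote_types consults the C++ _promoteTypesLookup table; each entry
# records what a pair of dtypes promotes to ("ud" marks undefined pairs).
print(torch.promote_types(torch.bfloat16, torch.float32))  # torch.float32
print(torch.promote_types(torch.int64, torch.float16))     # torch.float16

# The pairs at issue here involve float8 dtypes, e.g.:
# print(torch.promote_types(torch.bfloat16, torch.float8_e4m3fn))
# On builds without the fix these rows are missing or misaligned in the table.
```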
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110279
Approved by: https://github.com/albanD
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
These were reverted due to a conflict with the internal source repo.
Most of the changes are fixes for PEP-484 violations (i.e., a default arg set to None while the type is not annotated as Optional).
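A minimal example of the PEP-484 pattern being fixed (a hypothetical function, not one from the diff):
```python
from typing import Optional

# Violation: the default is None, but the annotation does not allow None.
def scale_bad(x: float, factor: float = None) -> float:
    return x * (factor if factor is not None else 1.0)

# Fix: annotate the parameter as Optional (or `float | None` on newer Python).
def scale_good(x: float, factor: Optional[float] = None) -> float:
    return x * (factor if factor is not None else 1.0)
```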
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated: to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` that squashes the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
This PR addresses #101690. It implements faster data element swapping in `_StorageBase` using C++ rather than Python.
This helps when a large model saved on a little-endian machine is loaded on a big-endian machine.
TODO:
- [x] Add test cases
- [x] Add performance comparison before and after the PR
- [ ] (Optional) Investigate further opportunities for performance improvements by [SIMDization](https://dev.to/wunk/fast-array-reversal-with-simd-j3p)
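For readers unfamiliar with the operation, a conceptual sketch of the element-wise byte swap in plain Python (illustration only, not the C++ implementation added by this PR):
```python
import struct

def byteswap(buf: bytes, elem_size: int) -> bytes:
    """Reverse the byte order of every elem_size-byte element in buf."""
    out = bytearray(len(buf))
    for i in range(0, len(buf), elem_size):
        out[i:i + elem_size] = buf[i:i + elem_size][::-1]
    return bytes(out)

little = struct.pack("<3d", 1.0, 2.0, 3.0)   # float64 values in little-endian order
big = byteswap(little, 8)                    # reorder each 8-byte element
print(struct.unpack(">3d", big))             # (1.0, 2.0, 3.0) when read as big-endian
```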
Fixes #101690
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101925
Approved by: https://github.com/mikaylagawarecki