## Semantics
The semantics are:
(1) By default, `torch.serialization.skip_data(materialize_fake_tensors=False)` makes `torch.save` skip writing storages (but reserve space for them in the checkpoint).
```python
import torch
import torch.nn as nn
sd = nn.Linear(3, 5).state_dict()
with torch.serialization.skip_data():
    torch.save(sd, 'foo.pt')
print(torch.load('foo.pt', weights_only=True))
```
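As a quick illustration (a sketch, not from this PR; the filenames are hypothetical), comparing file sizes shows that space for the storages is still reserved:
```python
import os

import torch
import torch.nn as nn

sd = nn.Linear(3, 5).state_dict()
torch.save(sd, 'full.pt')
with torch.serialization.skip_data():
    torch.save(sd, 'skipped.pt')

# Both checkpoints should be about the same size, even though 'skipped.pt'
# contains no real storage bytes, only space reserved for them.
print(os.path.getsize('full.pt'), os.path.getsize('skipped.pt'))
```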
(2) With `torch.serialization.skip_data(materialize_fake_tensors=True)`, if a FakeTensor is passed to `torch.save`, the pickler treats these FakeTensors as being "materialized": space is reserved in the checkpoint for the associated storage bytes, and when loading, the type will be `Tensor` instead of `FakeTensor`.
```python
import torch
import torch.nn as nn
from torch._subclasses.fake_tensor import FakeTensorMode
with FakeTensorMode():
    m = nn.Linear(3, 5, dtype=torch.float16, device='cuda')
    sd = m.state_dict()

with torch.serialization.skip_data(materialize_fake_tensors=True):
    torch.save(sd, 'bla.pt')
print(torch.load('bla.pt', weights_only=True))
# OrderedDict([('weight', tensor([[0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))])
```
## Follow Ups
- [ ] `torch.load` semantics for the skip_data context manager
- [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass)
Differential Revision: [D62238610](https://our.internmc.facebook.com/intern/diff/D62238610)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504
Approved by: https://github.com/albanD
#### Conditions for allowlisting tensor subclasses
We allow tensor subclass types that
(1) do not override `__setstate__`, `__getattr__`, `__setattr__`, `__get__`, `__set__` or `__getattribute__` of `torch.Tensor` (`torch.Tensor` does not have a definition of `__getattr__`, `__get__` or `__set__`, so we check that these are `None`),
(2) use the generic `tp_alloc`, and
(3) are in a module that *has been imported by the user*
to be pushed onto the stack as strings by `GLOBAL` instructions, while storing the type in a dict. The strings will be converted back to the classes as appropriate when executing `REBUILD` with `_rebuild_from_type_v2` (a minimal example follows the note below).
*Note that we use `inspect.getattr_static(sys.modules[module], name)` to get the class/function, as this method claims to have no code execution.*
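For instance, a minimal sketch (`MyTensor` is hypothetical, not from this PR) of a subclass that satisfies all three conditions and can therefore be loaded with `weights_only=True`:
```python
import torch

class MyTensor(torch.Tensor):
    # No overrides of __setstate__/__getattr__/__setattr__/__get__/__set__/
    # __getattribute__ (condition 1), and the generic tp_alloc (condition 2).
    pass

t = torch.randn(2).as_subclass(MyTensor)
torch.save(t, 't.pt')
# MyTensor's module is imported in this process (condition 3), so the GLOBAL
# string can be resolved and the issubclass check performed.
loaded = torch.load('t.pt', weights_only=True)
assert type(loaded) is MyTensor
```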
The rationale for the 3 conditions above is as follows:
The rebuild func provided by `Tensor.__reduce_ex__` is `torch._tensor._rebuild_from_type_v2`, which is defined as follows (note the call to `getattr`, `Tensor.__setstate__` and the call to `as_subclass`, as well as the call to `_set_obj_state`, which calls `setattr`):
4e66aaa010/torch/_tensor.py (L57-L71)
`as_subclass` is implemented with a call to `THPVariable_NewWithVar`, which will eventually call `tp_alloc` here:
4e66aaa010/torch/csrc/autograd/python_variable.cpp (L2053)
The `func` arg to `_rebuild_from_type_v2` for wrapper subclasses is `torch._utils._rebuild_wrapper_subclass`, which will similarly call into `THPVariable_NewWithVar` and hit the above `tp_alloc`.
**Note that we do not call `tp_init` or `tp_new` (i.e. `cls.__init__` or `cls.__new__`) when unpickling**
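As a small hedged demonstration of this (`Noisy` is hypothetical; its module must be imported when loading):
```python
import torch

class Noisy(torch.Tensor):
    def __init__(self, *args, **kwargs):
        print("__init__ called")
        super().__init__()

t = torch.randn(2).as_subclass(Noisy)  # as_subclass also bypasses __init__
torch.save(t, 'noisy.pt')
loaded = torch.load('noisy.pt', weights_only=True)  # prints nothing;
assert type(loaded) is Noisy                        # __init__ never ran
```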
### How we check something is a tensor subclass / constraints around imports
In order to check whether `bla` is a tensor subclass in the bytecode `GLOBAL module.name`, we need to do an `issubclass` check, which entails converting the global string to the appropriate type. We *do not* arbitrarily import modules, but will perform this check as long as the given subclass (given by `module.name`) has already been imported by the user (i.e. `module in sys.modules` and `issubclass(getattr(sys.modules[module], name), torch.Tensor)`).
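Roughly, the check reads like the following sketch (`resolve_tensor_subclass` is a hypothetical helper, not the actual unpickler code):
```python
import inspect
import sys

import torch

def resolve_tensor_subclass(module: str, name: str) -> type:
    # We never import the module ourselves; the user must already have.
    if module not in sys.modules:
        raise RuntimeError(f"{module} must already be imported by the user")
    # getattr_static avoids triggering descriptors or __getattr__, i.e. no
    # code execution during the lookup.
    cls = inspect.getattr_static(sys.modules[module], name)
    if not (isinstance(cls, type) and issubclass(cls, torch.Tensor)):
        raise RuntimeError(f"{module}.{name} is not a tensor subclass")
    return cls
```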
This PR also allowlisted `torch._utils._rebuild_wrapper_subclass` and `torch.device` (used by `_rebuild_wrapper_subclass`)
### API for allowlisting
This PR also adds `torch.serialization.{add/get/clear}_safe_globals`, which enables users to allowlist globals they have deemed safe and to manipulate this list (for example, they could allowlist a tensor subclass with a custom `__setstate__` if they have checked that it is safe).
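For example (a sketch; `MySubclass` is hypothetical):
```python
import torch

class MySubclass(torch.Tensor):
    # A custom __setstate__ fails condition (1) above, so the user must
    # audit it and allowlist the class explicitly.
    def __setstate__(self, state):
        super().__setstate__(state)

torch.serialization.add_safe_globals([MySubclass])
print(torch.serialization.get_safe_globals())  # [<class '...MySubclass'>]
torch.serialization.clear_safe_globals()
```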
Next steps:
- Add testing and allowlist required classes for all in-core tensor subclasses (e.g. `DTensor`, `FakeTensor` etc.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124331
Approved by: https://github.com/albanD
Fixes #124528
Going over the options for our MapAllocator and what they do, I don't think any of the others need to be piped up to `torch.load`:
4f29103749/aten/src/ATen/MapAllocator.h (L8-L16)
~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~
Instead, we use the flags provided by the Python `mmap` library, so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future.
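A sketch of the resulting usage (assuming `torch.serialization.set_default_mmap_options` is the entry point these flags are piped through; `model.pt` is hypothetical):
```python
import mmap

import torch
import torch.nn as nn

torch.save(nn.Linear(3, 5).state_dict(), 'model.pt')

# MAP_SHARED instead of the default MAP_PRIVATE: writes to the mapped
# storages are carried through to the underlying file.
torch.serialization.set_default_mmap_options(mmap.MAP_SHARED)
sd = torch.load('model.pt', mmap=True, weights_only=True)
```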
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124889
Approved by: https://github.com/albanD
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857
These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
- `GLOSSARY.md`
- `aten/src/ATen/core/op_registration/README.md`
- `scripts/README.md`
- `torch/csrc/jit/codegen/fuser/README.md`
The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```
I looked over the auto-generated changes and didn't see anything that looked problematic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406
Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377
This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348
Reviewed By: walterddr, seemethere
Differential Revision: D26856620
Pulled By: samestep
fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
Summary:
A small PR fixing some formatting in lcm, gcd, and the serialization note. Adds a note to lcm and gcd explaining behavior that is not always defined.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41526
Reviewed By: ngimel
Differential Revision: D22569341
Pulled By: mruberry
fbshipit-source-id: 5f5ff98c0831f65e82b991ef444a5cee8e3c8b5a
Summary:
Doc update intended to clarify and expand our current serialization behavior, including explaining the difference between torch.save/torch.load, torch.nn.Module.state_dict/torch.nn.Module.load_state_dict, and torch.jit.save/torch.jit.load. Also explains, for the first time, when historic serialized TorchScript behavior is preserved, and our recommendation for preserving behavior (using the same PyTorch version to consume a model as produced it).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41395
Reviewed By: ngimel
Differential Revision: D22560538
Pulled By: mruberry
fbshipit-source-id: dbc2f1bb92ab61ff2eca4888febc21f7dda76ba1