Reimplement storage slicing. (#11314)

Summary:
In #9466 I got rid of storage views and eliminated all places where
they were used... OR SO I THOUGHT.  In actuality, under certain
conditions (specifically, if you trained a CUDA multiprocessing model
shared over CUDA IPC and then serialized your parameters), you could
also serialize storage slices to the saved model format.  In #9466,
I "fixed" the case when you loaded the legacy model format (really,
just unshared the storages--not strictly kosher but if you aren't
updating the parameters, shouldn't matter), but NOT the modern model format, so
such models would fail.
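
Why unsharing can change behavior: once storages are copied, an in-place update made through one parameter is no longer visible through its former alias. A toy sketch, with plain Python lists standing in for storages (illustrative only, not PyTorch code):

```python
# Two "parameters" aliasing one storage: an in-place write through one
# is visible through the other.
shared = [0.0] * 4        # stands in for a single shared storage
a = shared                # parameter A, a view on that storage
b = shared                # parameter B, another view
a[0] = 1.0
assert b[0] == 1.0        # aliasing: the update propagates

# "Unsharing" (what the #9466 fix effectively did): give each
# parameter its own private copy of the data.
a2, b2 = list(shared), list(shared)
a2[0] = 2.0
assert b2[0] == 1.0       # updates no longer propagate
```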

So, I could have applied the legacy model format fix too, but
hyperfraise remarked that he had applied a fix that was effectively
the same as unsharing the storages, but it had caused his model to
behave differently.  So I looked into it again, and realized that
using a custom deleter, I could simulate the same behavior as old
storage slices.  So back they come.
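
A minimal sketch of the custom-deleter idea in plain Python, not the actual C++ allocator machinery (the `Storage` class and its methods here are hypothetical stand-ins): a slice is a new storage whose deleter simply keeps the root alive, so both alias the same memory.

```python
class Storage:
    """Toy storage: a writable buffer plus a deleter hook."""

    def __init__(self, data, deleter=None):
        self.data = data          # memoryview over the underlying bytes
        self._deleter = deleter   # cleanup hook; here it keeps the root alive

    @classmethod
    def new(cls, n):
        return cls(memoryview(bytearray(n)))

    def slice(self, offset, size):
        # The "custom deleter" is just a closure holding the root, so the
        # root buffer cannot be freed while the slice is still in use.
        return Storage(self.data[offset:offset + size],
                       deleter=lambda root=self: root)

base = Storage.new(8)
view = base.slice(2, 4)
view.data[0] = 7
assert base.data[2] == 7   # slice and root share the same memory
```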

In principle, I could also reimplement storage views entirely using
our allocators, but I'm not going to do that unless someone really
really wants it.

Fixes #10120.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11314

Reviewed By: ailzhang

Differential Revision: D9671966

Pulled By: ezyang

fbshipit-source-id: fd863783d03b6a6421d6b9ae21ce2f0e44a0dcce
Author: Edward Yang
Date: 2018-09-06 16:06:25 -07:00
Committed by: Facebook Github Bot
Parent: 1d406c04ae
Commit: 49231ab0a8
3 changed files with 56 additions and 16 deletions


@@ -451,20 +451,7 @@ def _load(f, map_location, pickle_module):
             storage_views = pickle_module.load(f)
             for target_cdata, root_cdata, offset, size in storage_views:
                 root = deserialized_objects[root_cdata]
-                if offset != 0 or size != root.size():
-                    warnings.warn("Detected storage view in legacy serialized data: "
-                                  "storage views are no longer natively supported, so we are making "
-                                  "a copy of the data instead. THIS IS A SEMANTIC CHANGE! "
-                                  "If you need aliasing, reserialize your model using "
-                                  "tensors that share storage.")
-                    tensor = torch._utils._rebuild_tensor(root, offset, (size,), (1,))
-                    obj = tensor.clone().storage()
-                else:
-                    # NB: This line does not appear to be exercised by the
-                    # test suite.
-                    obj = root
-                deserialized_objects[target_cdata] = obj
+                deserialized_objects[target_cdata] = root[offset:offset + size]
             tar.extract('tensors', path=tmpdir)
             with open(os.path.join(tmpdir, 'tensors'), 'rb', 0) as f:
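
The new slicing path can be re-enacted in miniature, with a memoryview standing in for a real storage (the variable names mirror the diff, but the data and cdata keys are made up):

```python
# Pretend the pickle stream gave us one root storage and one recorded
# view: (target_cdata, root_cdata, offset, size).
deserialized_objects = {"root": memoryview(bytearray(range(10)))}
storage_views = [("child", "root", 2, 4)]

for target_cdata, root_cdata, offset, size in storage_views:
    root = deserialized_objects[root_cdata]
    # Slicing yields a view sharing the root's memory, restoring the old
    # storage-slice semantics instead of cloning the data.
    deserialized_objects[target_cdata] = root[offset:offset + size]

child = deserialized_objects["child"]
child[0] = 99
assert deserialized_objects["root"][2] == 99  # aliasing preserved
```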