pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Aaron Orenstein	3a0d088517	Flip default value for mypy disallow_untyped_defs [5/11] (#127842) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127842 Approved by: https://github.com/oulgen	2024-06-08 18:49:18 +00:00
Kevin Yin	534c34b320	Fix copy-pasted docs, reversing the load and save description (#125993) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125993 Approved by: https://github.com/kwen2501, https://github.com/fegin	2024-05-14 21:14:16 +00:00
Lucas Pasqualin	bb6ba31250	[DCP] Adds storage metadata, and passes it during the save path (#124772) This PR seeks to increase observability of save/load requests. This is accomplished with two main changes: 1. The creation of save_id and load_id: - a `save_id` and a `load_id` are added to the filesystem writer. `save_id` is re-generated on every save call, and `load_id` is re-generated on every load call. - both IDs are stored in a new `StorageMeta` class, and saved as part of Metadata (`load_id` is None when we save, and only set during load). 2. A new mechanism is implemented in the save path which gives the SavePlanner a chance to inspect the `storage_meta` object. The mechanism mirrors the same metadata exchange in the load path. In the load path, `storage_meta` is added to `metadata` such that the LoadPlanner can also access `storage_meta` before we begin loading. If users now wish to access the checkpoint_id in the SavePlanner, they simply need to access the value in `storage_meta` from the `set_up_planner` call. Additionally, users now have a generic way of passing data to the SavePlanner from the StorageWriter at the start of the save path, similar to the load path. This PR has been tested for backwards compatibility, meaning any checkpoints saved before this PR can continue being loaded after this PR. One major consideration is that there is limited forwards compatibility. If a checkpoint is generated _past_ this PR, there is no support for loading it using older torch versions. This brings up a fairly important point: since we expect the metadata object (which is saved to the disk) to continue evolving, and we want to support forwards compatibility, we explore patching `pickle` so we can at least add new members to `metadata` and maintain fwd compat. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124772 Approved by: https://github.com/fegin	2024-05-07 23:53:53 +00:00
Lucas Pasqualin	18c9d46068	Fixes format utils executable (#123407) Fixes an issue with the format utils executable, which was causing it to run as a no-op. :( Pull Request resolved: https://github.com/pytorch/pytorch/pull/123407 Approved by: https://github.com/wz337, https://github.com/fegin	2024-04-05 03:53:22 +00:00
Lucas Pasqualin	bcb6e5aa72	[DCP] Support partial load (#122829) Adds the ability to load a subset of keys directly from a checkpoint, avoiding the need to initialize the state dict first. Differential Revision: [D55441391](https://our.internmc.facebook.com/intern/diff/D55441391/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122829 Approved by: https://github.com/fegin	2024-04-02 19:22:22 +00:00
Lucas Pasqualin	909d73d8cb	[DCP] Removes `no_dist` and `coordinator_rank` from public DCP APIs (#121317) Differential Revision: [D54591181](https://our.internmc.facebook.com/intern/diff/D54591181/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121317 Approved by: https://github.com/fegin	2024-03-08 02:14:12 +00:00
Lucas Pasqualin	eb1145436a	[DCP] Adds main in format utils (#120128) Adds main in format utils. Usage: `python -m torch.distributed.checkpoint.format_utils dcp_to_torch dcp_dir torch_file.pt` or `python -m torch.distributed.checkpoint.format_utils torch_to_dcp torch_file.pt dcp_dir` Differential Revision: [D53791355](https://our.internmc.facebook.com/intern/diff/D53791355/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120128 Approved by: https://github.com/fegin, https://github.com/wz337	2024-03-07 01:18:17 +00:00
Lucas Pasqualin	9d5dea7812	[DCP] Adds storage reader and planner classes for online loading/sharding of models in torch.save format (#119816) as title Differential Revision: [D53718041](https://our.internmc.facebook.com/intern/diff/D53718041/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119816 Approved by: https://github.com/fegin	2024-03-01 00:21:05 +00:00
Lucas Pasqualin	1c1028ac49	[DCP] Adds utility for converting torch save to dcp (#119815) as title Differential Revision: [D53718040](https://our.internmc.facebook.com/intern/diff/D53718040/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119815 Approved by: https://github.com/fegin ghstack dependencies: #119813, #119814	2024-02-22 17:22:11 +00:00
Lucas Pasqualin	1ab441a7dd	[DCP] Adds utility for converting dcp to torch save format (#119814) as title Differential Revision: [D53718042](https://our.internmc.facebook.com/intern/diff/D53718042/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119814 Approved by: https://github.com/fegin ghstack dependencies: #119813	2024-02-22 16:55:58 +00:00

10 Commits
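
Note on bb6ba31250 (#124772): the save-path metadata exchange described in that commit message can be pictured with the rough sketch below. It is not the PyTorch implementation. `StorageMetaSketch`, `WriterSketch`, `PlannerSketch`, and `save` are hypothetical stand-ins written with the standard library only, and the use of `uuid4` for ID generation is an assumption. It only illustrates the behavior the message states: `save_id` is regenerated on every save call, `load_id` stays None while saving, and the planner sees `storage_meta` in its `set_up_planner` call.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StorageMetaSketch:
    """Hypothetical stand-in for the StorageMeta class named in #124772."""

    checkpoint_id: Optional[str] = None
    save_id: Optional[str] = None
    load_id: Optional[str] = None  # None while saving; only set during load


class WriterSketch:
    """Hypothetical storage writer that regenerates save_id per save call."""

    def __init__(self, checkpoint_id: str) -> None:
        self.checkpoint_id = checkpoint_id
        self.save_id: Optional[str] = None

    def reset(self) -> None:
        # Assumption: a random UUID is used as the per-save identifier.
        self.save_id = str(uuid.uuid4())

    def storage_meta(self) -> StorageMetaSketch:
        return StorageMetaSketch(
            checkpoint_id=self.checkpoint_id, save_id=self.save_id
        )


class PlannerSketch:
    """Hypothetical save planner that inspects storage_meta during setup."""

    def __init__(self) -> None:
        self.state_dict: Dict[str, Any] = {}
        self.seen_checkpoint_id: Optional[str] = None

    def set_up_planner(
        self, state_dict: Dict[str, Any], storage_meta: StorageMetaSketch
    ) -> None:
        self.state_dict = state_dict
        # The planner can read the checkpoint_id before planning starts.
        self.seen_checkpoint_id = storage_meta.checkpoint_id


def save(
    state_dict: Dict[str, Any], writer: WriterSketch, planner: PlannerSketch
) -> StorageMetaSketch:
    """Sketch of the start of a save: new save_id, then hand meta to planner."""
    writer.reset()
    meta = writer.storage_meta()
    planner.set_up_planner(state_dict, meta)
    return meta


if __name__ == "__main__":
    writer = WriterSketch("ckpt/step_100")  # illustrative checkpoint id
    planner = PlannerSketch()
    first = save({"w": 1}, writer, planner)
    second = save({"w": 2}, writer, planner)
    print(first.save_id != second.save_id)  # a fresh save_id per save call
    print(planner.seen_checkpoint_id)
```

The writer, not the planner, owns the IDs in this sketch, which matches the direction of data flow the commit message describes (StorageWriter to SavePlanner at the start of the save path).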