FSDP originally uses `init_from_local_shards()` to create a ShardedTensor for `sharded_state_dict()`, and we have seen non-trivial overhead from it when the number of tensors is large. Using `_init_from_local_shards_and_global_metadata()` instead significantly reduces the overhead, since every rank can build the global metadata locally rather than exchanging and validating it across ranks. For a model with ~250 tensors in the state_dict trained with 16 GPUs, the original `sharded_state_dict()` takes ~1.7 seconds and this PR reduces the overhead to ~0.6 seconds.
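For context, a minimal sketch of the two construction paths is below. It is not the FSDP code from this PR; the import paths, the metadata helpers, and especially the signature of the private `_init_from_local_shards_and_global_metadata()` classmethod are assumptions based on the `torch.distributed._shard` APIs around the time of this PR.

```python
# Illustrative sketch only (not the FSDP implementation); assumes the
# torch.distributed._shard APIs circa this PR, including a private constructor.
import torch
from torch.distributed._shard.sharded_tensor import Shard, ShardedTensor
from torch.distributed._shard.sharded_tensor.metadata import (
    ShardedTensorMetadata,
    TensorProperties,
)
from torch.distributed._shard.sharding_spec import ShardMetadata


def build_sharded_tensor(local_shard: torch.Tensor, rank: int, world_size: int) -> ShardedTensor:
    """Wrap this rank's 1-D shard into a ShardedTensor of size world_size * shard_len."""
    shard_len = local_shard.numel()
    my_md = ShardMetadata(
        shard_offsets=[rank * shard_len],
        shard_sizes=[shard_len],
        placement=f"rank:{rank}/{local_shard.device}",
    )
    local_shards = [Shard(local_shard, my_md)]

    # Original path: the public torch.distributed._shard.sharded_tensor.init_from_local_shards()
    # gathers and validates shard metadata across ranks, which is where the
    # per-tensor overhead comes from.

    # Path used by this PR: every rank builds the global metadata locally, so no
    # communication is needed per tensor (private API; signature is an assumption).
    shards_md = [
        ShardMetadata(
            shard_offsets=[r * shard_len],
            shard_sizes=[shard_len],
            placement=f"rank:{r}/{local_shard.device}",
        )
        for r in range(world_size)
    ]
    global_md = ShardedTensorMetadata(
        shards_metadata=shards_md,
        size=torch.Size([world_size * shard_len]),
        tensor_properties=TensorProperties(dtype=local_shard.dtype),
    )
    return ShardedTensor._init_from_local_shards_and_global_metadata(
        local_shards, sharded_tensor_metadata=global_md
    )
```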
Differential Revision: [D38452170](https://our.internmc.facebook.com/intern/diff/D38452170/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82911
Approved by: https://github.com/awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77356
Implement ShardedTensor compatible sharded_state_dict() and load_sharded_state_dict().
Algorithm overview:
`sharded_state_dict()`:
1. Call `summon_full_params()`.
2. For each unflattened, non-sharded parameter:
   2.1 Call `chunk()` to get the local shard of the parameter.
   2.2 Create a ShardedTensor from the local shard (see the sketch after this list).
3. Replace the tensor in the state_dict with the newly created ShardedTensor.
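The per-parameter work in steps 2-3 can be sketched as below. This is a simplified illustration, not the code in this PR: it assumes a 1-D chunking along dim 0 that divides evenly across ranks, and the import paths and the helper name `replace_with_sharded_tensor` are assumptions.

```python
# Simplified sketch of steps 2.1-3 for one parameter (not the actual FSDP code).
# Assumes full_param.size(0) is evenly divisible by world_size.
import torch
from torch.distributed._shard.sharded_tensor import Shard, init_from_local_shards
from torch.distributed._shard.sharding_spec import ShardMetadata


def replace_with_sharded_tensor(state_dict, name, full_param, rank, world_size):
    # 2.1 chunk() the full (unflattened, non-sharded) parameter and keep this rank's piece.
    local_chunk = full_param.chunk(world_size, dim=0)[rank].clone()

    # 2.2 Describe where the local chunk lives in the global tensor and wrap it
    #     into a ShardedTensor (the public constructor exchanges metadata across ranks).
    shard_md = ShardMetadata(
        shard_offsets=[rank * local_chunk.size(0)] + [0] * (full_param.dim() - 1),
        shard_sizes=list(local_chunk.size()),
        placement=f"rank:{rank}/{local_chunk.device}",
    )
    sharded = init_from_local_shards([Shard(local_chunk, shard_md)], *full_param.size())

    # 3. Replace the full tensor in the state_dict with the ShardedTensor.
    state_dict[name] = sharded
```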
`load_sharded_state_dict()`:
1. For each unflattened, sharded parameter (ShardedTensor) in the given state_dict:
   1.1 Pop it out of the state_dict.
   1.2 All-gather its local shards to reconstruct the unflattened, non-sharded parameter (see the sketch after this list).
2. Create a FlatParameter from the unflattened, non-sharded parameters.
3. Shard the newly created FlatParameter.
4. Insert the new FlatParameter into the state_dict.
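Step 1 can be sketched roughly as below for a single entry. This is an illustration, not this PR's implementation: it assumes every rank holds exactly one equally sized shard along dim 0, and the helper name `_unshard_entry` is hypothetical. Steps 2-4 (rebuilding and re-sharding the FlatParameter) are FSDP-internal and not shown.

```python
# Rough sketch of step 1 for a single state_dict entry (not the actual FSDP code).
# Assumes each rank holds exactly one equally sized shard along dim 0.
import torch
import torch.distributed as dist
from torch.distributed._shard.sharded_tensor import ShardedTensor


def _unshard_entry(state_dict, key, world_size):  # hypothetical helper name
    # 1.1 Pop the ShardedTensor out of the given state_dict.
    sharded: ShardedTensor = state_dict.pop(key)
    local_shard = sharded.local_shards()[0].tensor.contiguous()

    # 1.2 All-gather the per-rank shards and concatenate them back into the
    #     unflattened, non-sharded parameter.
    gathered = [torch.empty_like(local_shard) for _ in range(world_size)]
    dist.all_gather(gathered, local_shard)
    return torch.cat(gathered, dim=0)
```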
Differential Revision: [D36284983](https://our.internmc.facebook.com/intern/diff/D36284983/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook-specific changes or comments; please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36284983/)!
Approved by: https://github.com/zhaojuanmao