pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-22 06:11:27 +08:00

Author	SHA1	Message	Date
Wanchao Liang	61461f39d1	[dtensor] handle negative dim and fix TP regression (#111750) TP style still has some regressions due to negative dim specifications; fix this by allowing the DTensor API to handle negative dims and normalize them. For example, TP uses `Shard(-1)` and then tries to redistribute `Shard(1) -> Shard(-1)`. This should ideally be a no-op, but currently it runs a decompose-sharding phase that turns the transformation into `Shard(1) -> Replicate -> Shard(-1)`, which is wrong and triggers unnecessary allgathers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111750 Approved by: https://github.com/rohan-varma	2023-10-22 04:25:45 +00:00
wz337	6dc56d3490	[DTensor] Remove compute_local_offset from _utils.py (#109096) Separates internal changes from OSS changes. This PR removes compute_local_offset from the OSS directory only. This replaces https://github.com/pytorch/pytorch/pull/108965 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109096 Approved by: https://github.com/wanchaol, https://github.com/fduwjj	2023-09-12 21:55:15 +00:00
wz337	13e4cce83c	[DTensor] Add util API to compute_local_shape_and_global_offset for checkpointing purpose (#107996) The compute_local_shape_and_global_offset API does the following: 1) Calculates both local_shape and global_offset in one API to replace two API calls (compute_local_size and compute_local_shape). 2) Generates the correct global_offset for checkpointing purposes. We are currently using compute_local_offset for downstream checkpoint components, which could lead to incorrect results. For checkpointing, we need global_offset instead of local_offset. In some cases, global_offset does not equal local_offset, namely when a dimension is sharded multiple times on different mesh dimensions (e.g. placements = [Shard(0), Shard(0)]). Follow-up PRs: 1) Update related downstream components to use compute_local_shape_and_global_offset instead of compute_local_size and compute_local_offset. 2) Audit the existing code base to see if we can remove compute_local_size and compute_local_offset, since they are currently being used. cc. @wanchaol Pull Request resolved: https://github.com/pytorch/pytorch/pull/107996 Approved by: https://github.com/wanchaol	2023-08-30 02:46:50 +00:00
fduwjj	92923aca61	[TP] Use Stride inferred from local tensor in to_local bwd (#102630) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102630 Approved by: https://github.com/wanchaol	2023-06-01 04:30:24 +00:00
Wanchao Liang	3ae612ba7f	[dtensor] remove assertions about submesh checks (#101229) This PR removes the assertions from the submesh checks and directly returns the local tensor, so that all the other APIs can work with a submesh. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101229 Approved by: https://github.com/fduwjj	2023-05-12 04:20:35 +00:00
Shen Li	02179827cb	[Easy] Include SPMD and DTensor files in UFMT checks (#98148) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98148 Approved by: https://github.com/fegin	2023-04-02 15:34:49 +00:00
Wanchao Liang	789fc4c292	[dtensor] refactor shape/offset calculation (#95923) Shape and offset calculations are commonly used, so extract them into a separate util. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95923 Approved by: https://github.com/fduwjj	2023-03-05 06:33:32 +00:00
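
The negative-dim fix described in commit 61461f39d1 can be illustrated with a small, self-contained sketch. It is not the DTensor implementation: the `Shard` class here is a hypothetical stand-in, and `normalize_dim`, `normalize_placement`, and `needs_redistribute` are names invented for illustration. The idea it shows is only that `Shard(-1)` and `Shard(1)` on a 2-D tensor refer to the same dimension once normalized, so a redistribute between them should be a no-op.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """Hypothetical stand-in for a shard placement (not the real DTensor class)."""
    dim: int


def normalize_dim(dim: int, ndim: int) -> int:
    """Map a possibly negative dim onto the range [0, ndim)."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} is out of range for a {ndim}-D tensor")
    return dim + ndim if dim < 0 else dim


def normalize_placement(placement: Shard, ndim: int) -> Shard:
    """Return an equivalent placement whose dim is non-negative."""
    return Shard(normalize_dim(placement.dim, ndim))


def needs_redistribute(src: Shard, dst: Shard, ndim: int) -> bool:
    """Compare placements only after normalizing, so Shard(1) == Shard(-1) in 2-D."""
    return normalize_placement(src, ndim) != normalize_placement(dst, ndim)


if __name__ == "__main__":
    # On a 2-D tensor, Shard(1) -> Shard(-1) is the same sharding: no work needed.
    print(needs_redistribute(Shard(1), Shard(-1), ndim=2))
    # Shard(0) -> Shard(-1) really does change the sharded dimension.
    print(needs_redistribute(Shard(0), Shard(-1), ndim=2))
```

Normalizing before comparing is what avoids the unnecessary `Shard(1) -> Replicate -> Shard(-1)` detour the commit message mentions.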

7 Commits
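
Commit 13e4cce83c notes that the global offset can differ from the local offset when one tensor dimension is sharded more than once, e.g. placements = [Shard(0), Shard(0)]. The sketch below shows why, under loud assumptions: it is plain Python rather than the PyTorch utility, it only handles evenly divisible sizes, and the function name `local_shape_and_global_offset` and its `shard_dims` argument (the tensor dim sharded on each mesh dim, or `None` for replicated) are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def local_shape_and_global_offset(
    global_shape: Sequence[int],
    mesh_shape: Sequence[int],
    coord: Sequence[int],
    shard_dims: Sequence[Optional[int]],
) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Illustrative only: assumes every sharded size divides evenly."""
    local_shape = list(global_shape)
    global_offset = [0] * len(global_shape)
    # Walk the mesh dims in order; each sharding splits what the previous left.
    for mesh_size, idx, dim in zip(mesh_shape, coord, shard_dims):
        if dim is None:  # replicated on this mesh dim: nothing to split
            continue
        if local_shape[dim] % mesh_size != 0:
            raise ValueError("this sketch only supports even sharding")
        chunk = local_shape[dim] // mesh_size
        # Offsets accumulate across mesh dims that shard the same tensor dim.
        global_offset[dim] += idx * chunk
        local_shape[dim] = chunk
    return tuple(local_shape), tuple(global_offset)


if __name__ == "__main__":
    # An 8x4 tensor on a 2x2 mesh with dim 0 sharded on both mesh dims.
    shape, offset = local_shape_and_global_offset(
        global_shape=(8, 4), mesh_shape=(2, 2), coord=(1, 1), shard_dims=(0, 0)
    )
    print(shape, offset)
```

For the rank at mesh coordinate (1, 1), the offset within the last split alone is 2 rows, but its position in the full tensor is 4 + 2 = 6 rows, which is the value a checkpoint would need to place the shard correctly.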