Commit Graph

13 Commits

Author SHA1 Message Date
995df34b19 [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144547
Approved by: https://github.com/kwen2501
2025-02-28 07:35:56 +00:00
00ffeca1b1 PEP585 update - torch/distributed (#145164)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145164
Approved by: https://github.com/bobrenjc93
2025-01-21 04:23:29 +00:00
6374332d33 Revert "PEP585 update - torch/distributed (#145164)"
This reverts commit 6cb186e279bc179a6bb63f0226e24ab42a07b394.

Reverted https://github.com/pytorch/pytorch/pull/145164 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing an inductor test ([comment](https://github.com/pytorch/pytorch/pull/145164#issuecomment-2602875679))
2025-01-20 16:46:46 +00:00
6cb186e279 PEP585 update - torch/distributed (#145164)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145164
Approved by: https://github.com/bobrenjc93
2025-01-20 00:19:01 +00:00
3b798df853 [BE][Easy] enable UFMT for torch/distributed/{fsdp,optim,rpc}/ (#128869)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128869
Approved by: https://github.com/fegin
ghstack dependencies: #128868
2024-06-18 21:49:08 +00:00
de370eb313 [Distributed] Small nits to apply_optimizer_in_backward (#110903)
Clarify a few things around the documentation

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110903
Approved by: https://github.com/janeyx99
2023-10-11 07:45:45 +00:00
24e5d61af8 Log usage of optimizer in backward (#110206)
This will allow us to inspect and aggregate jobs that use optimizer in
backward

Differential Revision: [D48674740](https://our.internmc.facebook.com/intern/diff/D48674740/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110206
Approved by: https://github.com/awgu
2023-09-29 11:00:07 +00:00
5d4e170d58 [Optim in backward] API to retrieve in-backward optimizers (#105991)
API to retrieve in backward optimizer for checkpointing purposes

Differential Revision: [D47782225](https://our.internmc.facebook.com/intern/diff/D47782225/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105991
Approved by: https://github.com/awgu
2023-07-29 01:36:25 +00:00
8e6287264d [Optim in backward] register_hook=False API (#95096)
Use this API to avoid registering hooks for applications that do their
own custom logic. This eliminates the need for DDP to have to de-register these
hooks.

Differential Revision: [D43383794](https://our.internmc.facebook.com/intern/diff/D43383794/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D43383794/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95096
Approved by: https://github.com/zhaojuanmao
2023-03-15 14:33:13 +00:00
745fe35df5 [follow-up] Python Attr Serialization (#88913)
Ref: https://github.com/pytorch/pytorch/pull/81616#issuecomment-1307595402
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88913
Approved by: https://github.com/albanD
2023-01-13 17:38:51 +00:00
e8bf7c21e4 Integrate apply_optim_in_backward with DDP (#89194)
Allow _apply_optim_in_backward to work with DDP.

Example:

```
dist.init_process_group("nccl", rank=rank, world_size=2)
    torch.cuda.set_device(rank)
    e = enc().cuda(rank)
    _apply_optimizer_in_backward(
        optimizer_class=torch.optim.SGD,
        params=e.parameters(),
        optimizer_kwargs={"lr": 0.03},
    )
    e = DDP(e, device_ids=[rank])
    inp = torch.randn(1, 10, device=rank)
    e(inp).sum().backward()
```

Constraints:

1. Custom communication hook not yet supported
2. _apply_optim_in_backward needs to be called _before_ wrapping model in DDP.
3. DDP will remove the gradient hooks _apply_optim_in_backward registers, so these gradient hooks will not be fired and cannot be used.
4. All DDP managed parameters have grads set to None by default once optimizer is applied. There is no support for setting only some parameter grads to None, this must be done manually by user (and DDP_OVERLAPPED_OPTIM_SET_GRADS_TO_NONE=0 needs to be set.)

Differential Revision: [D41329694](https://our.internmc.facebook.com/intern/diff/D41329694/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D41329694/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89194
Approved by: https://github.com/zhaojuanmao
2022-12-21 07:35:19 +00:00
1a48ae96ba [PT-D][Easy] Reformat the optim code within PTD code base (#90399)
Just run two commands:
```
ufmt format torch/distributed/optim/
ufmt format test/distributed/optim/
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90399
Approved by: https://github.com/awgu
2022-12-08 06:38:59 +00:00
404f254e20 Upstream apply_optim_in_backward from TorchRec (#87397) (#88539)
Summary:

Upstreaming this as part of sharing common APIs. This is just a plain
move, any changes needed to support DDP / FSDP will come in follow up diffs.

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D40564646

fbshipit-source-id: 619c434e02196812f8d4db1e40d07290e08b18f9
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88539
Approved by: https://github.com/awgu
2022-11-05 18:28:07 +00:00