fmt 10.1.0 fixes a bug in the initialisation order of format_string_checker, which matters for our improved clang-tidy checks (#103058). This PR upgrades third_party fmt to 10.1.0. kineto is upgraded at the same time to avoid fmt errors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106672
Approved by: https://github.com/Skylion007
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73373
This PR allows for newly joined ranks in Dynamic RPC to communicate with ranks that have already joined the group. That is, rank N will be able to run RPC against all ranks <= N.
Previously:
Process 1 (init):
```python
init_rpc("worker0", rank=0)
```
Process 2 (RPC against a rank that has already joined; this would fail):
```python
init_rpc("worker1", rank=1)
rpc.rpc_sync("worker0", torch.add, args=(torch.tensor(1), torch.tensor(1)))
```
Now:
The above scenario succeeds.
Test:
`pytest test/distributed/rpc/test_tensorpipe_agent.py -vsk test_init_rpc_without_world_size`
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D35052544
Pulled By: H-Huang
fbshipit-source-id: dba48b216731c27730e7d46aefd9e7191c792170
(cherry picked from commit f3c42d8482c933fd746d4da8e64fa40cdf92a221)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65946
Add a new function in agent_utils that synchronizes active call counts through the store. It is intended to replace the barrier and all_reduce that RPC shutdown currently performs through the process group.
`test_ddp_comparison` and `test_ddp_comparison_uneven_inputs` fail with these changes. It seems the RPC agents are not accessing the same store, so the total count of processes never reaches the world size and the barrier never exits. We still need to investigate why this happens only for these test cases. Setting clean_shutdown to false skips this code path, which allows the tests to pass.
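For illustration, the store-based synchronization described above can be sketched roughly as follows. This is a minimal sketch, not the code added to agent_utils: `InMemoryStore`, `sync_active_calls`, `run_demo`, and the key names are all hypothetical, and threads stand in for RPC processes sharing one store.

```python
import threading


class InMemoryStore:
    """Hypothetical stand-in for a shared key-value store (not the real c10d store)."""

    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def add(self, key, amount):
        # Atomically add `amount` to the counter at `key` and wake any waiters.
        with self._cond:
            self._data[key] = self._data.get(key, 0) + amount
            self._cond.notify_all()
            return self._data[key]

    def get(self, key):
        with self._cond:
            return self._data.get(key, 0)

    def wait_for_value(self, key, target, timeout=5.0):
        # Block until the counter at `key` reaches `target`, or time out.
        with self._cond:
            reached = self._cond.wait_for(
                lambda: self._data.get(key, 0) >= target, timeout=timeout
            )
        if not reached:
            raise TimeoutError(f"{key} never reached {target}")


def sync_active_calls(store, world_size, active_calls, timeout=5.0):
    # Publish this process's count first, then mark the process as arrived.
    store.add("active_calls_total", active_calls)
    store.add("processes_arrived", 1)
    # Once every process has arrived, every count has already been added.
    store.wait_for_value("processes_arrived", world_size, timeout)
    return store.get("active_calls_total")


def run_demo(counts):
    # Simulate one thread per rank, all sharing a single store.
    store = InMemoryStore()
    results = [None] * len(counts)

    def worker(rank):
        results[rank] = sync_active_calls(store, len(counts), counts[rank])

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(len(counts))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this sketch, `run_demo([2, 0, 1])` returns `[3, 3, 3]`: every participant sees the same total. If a participant writes to a different store, the arrival counter never reaches `world_size` and the wait times out, which resembles the symptom seen in the two failing tests.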
cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D31762736
Pulled By: H-Huang
fbshipit-source-id: cb5d0efe196f72726c63393c4293e97ec4f18548