Commit Graph

5 Commits

Author SHA1 Message Date
94dc3253a0 [BE][Easy] enable UFMT for torch/distributed/ (#128870)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128870
Approved by: https://github.com/fegin, https://github.com/wconstab
2024-06-22 18:53:28 +00:00
9c929f6ce9 Revert "[BE][Easy] enable UFMT for torch/distributed/ (#128870)"
This reverts commit a0e1e20c4157bb3e537fc784a51d7aef1e754157.

Reverted https://github.com/pytorch/pytorch/pull/128870 on behalf of https://github.com/fbgheith due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/128870#issuecomment-2181780356))
2024-06-21 00:38:28 +00:00
a0e1e20c41 [BE][Easy] enable UFMT for torch/distributed/ (#128870)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128870
Approved by: https://github.com/fegin
ghstack dependencies: #128868, #128869
2024-06-18 21:49:08 +00:00
0cf572ff6c [C10D][BE] Add exception handlers to c10d collectives function (#87643) (#87988)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87643

1. Add a decorator function exception_handlers to c10d collectives.
2. Update test (torch/distributed/distributed_c10d.py) to include mp tests for exception_handler.
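
As a rough illustration of item 1, a decorator that logs an exception and re-raises it might look like the sketch below. This is not the code from the commit: the logger name `c10d_sketch`, the decorator name `exception_logger`, and the stand-in collective `fake_broadcast` are all hypothetical, chosen only to show the pattern.

```python
import functools
import logging

# Hypothetical logger name; the logger used by the actual commit is not shown here.
_sketch_logger = logging.getLogger("c10d_sketch")


def exception_logger(func):
    """Log any exception raised by func, then re-raise it unchanged."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as error:
            # Record which call failed and why before propagating the error.
            _sketch_logger.error("%s failed: %s", func.__name__, error)
            raise

    return wrapper


@exception_logger
def fake_broadcast(tensor, src):
    """Stand-in for a collective (assumption): rejects a negative source rank."""
    if src < 0:
        raise ValueError("invalid source rank")
    return tensor
```

Re-raising with a bare `raise` keeps the original traceback, so callers see the same error they would have seen without the decorator.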

```
python3 test/distributed/test_c10d_error_logger.py
```

Test Plan: Test in OSS.

Reviewed By: H-Huang

Differential Revision: D40281632

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87988
Approved by: https://github.com/H-Huang
2022-10-29 04:38:34 +00:00
7bd04fb09f [1/N][C10D] Add a customized ScubaLogHandler implementation for internal FB use (#86699) (#87123)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86699

This diff does the following:
1. **c10d_error_logger.py**: Add an API to create a logger with a specific logging handler based on the destination.
2. The API from above would get a logging handler based on the destination provided.
- **caffe2/torch/distributed/logging_handlers.py**: For OSS, we simply use a NullHandler() for now.
3. Add associated test files for 1 and 2.
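
The destination-based lookup described in items 1 and 2 could be sketched roughly as follows. This is a minimal sketch under assumptions, not the code from this diff: the destination key `"default"`, the registry `_handlers_by_destination`, and both function signatures are hypothetical. Only the use of `logging.NullHandler()` for OSS comes from the description above.

```python
import logging

# Hypothetical destination key and registry; the real names are not shown here.
_DEFAULT_DESTINATION = "default"
_handlers_by_destination = {
    # Per the description, OSS uses a NullHandler, which discards all records.
    _DEFAULT_DESTINATION: logging.NullHandler(),
}


def get_logging_handler(destination=_DEFAULT_DESTINATION):
    """Return the handler registered for a destination (KeyError if unknown)."""
    return _handlers_by_destination[destination]


def get_or_create_logger(name, destination=_DEFAULT_DESTINATION):
    """Return the named logger with the destination's handler attached once."""
    logger = logging.getLogger(name)
    handler = get_logging_handler(destination)
    if handler not in logger.handlers:  # avoid duplicate handlers on repeat calls
        logger.addHandler(handler)
    logger.propagate = False  # keep records out of the root logger's handlers
    return logger
```

Keeping the handler choice behind a lookup means an internal build could register a different handler for the same destination without changing the callers.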

Test Plan:
## Unit Test
```
buck test @//mode/dev-nosan //caffe2/test/distributed:test_c10d_error_logger -- --print-passing-details
```
```
File changed: fbcode//caffe2/test/distributed/test_c10d_error_logger.py
File changed: fbsource//xplat/caffe2/test/distributed/TARGETS
9 additional file changes
waiting for all tests to finish...
✓ Listing success: caffe2/test/distributed:test_c10d_error_logger (0.2s)
Found 1 tests
✓ Pass: caffe2/test/distributed:test_c10d_error_logger - test_get_or_create_logger (caffe2.test.distributed.test_c10d_error_logger.C10dErrorLoggerTest) (0.2s)

stdout:

stderr:

Buck UI:      https://www.internalfb.com/buck2/b975f6b0-77e9-4287-8722-f95b48036181
Test Session: https://www.internalfb.com/intern/testinfra/testrun/1407375150206593
RE: reSessionID-4d7ab8ca-1051-48e9-a5a8-6edbe15d1fe4  Up: 124 B  Down: 0 B
Jobs completed: 5. Time elapsed: 3.5s.
Tests finished: Pass 1. Fail 0. Fatal 0. Skip 0. 0 builds failed
```

Differential Revision: D39920391

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87123
Approved by: https://github.com/fduwjj, https://github.com/H-Huang
2022-10-21 18:45:38 +00:00