suo
fe8cc619b8
[torch][c10d] fix split_group in mixed backend case (#162424)
Today we can initialize a mixed-backend process group (e.g. "cpu:gloo,cuda:nccl") but we can only pass one set of process group options.
However, when we call `split_group`, we retrieve that set of options from the parent PG and pass it to the `ProcessGroup::groupSplit` C++ API, which then attempts to propagate that set of options to all backends.
This triggers an assertion failure in some user code, where `ProcessGroupGloo::split` expects Gloo options but receives NCCL options instead.
Arguably the APIs as currently designed are just broken; we should never expect a single set of backend options to apply across multiple backends. However, fixing this would require changing quite a few public APIs.
As a quick fix, since user-provided options really only exist for NCCL, just warn and fall back to default options for Gloo if non-Gloo options are detected.
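The warn-and-fall-back behavior described above can be sketched in plain Python. This is only an illustration of the idea, not the code from the PR: `GlooOptions`, `NcclOptions`, and `resolve_gloo_split_options` are hypothetical stand-ins, and the actual fix lives in the c10d C++/Python sources.

```python
import warnings
from dataclasses import dataclass


@dataclass
class GlooOptions:
    """Hypothetical stand-in for Gloo backend options."""
    timeout_s: float = 1800.0


@dataclass
class NcclOptions:
    """Hypothetical stand-in for NCCL backend options."""
    timeout_s: float = 600.0
    is_high_priority_stream: bool = False


def resolve_gloo_split_options(opts):
    """Return options that are safe to hand to a Gloo split.

    Gloo options pass through unchanged. Options of any other type
    produce a warning and are replaced by defaults instead of
    tripping an assertion. Missing options silently become defaults.
    """
    if isinstance(opts, GlooOptions):
        return opts
    if opts is not None:
        warnings.warn(
            f"Ignoring non-Gloo options of type {type(opts).__name__}; "
            "using default Gloo options for the split.",
            stacklevel=2,
        )
    return GlooOptions()
```

With this shape, a parent group created with NCCL options can still be split on its Gloo side: the NCCL options are dropped with a warning, and the Gloo split proceeds with defaults.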
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162424
Approved by: https://github.com/d4l3k, https://github.com/fduwjj, https://github.com/H-Huang
2025-09-11 16:29:32 +00:00