27 Commits

Author SHA1 Message Date
d4a713cd9c Change forkserver test to only run below 3.13.8 (#165667)
A multiprocessing bug is fixed in 3.13.8, see [https://docs.python.org/3.13/whatsnew/changelog.html](https://docs.python.org/3.13/whatsnew/changelog.html), [gh-126631](https://github.com/python/cpython/issues/126631)

So this test will fail when we update to Python 3.13.8.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165667
Approved by: https://github.com/malfet
2025-10-16 19:34:10 +00:00
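A minimal sketch of the kind of version gate that d4a713cd9c describes, assuming the check compares `sys.version_info` against 3.13.8. The helper `runs_before_fix` and the test class are hypothetical names used for illustration; they are not taken from the PyTorch test file.

```python
import sys
import unittest

# Assumption: 3.13.8 is the first release that contains the multiprocessing fix.
FIXED_IN = (3, 13, 8)


def runs_before_fix(version_info=sys.version_info):
    """Return True when the interpreter predates the 3.13.8 fix."""
    return tuple(version_info[:3]) < FIXED_IN


class ForkserverGateExample(unittest.TestCase):  # hypothetical test class
    @unittest.skipIf(not runs_before_fix(), "multiprocessing bug fixed in 3.13.8")
    def test_forkserver_behavior(self):
        # Placeholder body: a real test would exercise the forkserver start method.
        self.assertTrue(runs_before_fix())
```

Comparing tuples keeps the gate correct for patch releases, which a comparison on `(major, minor)` alone would miss.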
dc194a3096 Test multiprocessing spawn timing fix (#160672)
Submitting PR to fix #160511.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160672
Approved by: https://github.com/mikaylagawarecki
2025-08-15 00:11:55 +00:00
f5e2de928b [BE] fix remaining flake8 v7 warnings (#159044)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159044
Approved by: https://github.com/Skylion007
ghstack dependencies: #159043
2025-07-25 02:56:34 +00:00
cyy
b0dfd242fa Remove NO_MULTIPROCESSING_SPAWN checks (#146705)
Python 3.9 has spawn.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146705
Approved by: https://github.com/colesbury
2025-02-28 05:53:19 +00:00
926b7b5027 Revert "Remove NO_MULTIPROCESSING_SPAWN checks (#146705)"
This reverts commit 40ad5e01dff05c7d64e070fb01683820e678f788.

Reverted https://github.com/pytorch/pytorch/pull/146705 on behalf of https://github.com/cyyever due to Broke lint?, I guess land race with ruff update ([comment](https://github.com/pytorch/pytorch/pull/146705#issuecomment-2689603077))
2025-02-28 03:04:38 +00:00
40ad5e01df Remove NO_MULTIPROCESSING_SPAWN checks (#146705)
Python 3.9 has spawn.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146705
Approved by: https://github.com/colesbury
2025-02-28 00:15:32 +00:00
b77406a9ec [BE][CI] bump ruff to 0.8.4 (#143753)
Changes:

1. Bump `ruff` from 0.7.4 to 0.8.4
2. Change `%`-formatted strings to f-strings
3. Change arguments with the `__`-prefix to positional-only arguments with the `/` separator in function signatures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143753
Approved by: https://github.com/Skylion007
2024-12-24 12:24:10 +00:00
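To illustrate items 2 and 3 of the change list in b77406a9ec, here is a small before/after sketch. The functions `scale_old` and `scale_new` are hypothetical and do not come from the PyTorch code base.

```python
def scale_old(__value, factor=2):
    # Old style: the "__" prefix marks the argument as positional-only by
    # convention only, and the message uses "%" formatting.
    return "%s scaled by %d" % (__value, factor)


def scale_new(value, /, factor=2):
    # New style: "/" makes `value` positional-only at the language level,
    # and an f-string replaces the "%" formatting.
    return f"{value} scaled by {factor}"
```

With the `/` separator, calling `scale_new(value=3)` raises `TypeError`, so the restriction is enforced by the interpreter rather than by a naming convention.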
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
8c3ab21490 multiprocessing.spawn: allow a grace period when shutdown (#131278)
When one process fails, the others are immediately killed. This prevents the other processes from performing necessary cleanup or dumping debug information (in particular, the NCCL flight recorder).

This PR adds a grace period. Default behavior is unchanged.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131278
Approved by: https://github.com/albanD
2024-10-07 12:37:34 +00:00
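The grace period described in 8c3ab21490 can be sketched roughly as follows. This is an illustration under stated assumptions, not the PyTorch implementation: the function name `stop_remaining` and the `grace_period` parameter are hypothetical, and the objects only need the `is_alive()`, `terminate()`, `kill()` and `join(timeout)` methods of `multiprocessing.Process`.

```python
import time


def stop_remaining(processes, grace_period=None):
    """Stop surviving workers after one of them has failed.

    Assumption: with grace_period=None the workers are terminated and joined
    right away; otherwise they get up to `grace_period` seconds to exit after
    terminate() before being killed.
    """
    alive = [p for p in processes if p.is_alive()]
    for p in alive:
        p.terminate()  # SIGTERM on POSIX, so a handler can still run cleanup
    if grace_period is None:
        for p in alive:
            p.join()
        return
    deadline = time.monotonic() + grace_period
    for p in alive:
        # Share one deadline across all workers instead of waiting per worker.
        p.join(max(0.0, deadline - time.monotonic()))
    for p in alive:
        if p.is_alive():
            p.kill()  # SIGKILL on POSIX, which cannot be handled
        p.join()
```

Using a single shared deadline bounds the total shutdown time by `grace_period` regardless of how many workers are still running.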
20b62fed21 Create processes in parallel in mp.start_processes for forkserver (#134629)
Summary:
This fixes the PyTorch issue https://github.com/pytorch/pytorch/issues/133010.
One way to fix this problem is to start processes in parallel in mp.start_processes.
What else is in the diff:
Refactored the api_test test case, which was repeating many tests because of inheritance.
Added a unit test for forkserver when parallel start is on.

Test Plan: Added unit tests

Differential Revision: D61878552

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134629
Approved by: https://github.com/d4l3k
2024-08-28 21:34:32 +00:00
adcce538b7 Revert "Allow mp.start_processes to create processes in parallel (#133707)"
This reverts commit 3546628a2a167ace6060737eeccf8ee8fd87ddc0.

Reverted https://github.com/pytorch/pytorch/pull/133707 on behalf of https://github.com/ZainRizvi due to sorry but trunk has been consistently broken since this PR was merged. See: [GH job link](https://github.com/pytorch/pytorch/actions/runs/10529617600/job/29191757055) [HUD commit link](3546628a2a) ([comment](https://github.com/pytorch/pytorch/pull/133707#issuecomment-2310709523))
2024-08-26 17:31:10 +00:00
3546628a2a Allow mp.start_processes to create processes in parallel (#133707)
Summary:
Background discussion in https://fb.workplace.com/groups/319878845696681/posts/1226087421742481

and pytorch issue filed https://github.com/pytorch/pytorch/issues/133010

One way to fix this problem is to add an option to start processes in parallel on the PyTorch side.

Test Plan: Tested aps run in problem and things are in parallel now (next diff)

Differential Revision: D61301603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133707
Approved by: https://github.com/d4l3k, https://github.com/ezyang
2024-08-23 17:11:20 +00:00
91b848bf81 Revert "markDynamoStrictTest on more tests (#115879)"
This reverts commit 8b650cdd3cdd1174b399f312ec2f7955551a2f5d.

Reverted https://github.com/pytorch/pytorch/pull/115879 on behalf of https://github.com/atalman due to OSSCI oncall, broke inductor ([comment](https://github.com/pytorch/pytorch/pull/115879#issuecomment-1858418921))
2023-12-15 20:00:09 +00:00
8b650cdd3c markDynamoStrictTest on more tests (#115879)
Featuring:
test_mobile_optimizer.py
test_module_init.py
test_modules.py
test_multiprocessing.py
test_multiprocessing_spawn.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115879
Approved by: https://github.com/voznesenskym
ghstack dependencies: #115845, #115855, #115856, #115857, #115858, #115870, #115871
2023-12-15 13:19:52 +00:00
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
634427d65c Make test_multiprocessing_spawn.py compatible with pytest (#50408)
Summary:
This file is currently failing with

```
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 13
  def test_success_func(i):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:13
________________________________________________________________________________________________________________ ERROR at setup of test_success_single_arg_func ________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 17
  def test_success_single_arg_func(i, arg):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:17
_________________________________________________________________________________________________________________ ERROR at setup of test_exception_single_func _________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 22
  def test_exception_single_func(i, arg):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:22
__________________________________________________________________________________________________________________ ERROR at setup of test_exception_all_func ___________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 28
  def test_exception_all_func(i):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:28
_________________________________________________________________________________________________________________ ERROR at setup of test_terminate_signal_func _________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 33
  def test_terminate_signal_func(i):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:33
__________________________________________________________________________________________________________________ ERROR at setup of test_terminate_exit_func __________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 39
  def test_terminate_exit_func(i, arg):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:39
___________________________________________________________________________________________________________ ERROR at setup of test_success_first_then_exception_func ___________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 45
  def test_success_first_then_exception_func(i, arg):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:45
___________________________________________________________________________________________________________________ ERROR at setup of test_nested_child_body ___________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 52
  def test_nested_child_body(i, ready_queue, nested_child_sleep):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:52
_____________________________________________________________________________________________________________________ ERROR at setup of test_infinite_task _____________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 57
  def test_infinite_task(i):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:57
_____________________________________________________________________________________________________________________ ERROR at setup of test_process_exit ______________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 62
  def test_process_exit(idx):
E       fixture 'idx' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

/home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py:62
________________________________________________________________________________________________________________________ ERROR at setup of test_nested _________________________________________________________________________________________________________________________
file /home/gaoxiang/pytorch-tf32/test/test_multiprocessing_spawn.py, line 66
  def test_nested(i, pids_queue, nested_child_sleep, start_method):
E       fixture 'i' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, include_metadata_in_junit_xml, json_metadata, metadata, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.
```
when running with pytest. This is because pytest considers anything starting with `test_` as a test, so I renamed it to `_test_...` to prevent this from happening.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50408

Reviewed By: bdhirsh

Differential Revision: D34118341

Pulled By: VitalyFedyunin

fbshipit-source-id: 7c74843462b79df351e3c60f313ef388a9e0df4e
(cherry picked from commit fd8b66bea0e2c182db0c77cb0c516822559b3cc1)
2022-02-10 16:18:46 +00:00
14d3d29b16 make ProcessException pickleable (#70118)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70116

Happy to add tests if you let me know the best place to put them.

cc VitalyFedyunin

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70118

Reviewed By: malfet

Differential Revision: D33255899

Pulled By: ejguan

fbshipit-source-id: 41d495374182eb28bb8bb421e890eca3bddc077b
2021-12-30 09:09:55 -08:00
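One common reason an exception fails to unpickle, and one way to fix it, can be sketched as below. This is an assumption about the general technique, not a copy of the change in 14d3d29b16; `BrokenError` and `WorkerError` are hypothetical classes.

```python
import pickle


class BrokenError(Exception):  # hypothetical: fails to unpickle
    def __init__(self, msg, error_index, pid):
        # Only msg reaches Exception.args, so unpickling calls
        # BrokenError(msg) and fails on the missing arguments.
        super().__init__(msg)
        self.error_index = error_index
        self.pid = pid


class WorkerError(Exception):  # hypothetical: survives a pickle round trip
    def __init__(self, msg, error_index, pid):
        super().__init__(msg)
        self.msg = msg
        self.error_index = error_index
        self.pid = pid

    def __reduce__(self):
        # Tell pickle exactly which constructor arguments to replay.
        return type(self), (self.msg, self.error_index, self.pid)


def roundtrip(exc):
    """Serialize and restore an exception, as happens across processes."""
    return pickle.loads(pickle.dumps(exc))
```

Passing a `WorkerError` through `roundtrip` preserves `error_index` and `pid`, whereas a `BrokenError` raises `TypeError` while being restored.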
00a871c5c9 [skip ci] Set test owner for multiprocessing tests (#66848)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

cc VitalyFedyunin

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66848

Reviewed By: VitalyFedyunin

Differential Revision: D31828908

Pulled By: janeyx99

fbshipit-source-id: 45d6901648f5564c1bf07ad8d01d69ef486ae104
2021-10-21 13:13:53 -07:00
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
3ffd2af8cd Add exception classification to torch.multiprocessing.spawn (#45174)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45174

Introduce different types of exceptions that map to different failures
of torch.multiprocessing.spawn. The change introduces three different exception types:
ProcessException - base class for the exceptions raised by spawn
ProcessRaisedException - occurs when the process initiated by spawn raises an exception
ProcessExitedException - occurs when the process initiated by spawn exits
This classification allows frameworks that use mp.spawn to categorize failures.
This can be helpful for tracking metrics and enhancing logs.

Test Plan: Imported from OSS

Reviewed By: taohe

Differential Revision: D23889400

Pulled By: tierex

fbshipit-source-id: 8849624c616230a6a81158c52ce0c18beb437330
2020-10-09 12:59:41 -07:00
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
f050b16dd9 Move pytorch distributed tests to separate folder for contbuild. (#30445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445

Create distributed and rpc directories under caffe/test for better management
of unit tests.

Differential Revision: D18702786

fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
2020-01-22 21:16:59 -08:00
a997f224ac Add torch.multiprocessing.create_processes
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28493

Differential Revision: D18766066

Pulled By: ailzhang

fbshipit-source-id: 7f424c8fae3012be2416cf9bc72ee2dde40c1f89
2019-12-03 10:38:19 -08:00
220ce8046e Binding for prctl(PR_SET_PDEATHSIG) (#14491)
Summary:
If torch.multiprocessing.spawn is used to launch non-daemonic
processes (the default since #14391), the spawned children won't be
automatically terminated when the parent terminates.

On Linux, we can address this by setting PR_SET_PDEATHSIG, which
delivers a configurable signal to child processes when their parent
terminates.

Fixes #14394.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14491

Differential Revision: D13270374

Pulled By: pietern

fbshipit-source-id: 092c9d3c3cea2622c3766b467957bc27a1bd500c
2018-11-29 20:09:19 -08:00
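The `prctl(PR_SET_PDEATHSIG)` call from 220ce8046e can be tried out from plain Python with `ctypes`. The sketch below is not the binding added by that commit; `set_parent_death_signal` is a hypothetical helper, and it assumes a Linux system where `prctl` is reachable through the process's own symbol table.

```python
import ctypes
import os
import signal
import sys

PR_SET_PDEATHSIG = 1  # option value from <linux/prctl.h>


def set_parent_death_signal(sig=signal.SIGTERM):
    """Ask the kernel to send `sig` to this process when its parent exits.

    Returns False on non-Linux platforms, where the option does not exist.
    Passing 0 clears a previously configured signal.
    """
    if not sys.platform.startswith("linux"):
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    result = libc.prctl(
        PR_SET_PDEATHSIG,
        ctypes.c_ulong(int(sig)),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if result != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return True
```

A child process would typically call this helper at the start of its entry point, so that it is signalled if the parent terminates without cleaning up.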
be424de869 Add torch.multiprocessing.spawn helper (#13518)
Summary:
This helper addresses a common pattern where one spawns N processes to
work on some common task (e.g. parallel preprocessing or multiple
training loops).

A straightforward approach is to use the multiprocessing API directly
and then consecutively call join on the resulting processes.

This pattern breaks down in the face of errors. If one of the
processes terminates with an exception or via some signal, and it is
not the first process that was launched, the join call on the first
process won't be affected. This helper seeks to solve this by waiting
on termination from any of the spawned processes. When any process
terminates with a non-zero exit status, it terminates the remaining
processes, and raises an exception in the parent process. If the
process terminated with an exception, it is propagated to the parent.
If the process terminated via a signal (e.g. SIGINT, SIGSEGV), this is
mentioned in the exception as well.

Requires Python >= 3.4.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13518

Reviewed By: orionr

Differential Revision: D12929045

Pulled By: pietern

fbshipit-source-id: 00df19fa16a568d1e22f37a2ba65677ab0cce3fd
2018-11-06 14:08:37 -08:00
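The waiting strategy described in be424de869 (wait on termination of any process rather than joining them in launch order) can be sketched with the standard library alone. This is an illustration of the idea, not the helper's source: `join_all_or_fail` is a hypothetical name, and forwarding the child's exception to the parent is left out because it needs an extra queue or pipe.

```python
from multiprocessing.connection import wait


def join_all_or_fail(processes):
    """Wait for every process; on the first failure stop the rest and raise.

    `processes` are started multiprocessing.Process objects.
    """
    pending = {p.sentinel: p for p in processes}
    while pending:
        # wait() returns as soon as any sentinel is ready, so a failure in a
        # late-launched process is noticed without joining the first one.
        for sentinel in wait(list(pending)):
            proc = pending.pop(sentinel)
            proc.join()
            if proc.exitcode == 0:
                continue
            for other in pending.values():
                other.terminate()
                other.join()
            if proc.exitcode < 0:
                reason = f"signal {-proc.exitcode}"
            else:
                reason = f"exit code {proc.exitcode}"
            raise RuntimeError(f"process {proc.pid} terminated with {reason}")
```

A negative `exitcode` means the process was ended by a signal, which is why the message distinguishes the two cases.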