torch.compile: Increase subprocess parent death check interval to lower cpu (#164594)

Summary:
The parent death check is a good idea (we could potentially do it with prctl instead of polling). However, we're seeing elevated CPU usage in idle worker threads. This causes issues on production jobs, where it shows up as a large amount of spikiness in QPS.

Test Plan:
Tested on a prod job with caches force-disabled via TORCH_COMPILE_FORCE_DISABLE_CACHES=1.

Baseline:
<img width="454" height="403" alt="image" src="https://github.com/user-attachments/assets/b88583a1-5b99-48cb-b03d-cd9b69546579" />

With this diff:
<img width="426" height="403" alt="image" src="https://github.com/user-attachments/assets/431217f1-0ed0-4f6e-9d81-6428bf34e0e3" />

Differential Revision: D83803302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164594
Approved by: https://github.com/masnesral
2025-10-20 21:14:14 +08:00 · 2025-10-06 18:15:21 +00:00
parent 4a6abba0d9
commit ba480d6bf7
1 changed files with 1 additions and 1 deletions
--- a/torch/_inductor/compile_worker/utils.py
+++ b/torch/_inductor/compile_worker/utils.py
@@ -27,7 +27,7 @@ def _async_compile_initializer(orig_ppid: int) -> None:

    def run() -> None:
        while True:
-            sleep(1)
+            sleep(60)
            if orig_ppid != os.getppid():
                os.kill(os.getpid(), signal.SIGKILL)
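
For readers who want to see the polling pattern outside the diff context, here is a minimal standalone sketch of a parent death watchdog with a configurable interval. It is not the code in `torch/_inductor/compile_worker/utils.py`: the names `parent_changed`, `start_parent_watchdog`, and `interval_s` are hypothetical and exist only for illustration. The trade-off it makes visible is that a longer interval means fewer wakeups per idle worker, but an orphaned worker can outlive its parent by up to roughly one interval (about 60 seconds with this change, versus about 1 second before).

```python
import os
import signal
import threading
from time import sleep
from typing import Callable


def parent_changed(orig_ppid: int, getppid: Callable[[], int] = os.getppid) -> bool:
    """Return True if the current parent PID differs from the one recorded at startup."""
    return orig_ppid != getppid()


def start_parent_watchdog(orig_ppid: int, interval_s: float = 60.0) -> threading.Thread:
    """Start a daemon thread that kills this process once its original parent is gone.

    Hypothetical helper for illustration; `interval_s` is an assumed parameter,
    not an option exposed by PyTorch.
    """

    def run() -> None:
        while True:
            # Sleep first: each wakeup costs CPU, so a longer interval
            # reduces the idle overhead per worker.
            sleep(interval_s)
            if parent_changed(orig_ppid):
                # The original parent exited and we were re-parented,
                # so terminate this worker immediately.
                os.kill(os.getpid(), signal.SIGKILL)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

A worker would call `start_parent_watchdog(orig_ppid)` once during initialization, with `orig_ppid` captured by the parent before the worker was spawned. On Linux, an event-driven alternative is `prctl(PR_SET_PDEATHSIG, ...)`, which is presumably what the summary alludes to; it avoids periodic wakeups entirely but is not portable.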