torch.compile: Increase subprocess parent death check interval to lower cpu (#164594)

Summary:
Checking for parent death is a good idea (we could potentially do it with prctl instead). However,
with a 1-second polling interval we're seeing elevated CPU usage in idle worker threads. This causes issues on production jobs, showing up as heavy spikiness in QPS. This change raises the polling interval to 60 seconds.
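
For reference, the prctl alternative mentioned above could look roughly like the following. This is a minimal sketch and not part of this change. It assumes Linux, and the helper names `set_parent_death_signal` and `get_parent_death_signal` are hypothetical. The constants come from `<linux/prctl.h>`.

```python
import ctypes
import os
import signal
import sys

# Option values from <linux/prctl.h>.
PR_SET_PDEATHSIG = 1
PR_GET_PDEATHSIG = 2


def _libc() -> ctypes.CDLL:
    # CDLL(None) exposes the symbols already loaded into this process.
    return ctypes.CDLL(None, use_errno=True)


def set_parent_death_signal(sig: int = signal.SIGKILL) -> bool:
    """Ask the kernel to send `sig` to this process when its parent dies.

    Hypothetical helper. Returns False on platforms without prctl.
    Passing sig=0 clears the setting.
    """
    if not sys.platform.startswith("linux"):
        return False
    rc = _libc().prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(int(sig)), 0, 0, 0)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return True


def get_parent_death_signal() -> int:
    """Return the currently configured parent-death signal (0 if unset)."""
    out = ctypes.c_int(0)
    rc = _libc().prctl(PR_GET_PDEATHSIG, ctypes.byref(out), 0, 0, 0)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return out.value
```

With this approach no polling thread is needed, so idle workers would not wake up at all. Two caveats from the prctl man page: the signal is tied to the parent thread that created the child, and the parent may already have exited before the call, so a caller would still want to compare `os.getppid()` against the expected parent PID once after setting it.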

Test Plan:
Tested on a prod job with caches force disabled via
TORCH_COMPILE_FORCE_DISABLE_CACHES=1

Baseline
<img width="454" height="403" alt="image" src="https://github.com/user-attachments/assets/b88583a1-5b99-48cb-b03d-cd9b69546579" />

With this diff -
<img width="426" height="403" alt="image" src="https://github.com/user-attachments/assets/431217f1-0ed0-4f6e-9d81-6428bf34e0e3" />

Differential Revision: D83803302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164594
Approved by: https://github.com/masnesral
Colin L Reliability Rice
2025-10-06 18:15:21 +00:00
committed by PyTorch MergeBot
parent 4a6abba0d9
commit ba480d6bf7


@@ -27,7 +27,7 @@ def _async_compile_initializer(orig_ppid: int) -> None:
     def run() -> None:
         while True:
-            sleep(1)
+            sleep(60)
             if orig_ppid != os.getppid():
                 os.kill(os.getpid(), signal.SIGKILL)
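
The polling pattern in this hunk can be sketched as a standalone watchdog with a configurable interval. This is an illustrative sketch only, not the code in the PyTorch tree. The names `start_parent_watchdog`, `interval_s`, and `on_orphaned` are hypothetical, and it uses `threading.Event.wait` in place of `sleep` so the loop can be stopped early.

```python
import os
import signal
import threading
from typing import Callable, Optional


def start_parent_watchdog(
    orig_ppid: int,
    interval_s: float = 60.0,
    on_orphaned: Optional[Callable[[], None]] = None,
) -> threading.Event:
    """Poll os.getppid() every `interval_s` seconds on a daemon thread.

    Hypothetical helper. When the parent PID no longer matches `orig_ppid`,
    `on_orphaned` is called once; by default the process SIGKILLs itself,
    mirroring the behaviour shown in the diff. Setting the returned event
    stops the watchdog.
    """
    stop = threading.Event()

    def kill_self() -> None:
        os.kill(os.getpid(), signal.SIGKILL)

    handler = on_orphaned if on_orphaned is not None else kill_self

    def run() -> None:
        # Event.wait returns False on timeout and True once `stop` is set.
        while not stop.wait(interval_s):
            if os.getppid() != orig_ppid:
                handler()
                return

    threading.Thread(target=run, name="parent-watchdog", daemon=True).start()
    return stop
```

The interval is a trade-off: a longer value means fewer wakeups per idle worker, but an orphaned worker may linger for up to `interval_s` seconds before it notices that its parent is gone.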