torch.compile: Increase subprocess parent death check interval to lower cpu (#164594)
Summary: This check is a good idea (we could potentially do it with prctl). However, we're seeing elevated rates of CPU usage in idle worker threads. This causes issues on production jobs, with a large amount of spikiness in QPS.

Test Plan: Tested on a prod job with caches force-disabled via TORCH_COMPILE_FORCE_DISABLE_CACHES=1.

Baseline:
<img width="454" height="403" alt="image" src="https://github.com/user-attachments/assets/b88583a1-5b99-48cb-b03d-cd9b69546579" />

With this diff:
<img width="426" height="403" alt="image" src="https://github.com/user-attachments/assets/431217f1-0ed0-4f6e-9d81-6428bf34e0e3" />

Differential Revision: D83803302
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164594
Approved by: https://github.com/masnesral
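The summary mentions prctl as a possible alternative to polling. As a rough illustration only (not what this commit does), a Linux-only sketch of that idea could ask the kernel to signal the worker when its parent exits. The helper name `set_parent_death_signal` is hypothetical; `PR_SET_PDEATHSIG` and its value come from `<linux/prctl.h>`.

```python
import ctypes
import signal
import sys
from typing import Optional

# Option number for prctl(2), as defined in <linux/prctl.h>.
PR_SET_PDEATHSIG = 1


def set_parent_death_signal(sig: Optional[int] = None) -> bool:
    """Hypothetical helper: ask the kernel to send `sig` to this process
    when its parent exits. Returns False where prctl is unavailable or fails.
    Passing 0 clears any previously configured parent-death signal."""
    if not sys.platform.startswith("linux"):
        return False
    if sig is None:
        sig = signal.SIGKILL
    # CDLL(None) exposes the symbols already loaded into the process (libc).
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(sig), 0, 0, 0) == 0
```

A worker would call this once at startup, so no periodic wakeup is needed. Two caveats worth keeping in mind: the parent may already have exited before the call, so a single `os.getppid()` comparison right after it is still useful, and the kernel ties the signal to the parent thread that created the child, not to the parent process as a whole.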
Committed by: PyTorch MergeBot
Parent: 4a6abba0d9
Commit: ba480d6bf7
@@ -27,7 +27,7 @@ def _async_compile_initializer(orig_ppid: int) -> None:

     def run() -> None:
         while True:
-            sleep(1)
+            sleep(60)
             if orig_ppid != os.getppid():
                 os.kill(os.getpid(), signal.SIGKILL)

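For readers who want to try the polling pattern outside PyTorch, here is a minimal standalone sketch of the loop in the hunk above, with the interval exposed as a parameter. The names `start_parent_watchdog`, `interval_s`, `on_orphaned`, and `get_ppid` are hypothetical and exist only for illustration; the last two are injection points so the behavior can be exercised without killing anything.

```python
import os
import signal
import threading
import time
from typing import Callable, Optional


def _kill_self() -> None:
    # Same action as in the diff: hard-kill the current process.
    os.kill(os.getpid(), signal.SIGKILL)


def start_parent_watchdog(
    orig_ppid: int,
    interval_s: float = 60.0,
    on_orphaned: Optional[Callable[[], None]] = None,
    get_ppid: Callable[[], int] = os.getppid,
) -> threading.Thread:
    """Hypothetical helper: poll the parent PID every `interval_s` seconds
    and run `on_orphaned` (default: SIGKILL self) once it has changed."""
    action = on_orphaned if on_orphaned is not None else _kill_self

    def run() -> None:
        while True:
            time.sleep(interval_s)
            # A changed parent PID means the original parent has exited
            # and this process was re-parented.
            if orig_ppid != get_ppid():
                action()
                return

    thread = threading.Thread(target=run, daemon=True, name="parent-watchdog")
    thread.start()
    return thread
```

The trade-off is the one this change makes: a longer interval means fewer wakeups in idle workers, at the cost of an orphaned worker lingering for up to `interval_s` seconds before it notices the parent is gone.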