Synchronize in foreach tests after profiling (#152857)

After the CI change from CUDA 12.4 -> 12.6 around mid-March, the foreach tests have been flaky and hard to repro due to nondeterminism. Per @davidberard98's suggestion, let's add a synchronize before checking profiler results to see whether this fixes the flake! The hope is that the 48 currently open flaky-test issues for foreach will close with this change.
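
A minimal sketch of the pattern, not the test code itself: `profiled_event_keys` is a hypothetical helper, and the `torch.cuda.is_available()` guard is added here only so the sketch also runs on CPU-only machines. It assumes that synchronizing before the profiler context exits gives queued CUDA kernels a chance to finish and be recorded.

```python
import torch


def profiled_event_keys(func, *args, **kwargs):
    """Run func under the profiler; return (result, tuple of event keys).

    Hypothetical helper for illustration only.
    """
    with torch.profiler.profile() as prof:
        result = func(*args, **kwargs)
        # Assumption: CUDA kernels launch asynchronously, so wait for them
        # inside the profiler context, before it exits.
        if torch.cuda.is_available():
            torch.cuda.synchronize()
    keys = tuple(e.key for e in prof.key_averages())
    return result, keys


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tensors = [torch.ones(4, device=device) for _ in range(3)]
    _, keys = profiled_event_keys(torch._foreach_add, tensors, 1.0)
    # Same substring check the test uses to detect the fast path.
    print(any("multi_tensor_apply_kernel" in k for k in keys))
```

On a CPU-only machine the synchronize is skipped and the helper simply returns the recorded CPU event keys.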

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152857
Approved by: https://github.com/davidberard98
Jane Xu
2025-05-05 13:55:12 -07:00
committed by PyTorch MergeBot
parent 13dcf80a53
commit 4979ca5ffa

@@ -86,6 +86,8 @@ class ForeachFuncWrapper:
         ):
             with torch.profiler.profile() as p:
                 actual = self.func(*inputs, **kwargs)
+                # synchronize within the profiler context to make sure events happen before exiting
+                torch.cuda.synchronize()
             keys = tuple([e.key for e in p.key_averages()])
             mta_called = any("multi_tensor_apply_kernel" in k for k in keys)
             assert mta_called == (expect_fastpath and (not zero_size)), (