Synchronize in foreach tests after profiling (#152857)

After the CI change from 12.4 -> 12.6 around mid-March, the foreach tests have been flaky and hard to repro due to nondeterminism. Per @davidberard98's suggestion, let's try to add a synchronize before checking profiler results to see whether this fixes the flake! The hope is that the 48 currently open foreach flaky issues will close from this change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152857
Approved by: https://github.com/davidberard98
2025-10-21 05:34:18 +08:00 · 2025-05-05 13:55:12 -07:00
parent 13dcf80a53
commit 4979ca5ffa
1 changed file with 2 additions and 0 deletions
--- a/test/test_foreach.py
+++ b/test/test_foreach.py
@@ -86,6 +86,8 @@ class ForeachFuncWrapper:
        ):
            with torch.profiler.profile() as p:
                actual = self.func(*inputs, **kwargs)
+                # synchronize within the profiler context to make sure events happen before exiting
+                torch.cuda.synchronize()
            keys = tuple([e.key for e in p.key_averages()])
            mta_called = any("multi_tensor_apply_kernel" in k for k in keys)
            assert mta_called == (expect_fastpath and (not zero_size)), (
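
The pattern this change applies can be tried outside the test suite with a minimal sketch like the one below. It is an illustration under assumptions, not the test's actual code: the helper name `profiled_event_keys` is hypothetical, `torch._foreach_add` is used only as a convenient foreach op, and the CUDA guard is added so the snippet also runs on CPU-only machines (where no `multi_tensor_apply_kernel` event is expected). The idea is that CUDA kernels are launched asynchronously, so synchronizing while the profiler context is still open gives queued GPU work a chance to finish before the profiler stops recording.

```python
import torch


def profiled_event_keys(func, *inputs, **kwargs):
    """Hypothetical helper: run func under the profiler and return (result, event keys)."""
    with torch.profiler.profile() as p:
        result = func(*inputs, **kwargs)
        if torch.cuda.is_available():
            # Wait for queued GPU work while the profiler is still recording,
            # mirroring the synchronize added in this commit.
            torch.cuda.synchronize()
    keys = tuple(e.key for e in p.key_averages())
    return result, keys


device = "cuda" if torch.cuda.is_available() else "cpu"
tensors = [torch.ones(4, device=device) for _ in range(3)]

result, keys = profiled_event_keys(torch._foreach_add, tensors, 1.0)

# Same check the test performs on the collected keys.
mta_called = any("multi_tensor_apply_kernel" in k for k in keys)
print(f"device={device} mta_called={mta_called}")
```

Whether the fast path is taken depends on the device, dtypes, and build, so the sketch only reports `mta_called` rather than asserting a particular value.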