Releases multicast object before releasing mapped buffers in CUDASymmetricMemory (#163750)

Fixes: https://github.com/pytorch/pytorch/issues/162429. On B200, cuMulticastUnbind can fail if the mapped buffers are freed before the multicast object is freed, so the multicast object is now released first. The only documentation I could find is here: e11d7f77c1/src/transport/nvls.cc (L113).
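
The ordering constraint can be illustrated with a small, GPU-free sketch. This is not the CUDASymmetricMemory implementation: `FakeDriver`, `SymmetricAllocation`, and all method names are hypothetical stand-ins, and the fake driver always raises on the wrong order, whereas the real failure is only reported as possible on B200.

```python
# Hypothetical stand-ins for illustration only; none of these names
# come from PyTorch or the CUDA driver API.
class FakeDriver:
    """Toy model: releasing the multicast object fails once a bound buffer is freed."""

    def __init__(self):
        self.live_buffers = set()
        self.log = []  # records the order in which resources are released

    def map_buffer(self, name):
        self.live_buffers.add(name)

    def free_buffer(self, name):
        self.live_buffers.discard(name)
        self.log.append(f"free:{name}")

    def release_multicast(self, bound):
        # Assumption: unbinding needs every bound buffer to still be mapped.
        missing = [b for b in bound if b not in self.live_buffers]
        if missing:
            raise RuntimeError(f"unbind failed, buffers already freed: {missing}")
        self.log.append("release:multicast")


class SymmetricAllocation:
    """Owns a set of mapped buffers plus one multicast object bound to them."""

    def __init__(self, driver, buffer_names):
        self.driver = driver
        self.buffers = list(buffer_names)
        for name in self.buffers:
            driver.map_buffer(name)

    def close(self):
        # Release the multicast object first, while all buffers are still mapped.
        self.driver.release_multicast(self.buffers)
        for name in self.buffers:
            self.driver.free_buffer(name)

    def close_wrong_order(self):
        # The ordering this commit moves away from: buffers first, multicast last.
        for name in self.buffers:
            self.driver.free_buffer(name)
        self.driver.release_multicast(self.buffers)


if __name__ == "__main__":
    driver = FakeDriver()
    SymmetricAllocation(driver, ["buf0", "buf1"]).close()
    print(driver.log)
```

In this toy model, `close()` completes and logs the multicast release ahead of the buffer frees, while `close_wrong_order()` raises `RuntimeError`.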

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163750
Approved by: https://github.com/ngimel, https://github.com/Skylion007, https://github.com/kwen2501, https://github.com/nWEIdia, https://github.com/cyyever
ghstack dependencies: #163575
Syed Tousif Ahmed
2025-09-30 17:57:10 -07:00
committed by PyTorch MergeBot
parent 4dab208d97
commit ed90040d33
2 changed files with 6 additions and 5 deletions

@@ -52,10 +52,6 @@ from torch.testing._internal.common_utils import (
test_contexts = [nullcontext, _test_mode]
-# Set environment variable to disable multicast for all tests in this module
-# Workaround https://github.com/pytorch/pytorch/issues/162429
-os.environ["TORCH_SYMM_MEM_DISABLE_MULTICAST"] = "1"
# So that tests are written in device-agnostic way
device_type = "cuda"
device_module = torch.get_device_module(device_type)