Releases multicast object before releasing mapped buffers in CUDASymmetricMemory (#163750)
Fixes: https://github.com/pytorch/pytorch/issues/162429. On B200, cuMulticastUnbind can error if the mapped buffers are freed before the multicast object is freed. The only documentation I could find is here: e11d7f77c1/src/transport/nvls.cc (L113).
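In other words, the fix reorders the teardown so the multicast object is unbound and released before the per-rank mapped buffers are unmapped and released. Below is a minimal sketch of that ordering using the CUDA 12.x multicast driver APIs; the struct and member names (SymmetricMemoryBlock, mc_handle_, ptr_, etc.) are illustrative placeholders, not the actual CUDASymmetricMemory code, and error checking is omitted for brevity.

#include <cuda.h>

// Illustrative placeholder type; not the real CUDASymmetricMemory class.
struct SymmetricMemoryBlock {
  CUdeviceptr mc_ptr_{0};                      // VA where the multicast object is mapped
  CUmemGenericAllocationHandle mc_handle_{0};  // multicast object
  CUdeviceptr ptr_{0};                         // VA where the local buffer is mapped
  CUmemGenericAllocationHandle handle_{0};     // local physical allocation
  size_t size_{0};
  CUdevice device_{0};

  ~SymmetricMemoryBlock() {
    // 1. Tear down the multicast object first. On B200, cuMulticastUnbind
    //    can fail if the buffers it was bound to have already been freed.
    if (mc_handle_ != 0) {
      cuMulticastUnbind(mc_handle_, device_, /*mcOffset=*/0, size_);
      cuMemUnmap(mc_ptr_, size_);
      cuMemRelease(mc_handle_);
      cuMemAddressFree(mc_ptr_, size_);
    }
    // 2. Only then unmap and release the mapped buffer backing the local
    //    allocation.
    cuMemUnmap(ptr_, size_);
    cuMemRelease(handle_);
    cuMemAddressFree(ptr_, size_);
  }
};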
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163750
Approved by: https://github.com/ngimel, https://github.com/Skylion007, https://github.com/kwen2501, https://github.com/nWEIdia, https://github.com/cyyever
ghstack dependencies: #163575
Commit: ed90040d33 (parent: 4dab208d97), committed by PyTorch MergeBot.
@@ -52,10 +52,6 @@ from torch.testing._internal.common_utils import (
 test_contexts = [nullcontext, _test_mode]
 
-# Set environment variable to disable multicast for all tests in this module
-# Workaround https://github.com/pytorch/pytorch/issues/162429
-os.environ["TORCH_SYMM_MEM_DISABLE_MULTICAST"] = "1"
-
 # So that tests are written in device-agnostic way
 device_type = "cuda"
 device_module = torch.get_device_module(device_type)