mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-20 21:14:14 +08:00
Uses context pointer for deleter to enable multiple CUDAPluggableAllocator usage (#130472)
We should be able to create multiple CUDAPluggableAllocators in the same PyTorch program (see https://github.com/pytorch/pytorch/issues/124807 and https://github.com/pytorch/pytorch/pull/125722 for context). When mixing CUDAPluggableAllocators in the same program, we need to make sure that the deleter passed in through each CUDAPluggableAllocator gets "attached" to the data_ptr and persists until program exit (when it's called to free the memory).

Currently, CUDAPluggableAllocator maintains a global `current_custom_allocator`. When creating the `DataPtr`, `raw_deleter` attaches `custom_raw_deleter` to the DataPtr, which calls `current_custom_allocator->raw_delete(...)`. This approach is fine when only one allocator is in use. With multiple allocators, however, a DataPtr would use the deleter of whichever allocator `current_custom_allocator` points to at cleanup time. For example, if allocation 1 was done with `cudaMalloc` and allocation 2 with `ncclMemAlloc`, and `current_custom_allocator` currently points to the CUDAPluggableAllocator that uses `ncclMemAlloc`, then cleaning up allocation 1 would call `ncclMemFree` instead of `cudaFree`.

In this PR, we solve the above problem by remembering the `free_fn_` in a deleter context. Hence, there is no need to go through an allocator object to find the deleter.

CC: @zdevito @ptrblck @eqy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130472
Approved by: https://github.com/eqy, https://github.com/ezyang
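The deleter-context idea described above can be illustrated with a small standalone sketch. This is not the code from the PR: `ToyAllocator`, `ToyDataPtr`, `DeleterContext`, `context_deleter`, and `run_demo` are hypothetical names, and `malloc`/`free` stand in for `cudaMalloc`/`cudaFree` and `ncclMemAlloc`/`ncclMemFree`. It only shows, under those assumptions, how a per-allocation context can carry its own free function so that cleanup never consults a global "current" allocator.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an allocator's free function.
using FreeFn = std::function<void(void*)>;

// Context stored with each allocation: the raw pointer plus the free
// function of the allocator that produced it.
struct DeleterContext {
  void* data;
  FreeFn free_fn;
};

// Deleter that receives the context, not the allocator. It frees the data
// with the remembered function and then destroys the context itself.
void context_deleter(void* ctx) {
  auto* c = static_cast<DeleterContext*>(ctx);
  c->free_fn(c->data);
  delete c;
}

// Simplified data pointer (an assumption, not c10::DataPtr): raw data plus
// an owned context that is released through context_deleter.
struct ToyDataPtr {
  void* data;
  std::unique_ptr<void, void (*)(void*)> ctx;
};

// Records which free function ran, so the behavior can be observed.
std::vector<std::string> g_free_log;

struct ToyAllocator {
  std::string name;

  ToyDataPtr allocate(std::size_t nbytes) const {
    void* p = std::malloc(nbytes);  // stands in for cudaMalloc / ncclMemAlloc
    std::string tag = name + "_free";
    auto* ctx = new DeleterContext{p, [tag](void* q) {
      g_free_log.push_back(tag);
      std::free(q);  // stands in for cudaFree / ncclMemFree
    }};
    return ToyDataPtr{
        p, std::unique_ptr<void, void (*)(void*)>(ctx, &context_deleter)};
  }
};

std::vector<std::string> run_demo() {
  g_free_log.clear();
  ToyAllocator cuda_like{"cuda"};
  ToyAllocator nccl_like{"nccl"};

  // A "current allocator" pointer that changes between allocations.
  const ToyAllocator* current = &cuda_like;
  ToyDataPtr p1 = current->allocate(16);
  current = &nccl_like;
  ToyDataPtr p2 = current->allocate(16);
  assert(p1.data != nullptr && p2.data != nullptr);

  // Cleanup happens while `current` points at nccl_like, yet each
  // allocation is still freed by the function stored in its own context.
  p1.ctx.reset();
  p2.ctx.reset();
  return g_free_log;
}
```

Calling `run_demo()` frees the first allocation through the "cuda" function even though the current allocator has since been switched, which is the mismatch the commit message describes avoiding.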
committed by PyTorch MergeBot
parent 28a74b9fa4
commit 38b7d89aa4
@@ -661,6 +661,7 @@ libtorch_cuda_core_sources = [
     "torch/csrc/CudaIPCTypes.cpp",
     "torch/csrc/cuda/comm.cpp",
     "torch/csrc/cuda/memory_snapshot.cpp",
+    "torch/csrc/cuda/CUDAPluggableAllocator.cpp",
     "torch/csrc/inductor/aoti_runner/model_container_runner_cuda.cpp",
     "torch/csrc/inductor/aoti_torch/shim_cuda.cpp",
     "torch/csrc/jit/codegen/fuser/cuda/fused_kernel.cpp",
@@ -772,7 +773,6 @@ libtorch_python_cuda_core_sources = [
     "torch/csrc/cuda/shared/cudart.cpp",
     "torch/csrc/cuda/shared/nvtx.cpp",
     "torch/csrc/cuda/utils.cpp",
-    "torch/csrc/cuda/CUDAPluggableAllocator.cpp",
 ]
 
 libtorch_python_cuda_sources = libtorch_python_cuda_core_sources + [