pytorch/c10/cuda
Dan Johnson d22c4cc353 Add option to use mempool on OOM (#151487)
MemPool is a separate pool of memory handled by the caching allocator. This PR adds the option to let the caching allocator try to use this pool as a last resort instead of OOMing, by associating a use_on_oom bool with each MemPool.

Usage:
Users can optionally specify a `use_on_oom` bool (False by default) during MemPool creation. If true, the CUDACachingAllocator will be able to use memory in this pool as a last resort instead of OOMing.

```
# `allocator` is a previously constructed custom allocator (e.g. a
# CUDAPluggableAllocator); the pool opts in to the OOM fallback.
pool = torch.cuda.MemPool(allocator, use_on_oom=True)
with torch.cuda.use_mem_pool(pool):
    a = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
del a
# At the memory limit, this allocation succeeds by using the pool's memory
# instead of raising an OOM error.
b = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
```
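
For context, below is a more self-contained sketch of the same scenario. The memory cap via `torch.cuda.set_per_process_memory_fraction` and the default-allocator-backed `MemPool()` are assumptions made here for illustration; the PR's actual test (`test_mempool_limited_memory_with_allocator`) wires up its own allocator.

```
import torch

# Illustrative cap: limit the caching allocator to roughly 60 MiB so a single
# 40 MiB block fits but a second one would normally OOM. (Assumed setup, not
# taken from the PR.)
total = torch.cuda.get_device_properties(0).total_memory
torch.cuda.set_per_process_memory_fraction(60 * 1024 * 1024 / total, device=0)

# Assumption: a MemPool created without a custom allocator is backed by the
# default caching allocator; use_on_oom marks it as an OOM-fallback pool.
pool = torch.cuda.MemPool(use_on_oom=True)

with torch.cuda.use_mem_pool(pool):
    a = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
del a  # the freed block stays cached inside the pool

# Without use_on_oom=True this second allocation would be expected to raise
# torch.cuda.OutOfMemoryError; with it, the allocator may fall back to the
# pool's cached memory as a last resort.
b = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
```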

Testing:
```
python test/test_cuda.py -k test_mempool_limited_memory_with_allocator
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151487
Approved by: https://github.com/eqy, https://github.com/syed-ahmed, https://github.com/ngimel
2025-04-26 04:04:57 +00:00

c10/cuda is a core library with CUDA functionality. It is distinguished from c10 in that it links against the CUDA library, but like c10 it doesn't contain any kernels, and consists solely of core functionality that is generally useful when writing CUDA code; for example, C++ wrappers for the CUDA C API.

Important notes for developers. If you want to add files or functionality to this folder, TAKE NOTE. The code in this folder is very special, because on our AMD GPU build, we transpile it into c10/hip to provide a ROCm environment. Thus, if you write:

```
// c10/cuda/CUDAFoo.h
namespace c10 { namespace cuda {

void my_func();

}}
```

this will get transpiled into:

```
// c10/hip/HIPFoo.h
namespace c10 { namespace hip {

void my_func();

}}
```

Thus, if you add new functionality to c10, you must also update C10_MAPPINGS in torch/utils/hipify/cuda_to_hip_mappings.py to transpile occurrences of cuda::my_func to hip::my_func. (At the moment, we do NOT have a catch-all cuda:: to hip:: namespace conversion, as not all cuda namespaces are converted to hip::, even though c10's are.)
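
For illustration, here is a hedged sketch of what such a mapping entry might look like for the hypothetical my_func above. The dictionary name C10_MAPPINGS comes from the note above, but the exact tuple layout and API constant should be checked against the existing entries in torch/utils/hipify/cuda_to_hip_mappings.py:

```
# torch/utils/hipify/cuda_to_hip_mappings.py (sketch only; mirror the file's
# real layout and constants)
import collections

from torch.utils.hipify.constants import API_C10  # assumed constants module

C10_MAPPINGS = collections.OrderedDict(
    [
        # ... existing entries ...
        ("c10/cuda/CUDAFoo.h", ("c10/hip/HIPFoo.h", API_C10)),
        ("cuda::my_func", ("hip::my_func", API_C10)),
    ]
)
```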

Transpilation inside this folder is controlled by CAFFE2_SPECIFIC_MAPPINGS (oddly enough). C10_MAPPINGS apply to ALL source files.

If you add a new directory to this folder, you MUST update both c10/cuda/CMakeLists.txt and c10/hip/CMakeLists.txt.