mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Files

Laith Sakka e444cd24d4 Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. (#159197 )

This might cause some new DDEs (data-dependent errors) at call sites that do not use is_contiguous_or_false() or sym_is_contiguous(),
but we want to find those call sites and handle them properly by explicitly calling is_contiguous_or_false() instead of is_contiguous() when appropriate.
I had to fix one issue after removing the implicit size-oblivious reasoning. Here is the context.

In https://github.com/pytorch/pytorch/pull/157472 we defined sym_is_contiguous to be the function that computes contiguity for dynamic shapes in C++. It returns a symbolic expression that represents contiguity and is guaranteed not to throw a DDE.

when people call is_contiguous we do sym_is_contiguous().guard_bool()
when people call is_contiguous_or_false we do sym_is_contiguous().guard_or_false()
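
The difference between the two guards can be sketched with a toy three-valued boolean. Everything below (`ToyTruth`, `ToySymBool`, `ToyLayout`, `check_known_cases_agree`) is a hypothetical stand-in for illustration, not the real c10::SymBool API; only the method names guard_bool and guard_or_false come from the text above, and their real signatures may differ.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a symbolic bool: the answer is known true,
// known false, or depends on data that is not available yet.
enum class ToyTruth { False, True, Unknown };

struct ToySymBool {
  ToyTruth value;

  // Must produce a concrete answer; an undecidable one raises, standing
  // in for a data-dependent error (DDE).
  bool guard_bool() const {
    if (value == ToyTruth::Unknown) {
      throw std::runtime_error("data-dependent error");
    }
    return value == ToyTruth::True;
  }

  // Never throws: an undecidable answer is conservatively treated as false.
  bool guard_or_false() const { return value == ToyTruth::True; }
};

// Hypothetical tensor-like type exposing the three entry points.
struct ToyLayout {
  ToyTruth contiguity;

  ToySymBool sym_is_contiguous() const { return ToySymBool{contiguity}; }
  bool is_contiguous() const { return sym_is_contiguous().guard_bool(); }
  bool is_contiguous_or_false() const {
    return sym_is_contiguous().guard_or_false();
  }
};

// The two entry points agree whenever the answer is actually known.
inline void check_known_cases_agree(const ToyLayout& t) {
  if (t.contiguity != ToyTruth::Unknown) {
    assert(t.is_contiguous() == t.is_contiguous_or_false());
  }
}
```

In this sketch, only the undecidable case separates the two calls: is_contiguous throws, while is_contiguous_or_false returns false.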

One case that was not handled well was this path:
```
c10::SymBool TensorImpl::sym_is_contiguous_custom(
    at::MemoryFormat memory_format) const {
  if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
    return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
        this, memory_format);
  }

  return sym_is_contiguous_default(memory_format);
}
```
Namely, if we call sym_is_contiguous_custom and matches_python_custom(SizesStridesPolicy::CustomStrides) returns true, we used to call is_contiguous(this, memory_format).

This went through load_pyobj_interpreter and ended up in the Python is_contiguous call, which used implicit size-oblivious reasoning.
Once we removed that implicit size-oblivious reasoning, the right thing to do is to call
return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format);
otherwise we would get a DDE even if the caller is using sym_is_contiguous.

So I had to define it for PyInterpreter, and then override it for nested tensors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159197
Approved by: https://github.com/ezyang

2025-08-16 09:15:58 +00:00

api

[nativert] Expose ModelRunner to public through pmpl type ModelRunnerHandle. (#159989 )

2025-08-07 14:23:21 +00:00

autograd

guard_or_false cat ops (#160250 )

2025-08-16 00:54:31 +00:00

cpu

[CPUInductor] Fix SVE256 detection (#146207 )

2025-02-01 18:51:34 +00:00

cuda

Generalize torch._C._set_allocator_settings to be generic (#156175 )

2025-08-05 04:08:42 +00:00

distributed

[SymmMem] Check return of nvshmem_malloc (#160603 )

2025-08-14 15:57:55 +00:00

dynamo

[dynamo][guards] Install dict watchers for recrusive dict tag optimization (#159796 )

2025-08-12 09:49:11 +00:00

export

[schema_upgrader] add C++ upgrader for json based upgrading (#156761 )

2025-06-28 18:15:06 +00:00

functorch

[dynamo] Guard serialization for FUNCTORCH_STACK_MATCH (#152616 )

2025-05-05 18:05:56 +00:00

Fix clang-tidy bugprone* warnings (#148529 )

2025-06-23 23:09:56 +00:00

inductor

[AOTInductor] ABI-Compatibility for RecordFunction. (#159842 )

2025-08-15 21:45:47 +00:00

instruction_counter

[BE]: Replace a couple of call sites with fmtlib printf (#154533 )

2025-06-01 21:16:34 +00:00

jit

Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. (#159197 )

2025-08-16 09:15:58 +00:00

lazy

[BE] remove torch deploy - conditionals (#158288 )

2025-07-29 17:40:49 +00:00

monitor

Fix 'dllimport attribute ignored on inline function' (#157670 )

2025-07-07 16:57:48 +00:00

mps

[MPS] Add API to query GPU core count (#160414 )

2025-08-14 00:05:17 +00:00

mtia

[Re-land][Inductor] Support native Inductor as backend for MTIA (#159211 )

2025-07-29 17:03:24 +00:00

multiprocessing

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

onnx

[ONNX] Clean up the diagnostics module (#149864 )

2025-03-26 05:58:32 +00:00

profiler

[Profiler] Update README (#159816 )

2025-08-07 16:44:41 +00:00

stable

Add getCurrentDeviceIndex to torch::stable::accelerator (#160453 )

2025-08-13 23:42:24 +00:00

tensor

Expose functions used in custom backend in torch_python dll (#148213 )

2025-03-07 02:34:37 +00:00

utils

[cuda][cupy] Improve cupy device placement when device is provided with explicit index (#158529 )

2025-08-15 00:27:42 +00:00

xpu

Add device_id to XPU device properties (#156481 )

2025-07-03 01:22:11 +00:00

copy_utils.h

…

CudaIPCTypes.cpp

Enable more readability-redundant checks (#143963 )

2024-12-30 14:49:33 +00:00

CudaIPCTypes.h

…

DataLoader.cpp

[Lint] Update clang-format to 19.1.4 (#153889 )

2025-05-20 14:12:46 +00:00

DataLoader.h

…

Device.cpp

Remove unsafe PyTorchError constructor (#154961 )

2025-07-11 18:22:53 +00:00

Device.h

…

DeviceAccelerator.cpp

Add unified memory APIs for torch.accelerator (#152932 )

2025-08-08 17:41:22 +00:00

DeviceAccelerator.h

…

Dtype.cpp

…

Dtype.h

…

DynamicTypes.cpp

[18/N] Fix extra warnings brought by clang-tidy-17 (#144014 )

2025-01-08 17:21:55 +00:00

DynamicTypes.h

Expose several APIs to public (torch python APIs) (#144525 )

2025-01-15 14:34:45 +00:00

empty.c

…

Event.cpp

Fix clang-tidy bugprone* warnings (#148529 )

2025-06-23 23:09:56 +00:00

Event.h

…

Exceptions.cpp

Remove unsafe PyTorchError constructor (#154961 )

2025-07-11 18:22:53 +00:00

Exceptions.h

Raise BufferError for DLPack buffer-related errors. (#150691 )

2025-07-20 00:46:21 +00:00

Export.h

…

Generator.cpp

Remove unsafe PyTorchError constructor (#154961 )

2025-07-11 18:22:53 +00:00

Generator.h

[17/N] Fix extra warnings brought by clang-tidy-17 (#143804 )

2024-12-25 19:54:42 +00:00

itt_wrapper.cpp

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

itt_wrapper.h

…

itt.cpp

[1/N] Use internal linkage in torch/csrc C++ files. (#150930 )

2025-04-11 02:19:31 +00:00

itt.h

[1/N] Use internal linkage in torch/csrc C++ files. (#150930 )

2025-04-11 02:19:31 +00:00

Layout.cpp

…

Layout.h

Hide torch_python symbols (#142214 )

2024-12-16 00:59:26 +00:00

MemoryFormat.cpp

[18/N] Fix extra warnings brought by clang-tidy-17 (#144014 )

2025-01-08 17:21:55 +00:00

MemoryFormat.h

Hide torch_python symbols (#142214 )

2024-12-16 00:59:26 +00:00

Module.cpp

[MPS] Add API to query GPU core count (#160414 )

2025-08-14 00:05:17 +00:00

Module.h

…

PyInterpreter.cpp

Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. (#159197 )

2025-08-16 09:15:58 +00:00

PyInterpreter.h

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

PyInterpreterHooks.cpp

[BE] Fix extra-semi warnings (#158730 )

2025-07-22 01:05:03 +00:00

PyInterpreterHooks.h

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

python_dimname.cpp

[18/N] Fix extra warnings brought by clang-tidy-17 (#144014 )

2025-01-08 17:21:55 +00:00

python_dimname.h

…

python_headers.h

…

QScheme.cpp

[18/N] Fix extra warnings brought by clang-tidy-17 (#144014 )

2025-01-08 17:21:55 +00:00

QScheme.h

Hide torch_python symbols (#142214 )

2024-12-16 00:59:26 +00:00

README.md

…

serialization.cpp

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

serialization.h

…

Size.cpp

Implemented Size.__radd__ (#152554 )

2025-06-23 15:38:37 +00:00

Size.h

Hide torch_python symbols (#142214 )

2024-12-16 00:59:26 +00:00

Storage.cpp

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

Storage.h

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

StorageMethods.cpp

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

StorageMethods.h

…

StorageSharing.cpp

[BE] Make PyObjectSlot use a global PyInterpreter and remove (#158427 )

2025-07-30 17:29:43 +00:00

StorageSharing.h

…

Stream.cpp

Support with statement on torch.Stream (#140138 )

2025-01-10 02:05:19 +00:00

Stream.h

[1/N] OpenReg: Replace open_registration_extension.cpp with openreg (#141815 )

2025-01-14 15:59:00 +00:00

stub.c

…

THConcat.h

…

THP.h

…

TypeInfo.cpp

add the torch.float8_e8m0fnu dtype to PyTorch (#147466 )

2025-02-20 13:55:42 +00:00

TypeInfo.h

Hide torch_python symbols (#142214 )

2024-12-16 00:59:26 +00:00

Types.h

…

utils.cpp

[BE][7/16] fix typos in torch/ (torch/csrc/) (#156317 )

2025-06-23 02:57:41 +00:00

utils.h

[Lint] Update clang-format to 19.1.4 (#153889 )

2025-05-20 14:12:46 +00:00

README.md

csrc

The csrc directory contains all of the code concerned with integration with Python. This is in contrast to lib, which contains the Torch libraries that are Python agnostic. csrc depends on lib, but not vice versa.

Several utilities that ease integration with Python are worth knowing about; we describe them briefly here. First, the most important gotchas:

* DO NOT forget to take out the GIL with pybind11::gil_scoped_acquire before calling the Python API or bringing a THPObjectPtr into scope.
* Make sure you include Python.h first in your header files, before any system headers; otherwise, you will get an `error: "_XOPEN_SOURCE" redefined` error. If you pay attention to warnings, you will see where you need to do this.

Notes

Note [Storage is not nullptr]

Historically, Torch supported nullptr storage, as a minor optimization to avoid having to allocate a storage object when it would be empty. However, this is actually a confusing special case to deal with, so by and large, PyTorch assumes that storage is never nullptr.

One case where this assumption matters is tracking the CUDA device a tensor is stored on: this information is stored solely in the storage, so if a storage is nullptr, we lose this information.

Although storage is never nullptr, the data field of c10::StorageImpl may be nullptr. This mostly occurs when we want to pre-allocate an output tensor struct, but then have it be resized and filled with data by some operator: there's no point in allocating data for it in this case!
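
A minimal sketch of this invariant, using hypothetical stand-in types (`ToyStorage`, `ToyTensor`) rather than the real c10::StorageImpl: the tensor always holds a storage object, which carries the device, while the data pointer inside it stays null until something resizes it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration; not the real c10 types.
struct ToyStorage {
  int device_index = 0;      // device information lives in the storage
  std::vector<char> buffer;  // backing bytes; empty until resized

  // Null while nothing has been allocated, like the data field described above.
  const void* data() const { return buffer.empty() ? nullptr : buffer.data(); }
  void resize(std::size_t nbytes) { buffer.resize(nbytes); }
};

struct ToyTensor {
  std::shared_ptr<ToyStorage> storage;  // invariant: never nullptr

  explicit ToyTensor(int device_index)
      : storage(std::make_shared<ToyStorage>()) {
    storage->device_index = device_index;
  }

  // No null check is needed on the happy path because the storage always exists.
  int device() const {
    assert(storage != nullptr);
    return storage->device_index;
  }
};

// Pre-allocate an output tensor, then let an "operator" size and fill it.
inline bool output_tensor_demo() {
  ToyTensor out(/*device_index=*/1);
  const bool empty_before = (out.storage->data() == nullptr);  // no data yet
  out.storage->resize(16);                                      // operator resizes
  const bool filled_after = (out.storage->data() != nullptr);
  return empty_before && filled_after && out.device() == 1;
}
```

Even before any data is allocated, `device()` still has an answer, which is the point of keeping the storage object around.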

Files

`Exceptions.h`

Frequently when working with the Python API, you may call a function which returns an error. In this case, we want to return directly to the Python interpreter so that the exception can be propagated; however, because the Python API is C-based, control actually returns to whatever C++ code called it. Similarly, if we raise a C++ exception, we must set the Python error flags before returning to the Python interpreter, so that it turns into a Python exception.

Moreover, when using the following macros, the generated warnings will be converted into Python warnings that can be caught by the user.

Exceptions.h defines helpers for two main cases:

For code where you write the Python binding by hand, use HANDLE_TH_ERRORS, END_HANDLE_TH_ERRORS and the exception class python_error. You call them like this:

```cpp
// Entry point from Python interpreter
PyObject* run(PyObject* arg) {
  HANDLE_TH_ERRORS
  ...
  if (!x) throw python_error();
  // From c10/Exception.h
  TORCH_CHECK(cond, "cond was false here");
  TORCH_WARN("Warning message");
  ...
  END_HANDLE_TH_ERRORS
}
```

The HANDLE_TH_ERRORS macro will catch all exceptions and convert them into an appropriate Python signal. python_error is a special exception which doesn't contain any info; instead, it says, "An error occurred in the Python API; if you return to the interpreter, Python will raise that exception, nothing else needs to be done."

For code that you bind using pybind, HANDLE_TH_ERRORS and END_HANDLE_TH_ERRORS_PYBIND can be used. They work jointly with pybind error handling to raise PyTorch errors and warnings natively and let pybind handle other errors. They can be used as:

```cpp
// Function given to the pybind binding
at::Tensor foo(at::Tensor x) {
  HANDLE_TH_ERRORS
  ...
  if (!x) throw python_error();
  // pybind native error
  if (!x) throw py::value_error();
  // From c10/Exception.h
  TORCH_CHECK(cond, "cond was false here");
  TORCH_WARN("Warning message");
  ...
  END_HANDLE_TH_ERRORS_PYBIND
}
```
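
To make the bracketing pattern concrete, here is a toy version of such a macro pair. It is a sketch under stated assumptions, not the actual definition in Exceptions.h: `toy_error_indicator`, `toy_python_error`, `TOY_HANDLE_ERRORS`, `TOY_END_HANDLE_ERRORS` and `toy_run` are all hypothetical names, and a plain string stands in for the interpreter's error state.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the interpreter's error indicator.
// An empty string means "no error pending".
inline std::string& toy_error_indicator() {
  static std::string pending;
  return pending;
}

// Like python_error: carries no message because the indicator is already set.
struct toy_python_error : std::exception {};

// Opens a try block at the top of a binding.
#define TOY_HANDLE_ERRORS try {

// Closes the try block and translates whatever escaped into the indicator
// plus a nullptr return, which is how C-level bindings report failure.
#define TOY_END_HANDLE_ERRORS                 \
  }                                           \
  catch (const toy_python_error&) {           \
    assert(!toy_error_indicator().empty());   \
    return nullptr;                           \
  }                                           \
  catch (const std::exception& e) {           \
    toy_error_indicator() = e.what();         \
    return nullptr;                           \
  }

// A toy "binding" that returns nullptr on failure.
inline const char* toy_run(int arg) {
  TOY_HANDLE_ERRORS
  if (arg < 0) {
    // An ordinary C++ exception: the macro records its message.
    throw std::runtime_error("arg must be non-negative");
  }
  if (arg == 0) {
    // Simulates a failed API call that already set the indicator.
    toy_error_indicator() = "set by a failed API call";
    throw toy_python_error();
  }
  return "ok";
  TOY_END_HANDLE_ERRORS
}
```

The design choice worth noting is that the body of the binding can throw freely, while the caller only ever sees a return value and an error indicator.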

GIL

Whenever you make any calls to the Python API, you must have taken out the Python GIL, as none of these calls are thread safe. pybind11::gil_scoped_acquire is an RAII struct which handles taking and releasing the GIL. Use it like this:

```cpp
void iWantToUsePython() {
  pybind11::gil_scoped_acquire gil;
  ...
}
```

In general, the compiler will NOT warn you if you use Python functionality without taking out the GIL, so DO NOT FORGET this call.
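
The RAII idea can be sketched without Python at all. In the toy below, a std::mutex stands in for the GIL and every name (`toy_gil`, `toy_gil_held`, `toy_gil_scoped_acquire`, `toy_python_call`, `i_want_to_use_python`) is hypothetical. Unlike the real Python API, the stand-in call checks that the lock is held, which is exactly the check you do not get for free.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for the interpreter lock; not CPython's real GIL.
inline std::mutex& toy_gil() {
  static std::mutex m;
  return m;
}

// Tracks whether the current thread holds the toy lock.
inline bool& toy_gil_held() {
  thread_local bool held = false;
  return held;
}

// RAII guard in the spirit of pybind11::gil_scoped_acquire:
// the constructor takes the lock, the destructor gives it back.
class toy_gil_scoped_acquire {
 public:
  toy_gil_scoped_acquire() : lock_(toy_gil()) { toy_gil_held() = true; }
  ~toy_gil_scoped_acquire() {
    assert(toy_gil_held());  // the flag must still be set when unwinding
    toy_gil_held() = false;  // lock_ unlocks right after this body runs
  }
  toy_gil_scoped_acquire(const toy_gil_scoped_acquire&) = delete;
  toy_gil_scoped_acquire& operator=(const toy_gil_scoped_acquire&) = delete;

 private:
  std::unique_lock<std::mutex> lock_;
};

// Stand-in for any Python API call; it refuses to run without the lock.
inline int toy_python_call(int x) {
  if (!toy_gil_held()) {
    throw std::logic_error("called without the GIL");
  }
  return x + 1;
}

inline int i_want_to_use_python(int x) {
  toy_gil_scoped_acquire gil;  // held until the end of this scope
  return toy_python_call(x);
}
```

Because the guard releases the lock in its destructor, early returns and exceptions inside the scope cannot leak it.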

`utils/object_ptr.h`

THPPointer is a smart pointer class analogous to std::shared_ptr, but overloaded to handle the reference counting schemes of various objects which are not based on shared_ptr. The most important overloads are:

* PyObject (so important we've aliased it as THPObjectPtr), which hooks into Python reference counting. (By the way, that means you MUST take out the GIL before bringing one of these into scope!)
* The various TH tensor and storage types (e.g., THTensor), which hook into TH's reference counting. (TH's reference counting IS thread safe, no locks necessary.)
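
As a rough illustration of a smart pointer that delegates to a foreign reference counting scheme, consider the sketch below. It is not the real THPPointer: `ToyObject`, `toy_decref`, `ToyPointer`, `ToyObjectPtr` and `scoped_reference_demo` are hypothetical, the release function is passed as a template parameter purely for illustration, and the class is kept move-only for simplicity.

```cpp
#include <cassert>

// Hypothetical refcounted object standing in for something like PyObject.
struct ToyObject {
  int refcount = 1;  // a new object starts with one owned reference
};

// Drops one reference; a real scheme would free the object at zero.
inline void toy_decref(ToyObject* obj) {
  assert(obj->refcount > 0);
  --obj->refcount;
}

// Owns one reference and drops it on destruction via Release.
template <typename T, void (*Release)(T*)>
class ToyPointer {
 public:
  ToyPointer() = default;
  explicit ToyPointer(T* ptr) : ptr_(ptr) {}
  ToyPointer(ToyPointer&& other) noexcept : ptr_(other.release()) {}
  ToyPointer(const ToyPointer&) = delete;
  ToyPointer& operator=(const ToyPointer&) = delete;
  ~ToyPointer() {
    if (ptr_ != nullptr) {
      Release(ptr_);
    }
  }

  T* get() const { return ptr_; }

  // Gives up ownership without touching the reference count.
  T* release() {
    T* tmp = ptr_;
    ptr_ = nullptr;
    return tmp;
  }

 private:
  T* ptr_ = nullptr;
};

using ToyObjectPtr = ToyPointer<ToyObject, toy_decref>;

// Returns the refcount observed after the guard goes out of scope.
inline int scoped_reference_demo(ToyObject& obj) {
  {
    ToyObjectPtr guard(&obj);  // takes over the caller's reference
  }                            // destructor drops it here
  return obj.refcount;
}
```

With a real PyObject, the release step would touch Python's reference count, which is why the GIL must already be held when such a pointer comes into or goes out of scope.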