csrc
The csrc directory contains all of the code concerned with integration with Python. This is in contrast to lib, which contains the Torch libraries that are Python agnostic. csrc depends on lib, but not vice versa.
There are a number of utilities for easing integration with Python that are worth knowing about; we briefly describe them here. But first, the most important gotchas:
- DO NOT forget to take out the GIL with AutoGIL before calling the Python API or bringing a THPObjectPtr into scope.
- Make sure you include Python.h first in your header files, before any system headers; otherwise, you will get an 'error: "_XOPEN_SOURCE" redefined' error. If you pay attention to warnings, you will see where you need to do this.
Notes
Note [Storage is not nullptr]
Historically, Torch supported nullptr storage as a minor optimization to avoid allocating a storage object when it would be empty. However, this is a confusing special case to deal with, so, by and large, PyTorch assumes that storage is never nullptr.
One case where this assumption matters is tracking the CUDA device a tensor is stored on: this information is stored solely in the storage, so if a storage is nullptr, we lose it.
Although storage is never nullptr, the data field of THStorage may be nullptr. This mostly occurs when we want to pre-allocate an output tensor struct, but then have it be resized and filled with data by some operator: there's no point in allocating data for it in this case!
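The invariant in this note can be illustrated with a small sketch. The `StorageSketch` and `TensorSketch` types below are hypothetical stand-ins, not the real TH structs; they only show a storage object that always exists (and carries the device) while its data pointer may still be null.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a storage object. It always exists, but its
// data pointer may stay null until some operator resizes and fills it.
struct StorageSketch {
  int device = -1;       // -1 means "CPU" in this sketch only
  void* data = nullptr;  // nothing allocated yet
  std::size_t size = 0;
};

// Hypothetical tensor that upholds the invariant: storage is never null.
struct TensorSketch {
  std::shared_ptr<StorageSketch> storage;

  explicit TensorSketch(int device)
      : storage(std::make_shared<StorageSketch>()) {
    storage->device = device;
  }

  // The device is read from the storage, so it is available even for a
  // pre-allocated output tensor that holds no data yet.
  int device() const { return storage->device; }
};
```

With this layout, an empty output tensor created for device 1 still reports device 1, even though no data has been allocated for it.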
Files
Exceptions.h
Frequently when working with the Python API, you may call a function which returns an error. In this case, we want to return directly to the Python interpreter, so that the exception can be propagated accordingly; however, because the Python API is C-based, what actually happens is that control returns to whatever C++ code called it. Similarly, if we raise a C++ exception, then before returning to the Python interpreter we must set the Python error flags, so that it turns into a Python exception.
Exceptions.h defines some useful helpers: HANDLE_TH_ERRORS, END_HANDLE_TH_ERRORS, and an exception class python_error. You call them like this:
    // Entry point from Python interpreter
    PyObject* run() {
      HANDLE_TH_ERRORS
      ...
      if (!x) throw python_error();
      ...
      END_HANDLE_TH_ERRORS
    }
The HANDLE_TH_ERRORS macro will catch all exceptions and convert them into an appropriate Python signal. python_error is a special exception which doesn't contain any info; instead, it says, "An error occurred in the Python API; if you return to the interpreter, Python will raise that exception, nothing else needs to be done."
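To make the control flow concrete, here is a minimal, self-contained sketch of the same pattern that does not depend on Python.h. Everything in it is an assumption for illustration: `g_pending_error` stands in for the interpreter's error indicator, `python_error_sketch` for python_error, and the `SKETCH_*` macros only mirror the shape of the real ones; the actual definitions in Exceptions.h may differ.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Stand-in for the interpreter's error indicator (hypothetical).
static std::string g_pending_error;

// Stand-in for python_error: it carries no message because the error
// indicator has already been set by whoever detected the failure.
struct python_error_sketch : std::exception {};

// Hypothetical macros with the same open/close shape as the real pair.
#define SKETCH_HANDLE_ERRORS try {
#define SKETCH_END_HANDLE_ERRORS                                   \
  }                                                                \
  catch (const python_error_sketch&) {                             \
    return nullptr; /* indicator already set, just report failure */ \
  }                                                                \
  catch (const std::exception& e) {                                \
    g_pending_error = e.what(); /* translate the C++ exception */  \
    return nullptr;                                                \
  }

// Entry point as the interpreter would see it: nullptr signals failure.
const char* run(int mode) {
  SKETCH_HANDLE_ERRORS
  if (mode == 1) throw std::runtime_error("bad argument");
  if (mode == 2) {
    g_pending_error = "set by the Python API";
    throw python_error_sketch();
  }
  return "ok";
  SKETCH_END_HANDLE_ERRORS
}
```

In this sketch, `run(1)` returns nullptr with the pending error set from the C++ exception's message, while `run(2)` returns nullptr and leaves the already-set error untouched.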
utils/auto_gil.h
Whenever you make any calls to the Python API, you must have taken out the Python GIL, as none of these calls are thread safe. AutoGIL is an RAII struct which handles taking and releasing the GIL. Use it like this:
    void iWantToUsePython() {
      AutoGIL gil;
      ...
    }
In general, the compiler will NOT warn you if you use Python functionality without taking out the GIL, so DO NOT FORGET this call.
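The RAII idea behind such a guard can be sketched without Python at all. In the example below a plain `std::mutex` stands in for the GIL and `FakeAutoGIL` is a hypothetical name; a real guard would presumably wrap CPython calls such as `PyGILState_Ensure` and `PyGILState_Release` instead, which, unlike this mutex, may be re-entered by the same thread.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Stand-in for the GIL (hypothetical). Not re-entrant, unlike the real thing.
static std::mutex g_fake_gil;
// Number of live guards; only modified while the lock is held.
static int g_holders = 0;

// Acquire in the constructor, release in the destructor.
struct FakeAutoGIL {
  FakeAutoGIL() {
    g_fake_gil.lock();
    ++g_holders;
  }
  ~FakeAutoGIL() {
    --g_holders;
    g_fake_gil.unlock();
  }
  // Copying a guard would release the lock twice, so forbid it.
  FakeAutoGIL(const FakeAutoGIL&) = delete;
  FakeAutoGIL& operator=(const FakeAutoGIL&) = delete;
};

// Returns how many guards are alive while this function holds one.
int holdersWhileGuarded() {
  FakeAutoGIL gil;
  return g_holders;
}

// The lock is released during stack unwinding, even though we throw.
void throwWhileGuarded() {
  FakeAutoGIL gil;
  throw std::runtime_error("boom");
}
```

The point of the design is that every exit path, including an exception, releases the lock, so callers never need a matching manual release.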
utils/object_ptr.h
THPPointer is a smart pointer class analogous to std::shared_ptr, but which is overloaded to handle the reference counting schemes of various objects which are not based on shared_ptr. The most important overloads are:
- PyObject (so important we've aliased it as THPObjectPtr), which hooks into Python reference counting. (By the way, that means you MUST take out the GIL before bringing one of these into scope!)
- The various TH tensor and storage types (e.g., THTensor), which hook into TH's reference counting. (TH's reference counting IS thread safe, no locks necessary.)
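One way to picture the per-type overloads is a pointer template whose release step is specialized for each managed type. The sketch below is an assumption about the general shape, not the real THPPointer: `Counted`, `drop_ref`, and `PointerSketch` are hypothetical names, and the sketch is move-only for brevity.

```cpp
#include <cassert>
#include <utility>

// Hypothetical intrusively ref-counted object, standing in for PyObject
// or a TH type.
struct Counted {
  int refcount = 1;
};

// Stand-in for the type's own "drop one reference" routine.
inline void drop_ref(Counted* p) { --p->refcount; }

template <class T>
class PointerSketch {
 public:
  PointerSketch() = default;
  // Takes over one reference that the caller already owns.
  explicit PointerSketch(T* p) : ptr_(p) {}
  PointerSketch(PointerSketch&& other) noexcept : ptr_(other.release()) {}
  PointerSketch& operator=(PointerSketch&& other) noexcept {
    reset(other.release());
    return *this;
  }
  PointerSketch(const PointerSketch&) = delete;
  PointerSketch& operator=(const PointerSketch&) = delete;
  ~PointerSketch() { free(); }

  T* get() const { return ptr_; }
  // Gives up ownership without dropping the reference.
  T* release() {
    T* tmp = ptr_;
    ptr_ = nullptr;
    return tmp;
  }
  // Drops the current reference (if any) and adopts p.
  void reset(T* p = nullptr) {
    free();
    ptr_ = p;
  }

 private:
  void free();  // specialized for each managed type
  T* ptr_ = nullptr;
};

// The per-type hook: this is the only part that knows how Counted
// objects are reference counted.
template <>
inline void PointerSketch<Counted>::free() {
  if (ptr_) drop_ref(ptr_);
}
```

Supporting another type then only requires another specialization of `free()`; for a Python object that hook would touch the interpreter's reference counts, which is why the GIL must be held while such a pointer is in scope.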