pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

soulitzer 8bda95228f [autograd] Avoid creating and recording event when unnecessary (#157503 )

Today, we always create and record an event in two places:
1) Upon seeing the first producer, we record an event on the producer, and we wait for this event in two places: (1) when the engine goes to run the consumer, the consumer stream waits for this event. (2) prior to doing accumulation, the accumulation stream waits for this event.

2) After doing accumulation, we record an event on the accumulation stream and wait for this event in a single place: when the engine goes to run the consumer.

We do not actually need to record these events when the relevant streams coincide: the first event is unnecessary when the 1st producer stream is the same as both the consumer stream and the accumulation stream, and the second event is unnecessary when the accumulation stream is the same as the consumer stream.

Removing this unnecessary event creation and recording should save a few microseconds for each instance avoided.
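
As a rough illustration of the two conditions above, here is a minimal Python sketch. It is not the engine code from this PR: streams are modeled as plain integer ids, and `SyncPlan` and `plan_events` are hypothetical names introduced only for this example. It assumes one reading of the description, namely that each event can be skipped exactly when every stream that would wait on it is the stream it would be recorded on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPlan:
    """Which of the two events described above still has to be recorded."""
    record_producer_event: bool
    record_accumulation_event: bool


def plan_events(producer: int, accumulation: int, consumer: int) -> SyncPlan:
    """Decide which events are needed; streams are plain integer ids here."""
    # Event 1 is waited on by the consumer stream and by the accumulation
    # stream, so it is redundant only if both are the producer stream itself.
    need_producer_event = not (producer == consumer and producer == accumulation)
    # Event 2 is waited on only by the consumer stream, so it is redundant
    # when accumulation already happens on the consumer stream.
    need_accumulation_event = accumulation != consumer
    return SyncPlan(need_producer_event, need_accumulation_event)


# Everything on one stream: neither event is needed.
single_stream = plan_events(producer=0, accumulation=0, consumer=0)
# Producer on a side stream: only the first event is needed.
side_producer = plan_events(producer=1, accumulation=0, consumer=0)
```

The common single-stream case is the one where both the create and the record are skipped, which is where the per-instance saving would come from.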

Fixes https://github.com/pytorch/pytorch/issues/157407

----

Manual test plan:
- [x] @eqy to confirm perf is restored
- [x] Running the repro originally reported before/after the patch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157503
Approved by: https://github.com/eqy
ghstack dependencies: #155715

2025-07-09 03:36:14 +00:00

functions

[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319 )

2025-06-23 02:57:50 +00:00

utils

[ddp] propagate use_python_reducer to C++ reducer (#152735 )

2025-05-16 01:38:03 +00:00

anomaly_mode.cpp

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

anomaly_mode.h

[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319 )

2025-06-23 02:57:50 +00:00

autograd_meta.cpp

[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319 )

2025-06-23 02:57:50 +00:00

autograd_not_implemented_fallback.cpp

[14/N] Fix extra warnings brought by clang-tidy-17 (#141644 )

2024-12-13 06:22:13 +00:00

autograd_not_implemented_fallback.h

…

autograd.cpp

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

autograd.h

…

cpp_hook.cpp

[ca] introduce RuntimeState to support c++ hooks via graph breaks (#149987 )

2025-03-27 05:05:34 +00:00

cpp_hook.h

[ca] introduce RuntimeState to support c++ hooks via graph breaks (#149987 )

2025-03-27 05:05:34 +00:00

custom_function.cpp

Add missing in-place on view check to custom autograd.Function (#153094 )

2025-05-12 14:42:46 +00:00

custom_function.h

[reland][ca] side-effect free inital trace: compiled_args (#148376 )

2025-03-11 01:57:36 +00:00

edge.h

…

engine.cpp

[autograd] Avoid creating and recording event when unnecessary (#157503 )

2025-07-09 03:36:14 +00:00

engine.h

[ca] side-effect free initial trace: GraphTask (#147796 )

2025-02-26 16:37:27 +00:00

forward_grad.cpp

[Distributed] [13/N] Fix clang-tidy warnings in torch/csrc/distributed/ (#136713 )

2024-09-27 10:11:53 +00:00

forward_grad.h

[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319 )

2025-06-23 02:57:50 +00:00

function_hook.h

[ca] suggest to disable compiled autograd for trace-time NotImplementedErrors (#156509 )

2025-06-21 18:33:46 +00:00

function.cpp

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

function.h

[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319 )

2025-06-23 02:57:50 +00:00

FunctionsManual.cpp

Enable Half dtype for logcumsumexp_backward (#157512 )

2025-07-03 18:13:38 +00:00

FunctionsManual.h

Fix 'dllimport attribute ignored on inline function' (#157670 )

2025-07-07 16:57:48 +00:00

grad_mode.h

…

graph_task.h

[2/N] Enable cppcoreguidelines-special-member-functions (#138670 )

2024-10-24 04:35:18 +00:00

InferenceMode.h

…

init.cpp

Add is_hidden_event method to KinetoEvent Python interface (#155214 )

2025-07-02 16:29:21 +00:00

input_buffer.cpp

[autograd] Avoid creating and recording event when unnecessary (#157503 )

2025-07-09 03:36:14 +00:00

input_buffer.h

Rewrite autograd producer consumer stream sync logic (#151079 )

2025-05-16 15:42:22 +00:00

input_metadata.cpp

used guard_or_false instead of guard_size_oblivious inside maybe_reduce (#154172 )

2025-05-26 21:59:52 +00:00

input_metadata.h

…

jit_decomp_interface.cpp

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

jit_decomp_interface.h

[Lint] Update clang-format to 19.1.4 (#153889 )

2025-05-20 14:12:46 +00:00

profiler_kineto.cpp

Add is_hidden_event method to KinetoEvent Python interface (#155214 )

2025-07-02 16:29:21 +00:00

profiler_kineto.h

Add is_hidden_event method to KinetoEvent Python interface (#155214 )

2025-07-02 16:29:21 +00:00

profiler_legacy.cpp

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

profiler_legacy.h

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

profiler_python.cpp

Fix profiler on cpython-3.13 (#153848 )

2025-05-19 21:20:53 +00:00

profiler_python.h

…

profiler.h

…

python_anomaly_mode.cpp

More nogil unsafe API fix (#137142 )

2024-10-04 21:56:34 +00:00

python_anomaly_mode.h

[8/N] Fix extra warnings brought by clang-tidy-17 (#139151 )

2024-10-30 14:20:08 +00:00

python_autograd.h

[11/N] Fix extra warnings brought by clang-tidy-17 (#139599 )

2024-11-04 23:57:41 +00:00

python_cpp_function.cpp

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

python_cpp_function.h

Expose several APIs to public (torch python APIs) (#144525 )

2025-01-15 14:34:45 +00:00

python_engine.cpp

Fix clang-tidy bugprone* warnings (#148529 )

2025-06-23 23:09:56 +00:00

python_engine.h

[3/N] Fix clang-tidy warnings in torch/csrc/autograd (#133389 )

2024-08-16 00:57:54 +00:00

python_enum_tag.h

…

python_fft_functions.h

[11/N] Fix extra warnings brought by clang-tidy-17 (#139599 )

2024-11-04 23:57:41 +00:00

python_function.cpp

Fix clang-tidy bugprone* warnings (#148529 )

2025-06-23 23:09:56 +00:00

python_function.h

[reland][ca] side-effect free inital trace: compiled_args (#148376 )

2025-03-11 01:57:36 +00:00

python_hook.cpp

[reland][ca] side-effect free inital trace: compiled_args (#148376 )

2025-03-11 01:57:36 +00:00

python_hook.h

[reland][ca] side-effect free inital trace: compiled_args (#148376 )

2025-03-11 01:57:36 +00:00

python_legacy_variable.cpp

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

python_legacy_variable.h

…

python_linalg_functions.h

[11/N] Fix extra warnings brought by clang-tidy-17 (#139599 )

2024-11-04 23:57:41 +00:00

python_nested_functions_manual.cpp

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

python_nested_functions.h

[8/N] Fix extra warnings brought by clang-tidy-17 (#139151 )

2024-10-30 14:20:08 +00:00

python_nn_functions.h

Use clang-tidy 17 (#139678 )

2024-11-05 16:00:25 +00:00

python_saved_variable_hooks.cpp

[ca] trace saved variable unpacking (#147242 )

2025-02-26 16:37:17 +00:00

python_saved_variable_hooks.h

[ca] trace saved variable unpacking (#147242 )

2025-02-26 16:37:17 +00:00

python_sparse_functions.h

[11/N] Fix extra warnings brought by clang-tidy-17 (#139599 )

2024-11-04 23:57:41 +00:00

python_special_functions.h

[11/N] Fix extra warnings brought by clang-tidy-17 (#139599 )

2024-11-04 23:57:41 +00:00

python_torch_functions_manual.cpp

[aotd] Support mutations of the same input in fw and bw (#155354 )

2025-06-26 14:05:54 +00:00

python_torch_functions.h

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

python_variable_indexing.cpp

fix warning spam for list indexing (#155815 )

2025-06-12 23:07:24 +00:00

python_variable_indexing.h

[2/N] Fix clang-tidy warnings in torch/csrc/autograd (#133295 )

2024-08-13 13:23:46 +00:00

python_variable.cpp

Fix clang-tidy bugprone* warnings (#148529 )

2025-06-23 23:09:56 +00:00

python_variable.h

Use Wextra-semi (#140236 )

2024-11-13 02:15:16 +00:00

README.md

Rename torch::autograd::Function to torch::autograd::Node

2019-07-23 20:52:22 -07:00

record_function_ops.cpp

Enable misc-use-internal-linkage check and apply fixes (#148948 )

2025-03-12 14:22:56 +00:00

record_function_ops.h

…

saved_variable_hooks.h

[ca] trace saved variable unpacking (#147242 )

2025-02-26 16:37:17 +00:00

saved_variable.cpp

[aotd] Support saved tensors hooks in aot_autograd (#150032 )

2025-05-22 14:09:38 +00:00

saved_variable.h

[ca] trace saved variable unpacking (#147242 )

2025-02-26 16:37:17 +00:00

symbolic.h

…

TraceTypeManual.cpp

[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319 )

2025-06-23 02:57:50 +00:00

variable_info.cpp

Fix autograd.Function + NJT when an output grad is None (#136875 )

2024-10-14 19:31:50 +00:00

variable_info.h

Fix autograd.Function + NJT when an output grad is None (#136875 )

2024-10-14 19:31:50 +00:00

variable.cpp

[8/N] Fix extra warnings brought by clang-tidy-17 (#139151 )

2024-10-30 14:20:08 +00:00

variable.h

Revert "Enable Leak Sanitizer (#154584 )"

2025-06-23 10:08:40 +00:00

VariableTypeManual.cpp

Revert "Enable Leak Sanitizer (#154584 )"

2025-06-23 10:08:40 +00:00

VariableTypeUtils.h

c10::string_view -> std::string_view in autograd (#142354 )

2024-12-10 15:43:41 +00:00

README.md

Autograd

Autograd is a hotspot for PyTorch performance, so most of the heavy lifting is implemented in C++. This means we have to do some shuffling between Python and C++, and in general we want data to be in a form that is convenient to manipulate from C++.

Our general model is that for any key data type that autograd manipulates, there are two implementations: a C++ type and a Python object type. For example, consider variables in autograd: we have both Variable in variable.h (the C++ type) and THPVariable in python_variable.h (the Python type). (By the way, THP stands for TorcH Python, not to be confused with THPP, TorcH C++.) Variable contains the payload of a variable, while THPVariable just contains a shared_ptr reference to Variable, as well as references to other Python objects which the Python runtime needs to know about. Many of the data accessor implementations in python_variable.cpp simply reach through to the underlying Variable and return the appropriate value.
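
The split between a payload type and a thin wrapper can be pictured with a small pure-Python analogy. This is only a sketch: the real types are a C++ class and a CPython object struct, and the field name `cdata` and the constructor arguments below are assumptions made for illustration, with an ordinary Python reference standing in for the shared_ptr.

```python
class Variable:
    """Stand-in for the C++ type: owns the payload."""

    def __init__(self, data, requires_grad=False):
        self.data = data
        self.requires_grad = requires_grad


class THPVariable:
    """Stand-in for the Python object type: holds a reference, not the payload."""

    def __init__(self, cdata):
        # Plays the role of the shared_ptr reference to Variable.
        self.cdata = cdata

    @property
    def data(self):
        # Accessors reach through to the underlying Variable.
        return self.cdata.data

    @property
    def requires_grad(self):
        return self.cdata.requires_grad


payload = Variable([1.0, 2.0])
first = THPVariable(payload)
second = THPVariable(payload)

# Both wrappers observe a change made to the shared payload.
payload.requires_grad = True
```

Because the wrapper stores no state of its own, anything changed on the payload is visible through every wrapper that refers to it.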

The most complicated application of this principle is Function, which also supports users implementing custom behavior in Python. We have the following classes:

- Node in function.h, the C++ type.
- THPFunction in python_function.h, the Python object type. In python_function.cpp, you can see the boilerplate that tells the Python interpreter about this object.
- PyNode in python_function.h, a subclass of Node which forwards apply to a Python THPFunction. (NOT a Python object, despite its name!)
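
The relationship between the three classes can be sketched as follows. This is a pure-Python analogy rather than the real definitions: the `backward` attribute, the constructor signatures, and the `run_backward` loop are hypothetical, chosen only to show a Node subclass forwarding `apply` to an object that carries user-defined behavior.

```python
class Node:
    """Stand-in for the C++ Node: one step of the backward graph."""

    def apply(self, grads):
        raise NotImplementedError


class THPFunction:
    """Stand-in for the Python object carrying user-defined backward logic."""

    def __init__(self, backward):
        self.backward = backward


class PyNode(Node):
    """A Node subclass whose apply forwards to a THPFunction."""

    def __init__(self, fn):
        self.fn = fn

    def apply(self, grads):
        return self.fn.backward(grads)


def run_backward(nodes, grads):
    """Toy engine loop: it only ever sees the Node interface."""
    for node in nodes:
        grads = node.apply(grads)
    return grads


double = PyNode(THPFunction(lambda grads: [2 * g for g in grads]))
result = run_backward([double, double], [1.0, 3.0])  # [4.0, 12.0]
```

The point of the design is that the loop driving the backward pass needs no knowledge of Python: it calls `apply` on a Node, and only PyNode knows that the work is delegated elsewhere.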

Outside of PyNode, the C++ objects largely avoid referencing Python objects. There are a few exceptions: pyobj in Variable; PyNode, whose whole point is to let C++ call into Python; and pyobj in Node, which ensures uniqueness of the associated Python wrapper (if it exists).