Files
pytorch/torch/csrc/autograd/function_hook.h
rzou 5531fafffe [compiled autograd] Proxy opaque nodes for built-in autograd nodes (#143296)
This PR is a step toward getting compiled autograd's initial capture to
stop specializing on Tensor metadata.

This PR changes compiled autograd's initial capture to proxy an opaque
(w.r.t. Dynamo) function into the graph for all built-in codegen'ed
autograd nodes and validate_outputs.

We changed each codegen'ed apply_with_saved (e.g.
MulBackward0::apply_with_saved) to call into Python to proxy a function
(compiled_autograd.ops.MulBackward0) into the graph. Then, we use the
node's InputMetadata to "guess" the properties of the output Tensors and
create new FakeTensors with those properties.
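
To make the "guess" step concrete, here is a minimal sketch (not the real implementation: `GuessedOutputMeta` and `make_output_placeholder` are hypothetical stand-ins for InputMetadata and FakeTensor creation; the real flow produces FakeTensors, not eager empty tensors):

// Sketch only: a simplified stand-in for the metadata-guessing step.
#include <ATen/ATen.h>
#include <vector>

struct GuessedOutputMeta { // hypothetical stand-in for InputMetadata
  std::vector<int64_t> sizes;
  at::ScalarType dtype;
  at::Device device;
};

at::Tensor make_output_placeholder(const GuessedOutputMeta& meta) {
  // Build a tensor with the guessed shape/dtype/device; the real code
  // builds a FakeTensor with these properties instead.
  return at::empty(
      meta.sizes,
      at::TensorOptions().dtype(meta.dtype).device(meta.device));
}

// Usage: an output expected to be a 2x3 float tensor on CPU.
// auto placeholder = make_output_placeholder(
//     {{2, 3}, at::kFloat, at::Device(at::kCPU)});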

Some details:
- MulBackward0::apply_with_saved lives in libtorch_cpu, but needs to
  call into Python via libtorch_python. There is an indirection
  (PyCompilerInterface) to do this.
- MulBackward0::apply_with_saved passes a C++ function to Python. To make
  our lives easier, every codegen'ed apply_with_saved passes a C++
  function with the same signature
  `(variable_list, ivalue_list) -> variable_list`.
- We define how to pack arbitrary C++ types into IValue via a helper
  IValuePacker struct, and codegen functional variants of each built-in
  C++ autograd node (e.g. MulBackward0_apply_functional_ivalue); see the
  sketch after this list.
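
For illustration only, the packing/functional pattern looks roughly like the sketch below; the IValuePacker body and the MulBackward0 formula are simplified stand-ins, not the generated code:

// Sketch only: hand-written analogue of the codegen'ed pieces described
// above, showing the (variable_list, ivalue_list) -> variable_list shape.
#include <ATen/Tensor.h>
#include <ATen/core/ivalue.h>
#include <vector>

using variable_list = std::vector<at::Tensor>;
using ivalue_list = std::vector<c10::IValue>;

// Defines how a given C++ type crosses the IValue boundary.
template <typename T>
struct IValuePacker {
  static c10::IValue pack(const T& value) {
    return c10::IValue(value);
  }
  static T unpack(const c10::IValue& ivalue) {
    return ivalue.to<T>();
  }
};

// Functional variant of MulBackward0: all saved state arrives through the
// ivalue_list, so no Node object is needed to evaluate the backward formula.
variable_list MulBackward0_apply_functional_ivalue(
    const variable_list& grads,
    const ivalue_list& saved) {
  auto self = IValuePacker<at::Tensor>::unpack(saved[0]);
  auto other = IValuePacker<at::Tensor>::unpack(saved[1]);
  // d(self * other)/d(self) = other, d(self * other)/d(other) = self
  return {grads[0] * other, grads[0] * self};
}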

MulBackward0 before this PR:
https://gist.github.com/zou3519/a80381d5fa38e970e413fcd91b0530de

MulBackward0 after this PR:
https://gist.github.com/zou3519/0c2eee8b3d8d96232b51ef430b53c5b0

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143296
Approved by: https://github.com/jansel
2025-01-22 21:50:29 +00:00

66 lines
2.0 KiB
C++

#pragma once

#include <ATen/Tensor.h>
#include <torch/csrc/Export.h>
#include <string>
#include <vector>

namespace torch::dynamo::autograd {
class CompiledNodeArgs;
class SwapSavedVariables;
struct PackedArgs;
} // namespace torch::dynamo::autograd

// A hook that's called on gradients

namespace torch::autograd {

using Variable = at::Tensor;
using variable_list = std::vector<Variable>;

struct TORCH_API FunctionPreHook {
  virtual ~FunctionPreHook() = default;
  virtual variable_list operator()(const variable_list& grads) = 0;
  // only implemented for python hooks, registers hook with compiled autograd
  virtual void compiled_args(torch::dynamo::autograd::CompiledNodeArgs& args) {
    throw std::runtime_error(
        std::string("compiled_args nyi, see [Note: Compiled Autograd] ") +
        typeid(*this).name());
  }
};

struct TORCH_API FunctionPostHook {
  virtual ~FunctionPostHook() = default;
  virtual variable_list operator()(
      const variable_list& outputs /* grad_inputs */,
      const variable_list& inputs /* grad_outputs */) = 0;
  // only implemented for python hooks, registers hook with compiled autograd
  virtual void compiled_args(torch::dynamo::autograd::CompiledNodeArgs& args) {
    throw std::runtime_error(
        std::string("compiled_args nyi, see [Note: Compiled Autograd] ") +
        typeid(*this).name());
  }
};

struct TORCH_API PostAccumulateGradHook {
  virtual ~PostAccumulateGradHook() = default;
  virtual void operator()(const Variable& tensor) = 0;
  // only implemented for python hooks on nodes, registers hook with compiled
  // autograd
  virtual void compiled_args(torch::dynamo::autograd::CompiledNodeArgs& args) {
    throw std::runtime_error(
        std::string("not yet implemented for compiled autograd: ") +
        typeid(*this).name());
  }
  virtual void apply_with_saved(
      Variable&,
      torch::dynamo::autograd::SwapSavedVariables&) {
    throw std::runtime_error(
        std::string("not yet implemented for compiled autograd: ") +
        typeid(*this).name());
  }
};

} // namespace torch::autograd
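
As context for how these hooks are used, a minimal hypothetical pre-hook built on this interface (not part of this file; `ScaleGradPreHook` is an invented example) might look like:

#include <torch/csrc/autograd/function_hook.h>

// Hypothetical example: a pre-hook that scales incoming gradients. It
// overrides only operator(), so the inherited compiled_args() fallback
// above would throw if this hook were captured by compiled autograd.
struct ScaleGradPreHook : torch::autograd::FunctionPreHook {
  explicit ScaleGradPreHook(double scale) : scale_(scale) {}

  torch::autograd::variable_list operator()(
      const torch::autograd::variable_list& grads) override {
    torch::autograd::variable_list out;
    out.reserve(grads.size());
    for (const auto& g : grads) {
      // Undefined tensors mark missing gradients; pass them through untouched.
      out.push_back(g.defined() ? g * scale_ : g);
    }
    return out;
  }

 private:
  double scale_;
};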