Allow autograd to work even when the shape of values cannot be determined (#8641)

This commit implements the solution proposed in https://github.com/pytorch/pytorch/issues/8410
to work around the need to create zero tensors with the same shape as inputs.
It introduces the concept of a LinearBlock, which marks places in the code
where we know that if all the inputs to the node are zero, then the outputs
of the node are also zero. Autodiff introduces LinearBlocks around
backwards functions, which have this property. The specializeUndef pass then
propagates Undef nodes using this information.
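
As a minimal standalone sketch (all names hypothetical, not the actual JIT
pass), the propagation rule specializeUndef applies to a LinearBlock boils
down to: if every input is statically known to be Undef, every output can be
marked Undef as well, so no zero tensor of a matching shape ever needs to be
materialized.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of a LinearBlock for this sketch: which inputs are
// statically known to be Undef (zero gradients), and how many outputs the
// block produces.
struct LinearBlockInfo {
  std::vector<bool> inputIsUndef;
  std::size_t numOutputs;
};

// The LinearBlock guarantee: if all inputs are zero (Undef), all outputs
// are zero too, so specialization can mark every output Undef.
std::vector<bool> propagateUndef(const LinearBlockInfo& block) {
  bool allUndef = true;
  for (bool u : block.inputIsUndef) {
    allUndef = allUndef && u;
  }
  return std::vector<bool>(block.numOutputs, allUndef);
}
```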

Notes:
* Since we do not always specialize, we have a pass LowerLinearBlocks
that replaces the block with an if statement that dynamically guards
the Undef case (see the first sketch after this list).
* We introduce AutogradAdd, an addition op that still works when
its inputs might be undefined. In cases where we specialize, it is
replaced by a normal add, but some gradient graphs do not specialize
(e.g. when they are not differentiable but a derivative is required),
so it is important for this op to be executable (see the second sketch
after this list).
Author: Zachary DeVito
Date: 2018-06-25 18:40:04 -07:00
Committed by: GitHub
Parent: 7a614799f7
Commit: f74207c99f
16 changed files with 367 additions and 207 deletions


@@ -768,12 +768,14 @@ main_sources = [
"torch/csrc/jit/passes/dead_code_elimination.cpp",
"torch/csrc/jit/passes/remove_expands.cpp",
"torch/csrc/jit/passes/lower_tuples.cpp",
"torch/csrc/jit/passes/lower_grad_of.cpp",
"torch/csrc/jit/passes/common_subexpression_elimination.cpp",
"torch/csrc/jit/passes/peephole.cpp",
"torch/csrc/jit/passes/inplace_check.cpp",
"torch/csrc/jit/passes/canonicalize.cpp",
"torch/csrc/jit/passes/batch_mm.cpp",
"torch/csrc/jit/passes/decompose_addmm.cpp",
"torch/csrc/jit/passes/specialize_undef.cpp",
"torch/csrc/jit/passes/erase_number_types.cpp",
"torch/csrc/jit/passes/loop_unrolling.cpp",
"torch/csrc/jit/passes/onnx/peephole.cpp",