This folder contains a number of scripts which are used as part of the PyTorch build process. This directory also doubles as a Python module hierarchy (thus the __init__.py).
Overview
Modern infrastructure:
- autograd - Code generation for autograd. This includes definitions of all our derivatives.
- jit - Code generation for JIT.
- shared - Generic infrastructure that scripts in tools may find useful.
- module_loader.py - Makes it easier to import arbitrary Python files in a script, without having to add them to the PYTHONPATH first.
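As a point of reference, the standard library can already import a Python file by path without touching sys.path. The sketch below shows that mechanism; it is illustrative only and does not reflect module_loader.py's actual interface (the function name import_file is made up).

```python
# Minimal sketch of importing a Python file by path, using standard
# importlib machinery. Illustrative only; not module_loader.py's real API.
import importlib.util

def import_file(module_name, file_path):
    # Build a module spec from the file location, bypassing sys.path entirely.
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Usage: cfg = import_file("build_config", "/path/to/build_config.py")
```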
Build system pieces:
- setup_helpers - Helper code for searching for third-party dependencies on the user's system (see the sketch after this list).
- build_pytorch_libs.py - Cross-platform script that builds all of the constituent libraries of PyTorch, but not the PyTorch Python extension itself.
- build_libtorch.py - Script for building libtorch, a standalone C++ library without Python support. This build script is tested in CI.
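For intuition, dependency discovery of this kind typically checks an explicit environment variable before probing well-known install prefixes. The sketch below is a hypothetical illustration in that spirit, not the real setup_helpers API; the function name, environment variable, and paths are assumptions.

```python
# Hypothetical illustration of dependency discovery: prefer an explicit
# environment variable, then probe common install prefixes. Not the real
# setup_helpers code; names and paths here are assumptions.
import os

def find_dependency_home(env_var, fallback_prefixes):
    explicit = os.environ.get(env_var)
    if explicit and os.path.isdir(explicit):
        return explicit
    for prefix in fallback_prefixes:
        if os.path.isdir(prefix):
            return prefix
    return None

# e.g. find_dependency_home("CUDA_HOME", ["/usr/local/cuda", "/opt/cuda"])
```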
Developer tools which you might find useful:
- git_add_generated_dirs.sh and git_reset_generated_dirs.sh - Use these to force-add generated files to your Git index, so that you can conveniently run diffs on them when working on code generation. (See also generated_dirs.txt, which specifies the list of directories with generated files.)
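The underlying workflow is simple enough to sketch in Python (the actual tools are shell scripts; the path to generated_dirs.txt and the function name below are assumptions):

```python
# Rough approximation of what git_add_generated_dirs.sh does; illustrative only.
import subprocess

def git_add_generated(list_file="tools/generated_dirs.txt"):  # path is an assumption
    with open(list_file) as f:
        dirs = [line.strip() for line in f if line.strip()]
    # -f overrides .gitignore so generated files enter the index and can be diffed.
    subprocess.check_call(["git", "add", "-f", *dirs])
```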
Important if you want to run on AMD GPUs:
- amd_build - HIPify scripts for transpiling CUDA into AMD HIP. Right now, PyTorch and Caffe2 share logic for how to do this transpilation, but have separate entry points for transpiling either PyTorch or Caffe2 code.
- build_amd.py - Top-level entry point for HIPifying our codebase.
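To illustrate the idea (not the actual amd_build implementation, which handles headers, kernel-launch syntax, and many more mappings), HIPify amounts to rewriting CUDA identifiers into their HIP equivalents:

```python
# Toy HIPify: rewrite a few CUDA API names to their HIP equivalents.
# The real amd_build scripts are far more sophisticated; this table is
# a small, hand-picked subset of real CUDA-to-HIP name mappings.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaStream_t": "hipStream_t",
}

def hipify(source: str) -> str:
    for cuda_name, hip_name in CUDA_TO_HIP.items():
        source = source.replace(cuda_name, hip_name)
    return source

print(hipify("cudaMalloc(&ptr, nbytes); cudaFree(ptr);"))
# -> hipMalloc(&ptr, nbytes); hipFree(ptr);
```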
Tools which are only situationally useful:
- docker - Dockerfile for running (but not developing) PyTorch, using the official conda binary distribution. Context: https://github.com/pytorch/pytorch/issues/1619
- download_mnist.py - Download the MNIST dataset; this is necessary if you want to run the C++ API tests.
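A minimal download sketch for the four classic MNIST archives follows; the mirror URL is an assumption, and download_mnist.py may use different URLs and options.

```python
# Minimal sketch of fetching the four classic MNIST archives.
# The mirror URL is an assumption; download_mnist.py may differ.
import urllib.request

MNIST_FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]

for name in MNIST_FILES:
    url = "https://ossci-datasets.s3.amazonaws.com/mnist/" + name
    print("downloading", url)
    urllib.request.urlretrieve(url, name)
```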