625 Commits

Author SHA1 Message Date
f4f1a5b5b3 Revert "Move functional collectives to the right namespace (#97793)"
This reverts commit 184bfbc3d7b37e8f202f4938f6ea9ba557c93b1e.

Reverted https://github.com/pytorch/pytorch/pull/97793 on behalf of https://github.com/atalman due to breaks internal builds
2023-03-31 16:02:07 +00:00
184bfbc3d7 Move functional collectives to the right namespace (#97793)
This moves them from `torch._C._nn` to `torch._C._dist`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97793
Approved by: https://github.com/albanD
2023-03-30 22:18:13 +00:00
5ab50cf048 Fix shoud/shoudl typos (#97930)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97930
Approved by: https://github.com/clee2000
2023-03-30 08:27:16 +00:00
6b691b99da add amp support for custom backend (#96188)
1. Add AMP support for custom backends.
2. Optimize the file `backend_registration.py` and rename it to `custom_backend_registration.py`; further functions for custom backends will be registered there.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96188
Approved by: https://github.com/bdhirsh
2023-03-20 20:27:35 +00:00
a8f36dd646 Revert "add amp support for custom backend (#96188)"
This reverts commit cf12edee02a44009c4f06e36efa97d9a7372ab35.

Reverted https://github.com/pytorch/pytorch/pull/96188 on behalf of https://github.com/kit1980 due to Broke some linalg tests : https://github.com/pytorch/pytorch/actions/runs/4420037607/jobs/7750708339
2023-03-15 00:03:19 +00:00
cf12edee02 add amp support for custom backend (#96188)
1. Add AMP support for custom backends.
2. Optimize the file `backend_registration.py` and rename it to `custom_backend_registration.py`; further functions for custom backends will be registered there.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96188
Approved by: https://github.com/bdhirsh
2023-03-14 20:43:21 +00:00
da265652d6 Return Live Data Pointers from Checkpoint, swap onto tensors (#95020)
When we checkpoint the state of the private pool allocator, we will need to make sure that its current live allocated blocks will get properly cleaned up when the tensors they correspond to die. Return DataPtrs for these new allocated blocks that the callee can swap onto live Tensors.

The exact API for setting the checkpoint can be changed later as the cudagraph implementation is built out, but this at least shows that it's sufficiently general.

This should be the last PR touching the CUDA caching allocator that is necessary for the new cudagraphs integration.

Differential Revision: [D43999888](https://our.internmc.facebook.com/intern/diff/D43999888)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95020
Approved by: https://github.com/zdevito
2023-03-14 01:22:19 +00:00
cyy
f27e09de04 Cleanup Windows warning suppression in CMake and fix some warnings in the source code (#94927)
This PR does two things:
1. It moves some Windows warning suppression from various CMake files into the main CMakeLists.txt, following the conventions of gcc and clang.
2. It fixes some Windows warnings in the source code. Most importantly, it fixes lots of dll warnings by adjusting C10_API to TORCH_API or TORCH_PYTHON_API. There are still some dll warnings because some TORCH_API functions are actually built as part of libtorch_python.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94927
Approved by: https://github.com/malfet
2023-02-27 19:22:20 +00:00
bdd8f518d7 [MPS] Add Python Module Bindings for the MPS backend (#94417)
- This PR is a prerequisite for the upcoming Memory Leak Detection PR.
- Enable global manual seeding via `torch.manual_seed()` + test case
- Add `torch.mps.synchronize()` to wait for MPS stream to finish + test case
- Enable the following python interfaces for MPS:
  `torch.mps.[get_rng_state(), set_rng_state(), synchronize(), manual_seed(), seed()]`
- Added some test cases in test_mps.py
- Added `mps.rst` to document the `torch.mps` module.
- Fixed the failure with `test_public_bindings.py`

Description of new files added:
- `torch/csrc/mps/Module.cpp`: implements `torch._C` module functions for `torch.mps` and `torch.backends.mps`.
- `torch/mps/__init__.py`: implements Python bindings for `torch.mps` module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94417
Approved by: https://github.com/albanD
2023-02-12 21:22:30 +00:00
4fe365774a Revert "[MPS] Add Python Module Bindings for the MPS backend (#94417)"
This reverts commit beb4f5bf396ec2d53defa73c81aac48c38360544.

Reverted https://github.com/pytorch/pytorch/pull/94417 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it seems to break MacOS test in trunk bae397ec63
2023-02-11 05:24:45 +00:00
beb4f5bf39 [MPS] Add Python Module Bindings for the MPS backend (#94417)
- This PR is a prerequisite for the upcoming Memory Leak Detection PR.
- Enable global manual seeding via `torch.manual_seed()` + test case
- Add `torch.mps.synchronize()` to wait for MPS stream to finish + test case
- Enable the following python interfaces for MPS:
  `torch.mps.[get_rng_state(), set_rng_state(), synchronize(), manual_seed(), seed()]`
- Added some test cases in test_mps.py
- Added `mps.rst` to document the `torch.mps` module.
- Fixed the failure with `test_public_bindings.py`

Description of new files added:
- `torch/csrc/mps/Module.cpp`: implements `torch._C` module functions for `torch.mps` and `torch.backends.mps`.
- `torch/mps/__init__.py`: implements Python bindings for `torch.mps` module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94417
Approved by: https://github.com/albanD
2023-02-10 23:18:41 +00:00
70f4b3551c Add Hook to store arbitrary python objects that are copied over in tls (#89169)
For the cudagraphs implementation, we would like to reuse objects that are defined in python across the forward and backward. The backward is run in a different thread, so to handle this we add an api for copying over arbitrary python objects in pytorch's thread local state, in the same way that C++ objects are copied over currently.
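
The idea of carrying Python objects across the forward/backward thread boundary can be sketched in pure Python. This is only an illustration of the mechanism, assuming a hypothetical `threading.local` store; the helper names (`set_tls_object`, `get_tls_object`, `run_in_worker`) are invented for the sketch and are not the API added by this PR.

```python
import threading

# Hypothetical stand-in for a thread-local state store; not PyTorch's API.
_tls = threading.local()


def set_tls_object(key, value):
    """Store an arbitrary Python object in the current thread's state."""
    if not hasattr(_tls, "objects"):
        _tls.objects = {}
    _tls.objects[key] = value


def get_tls_object(key, default=None):
    """Look up an object in the current thread's state."""
    return getattr(_tls, "objects", {}).get(key, default)


def run_in_worker(fn):
    """Run fn in a new thread after copying the caller's thread-local objects."""
    captured = dict(getattr(_tls, "objects", {}))  # snapshot in the calling thread
    result = {}

    def target():
        _tls.objects = dict(captured)  # install the snapshot in the worker
        result["value"] = fn()

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return result["value"]


# The "forward" thread stores an object; a worker started through
# run_in_worker sees the same object, a plain thread does not.
shared = {"graph_id": 7}
set_tls_object("cudagraph_state", shared)
seen_in_worker = run_in_worker(lambda: get_tls_object("cudagraph_state"))

unpropagated = []
plain = threading.Thread(
    target=lambda: unpropagated.append(get_tls_object("cudagraph_state"))
)
plain.start()
plain.join()
```

The snapshot copies the mapping, not the objects, so both threads refer to the same Python object, which is the reuse the commit message asks for.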

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89169
Approved by: https://github.com/albanD
2023-01-24 05:24:57 +00:00
b3e4f5029b Add check-sparse-tensor-invariants flag to Context - 2nd try. (#92094)
This PR is a copy of https://github.com/pytorch/pytorch/pull/90849, whose merge was reverted.

The PR adds "check sparse tensor invariants" flag to Context that when enabled will trigger sparse tensor data invariants checks in unsafe methods of constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following changes to UI:

`torch.sparse.check_sparse_tensor_invariants` class provides different ways to enable/disable the invariant checking.

`torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.
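
The "global flag, temporarily overridden by a per-call argument" behaviour can be sketched without PyTorch. Everything below is a toy under stated assumptions: `make_csr`, `override_check_invariants` and the two checks it performs are hypothetical stand-ins, not the real constructors or the real invariant checks.

```python
from contextlib import contextmanager

# Hypothetical module-level flag standing in for the Context flag.
_check_invariants_enabled = False


def is_checking_enabled():
    return _check_invariants_enabled


@contextmanager
def override_check_invariants(value):
    """Temporarily set the global flag, restoring the previous value on exit."""
    global _check_invariants_enabled
    previous = _check_invariants_enabled
    _check_invariants_enabled = value
    try:
        yield
    finally:
        _check_invariants_enabled = previous


def make_csr(crow_indices, col_indices, values, check_invariants=None):
    """Toy 'unsafe' constructor: validates only while checking is enabled."""

    def build():
        if is_checking_enabled():
            if crow_indices[0] != 0 or crow_indices[-1] != len(col_indices):
                raise ValueError("crow_indices must start at 0 and end at nnz")
            if len(col_indices) != len(values):
                raise ValueError("col_indices and values must have equal length")
        return (tuple(crow_indices), tuple(col_indices), tuple(values))

    if check_invariants is None:
        return build()  # follow the global state
    with override_check_invariants(check_invariants):
        return build()  # explicit argument wins for this call only


bad = ([0, 1, 5], [0, 1], [1.0, 2.0])  # crow_indices[-1] != nnz
unchecked = make_csr(*bad)  # global flag is off: silently accepted
try:
    make_csr(*bad, check_invariants=True)
    rejected = False
except ValueError:
    rejected = True
```

After the checked call returns or raises, the global flag is back at its previous value, so later calls without the argument are unaffected.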

The PR fixes https://github.com/pytorch/pytorch/issues/90833

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92094
Approved by: https://github.com/cpuhrsch
2023-01-13 14:50:33 +00:00
c7a22bb7c7 Revert "Add check-sparse-tensor-invariants flag to Context. (#90849)"
This reverts commit b9a035c1c58630f3eef5242cb4849881b8376b39.

Reverted https://github.com/pytorch/pytorch/pull/90849 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-12 09:58:16 +00:00
b8252e07c7 [Reland] add DisableTorchFunction that matches DisableTorchDispatch (#88219) (#92012)
Reland of #88219

Closes #87990. This implements a new disable guard that matches DisableTorchDispatch (disables all subclasses and modes)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92012
Approved by: https://github.com/albanD
2023-01-12 01:27:47 +00:00
b9a035c1c5 Add check-sparse-tensor-invariants flag to Context. (#90849)
This PR adds "check sparse tensor invariants" flag to Context that when enabled will trigger sparse tensor data invariants checks in unsafe methods of constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following changes to UI:

- `torch.enable_check_sparse_tensor_invariants` and `torch.is_check_sparse_tensor_invariants_enabled` functions to globally enable/disable the invariant checks and to retrieve the state of the feature, respectively
- `torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.

The PR also fixes https://github.com/pytorch/pytorch/issues/90833

# Main issue

*The following content is outdated after merging the PRs in this ghstack but kept for the record.*

The importance of this feature is that when enabling the invariants checks by default, say, via

<details>

```
$ git diff
diff --git a/torch/__init__.py b/torch/__init__.py
index c8543057c7..19a91d0482 100644
--- a/torch/__init__.py
+++ b/torch/__init__.py
@@ -1239,3 +1239,8 @@ if 'TORCH_CUDA_SANITIZER' in os.environ:

 # Populate magic methods on SymInt and SymFloat
 import torch.fx.experimental.symbolic_shapes
+
+# temporarily enable sparse tensor arguments validation in unsafe
+# constructors:
+
+torch._C._set_check_sparse_tensor_invariants(True)
```

</details>

a massive number of test failures/errors occur in test_sparse_csr.py tests:
```
$ pytest -sv test/test_sparse_csr.py
<snip>
==== 4293 failed, 1557 passed, 237 skipped, 2744 errors in 69.71s (0:01:09) ====
```
that means that we are silently constructing sparse compressed tensors that do not satisfy the sparse tensor invariants. In particular, the following errors are raised:

```
AssertionError: "resize_as_sparse_compressed_tensor_: self and src must have the same layout" does not match "expected values to be a strided and contiguous tensor"

RuntimeError: CUDA error: device-side assert triggered

RuntimeError: `col_indices[..., crow_indices[..., i - 1]:crow_indices[..., i]] for all i = 1, ..., nrows are sorted and distinct along the last dimension values` is not satisfied.

RuntimeError: expected col_indices to be a strided and contiguous tensor

RuntimeError: expected row_indices to be a strided and contiguous tensor

RuntimeError: expected values to be a strided and contiguous tensor

RuntimeError: for_each: failed to synchronize: cudaErrorAssert: device-side assert triggered

RuntimeError: tensor dimensionality must be sum of batch, base, and dense dimensionalities (=0 + 2 + 0) but got 3
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90849
Approved by: https://github.com/amjames, https://github.com/cpuhrsch
2023-01-11 01:05:14 +00:00
a7749ae177 [reland] rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218) (#89221)
Summary: First half of #87990. This doesn't change any of the behavior and is just a rename

#88218 was reverted because of internal breakages. This is the reland, started from internal.

Differential Revision:
D41268423

LaMa Project: L1098534

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89221
Approved by: https://github.com/meliy-meyada, https://github.com/zou3519
2023-01-04 18:32:49 +00:00
8b617f813d [cuBLAS] Add an option to disable reduced precision reductions for BF16 GEMM (#89172)
Essentially the same change as #67946, except that the default is to disallow reduced precision reductions in `BFloat16` GEMMs (for now). If performance is severely regressed, we can change the default, but this option appears to be necessary to pass some `addmm` `BFloat16` tests on H100.
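
Why the precision of the accumulator matters can be seen in a toy pure-Python model. This is not what cuBLAS does internally; it only emulates bfloat16 rounding with `struct` and compares a running sum that is rounded to BF16 at every step against one kept in a wider type (a Python float standing in for FP32). The helper names are hypothetical.

```python
import struct


def to_bf16(x):
    """Round a finite float to the nearest bfloat16 value (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even on the low 16 bits
    bits &= 0xFFFF0000  # keep sign, exponent and 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b, accumulate_in_bf16):
    """Dot product with BF16 inputs and output; only the accumulator differs."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += to_bf16(x) * to_bf16(y)
        if accumulate_in_bf16:
            acc = to_bf16(acc)  # running sum is rounded at every step
    return to_bf16(acc)


ones = [1.0] * 1000
reduced = dot(ones, ones, accumulate_in_bf16=True)
full = dot(ones, ones, accumulate_in_bf16=False)
print(reduced, full)  # 256.0 1000.0
```

With 8 significant bits, the BF16 running sum stalls at 256 because 256 + 1 rounds back to 256, while the wider accumulator reaches the exact result.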

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89172
Approved by: https://github.com/ngimel
2022-12-21 18:58:28 +00:00
28ceccec21 cleanup old python_compat code (#91162)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91162
Approved by: https://github.com/ezyang
2022-12-20 18:13:19 +00:00
0eb45d546c Bind autograd current Node for debugging purposes (#90867)
This makes it possible to know, at any point during the backward pass, what is running and where the currently running Node was created:
```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode
from torch.autograd import detect_anomaly

class MyMode(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args, kwargs=None):
        node = torch._C._current_autograd_node()
        print(f"Running {func} from within {node}")
        if node is not None:
            print("The Node was created at:")
            print("\n  ".join(node.metadata["traceback_"]))
        return func(*args, **kwargs or {})

with MyMode(), detect_anomaly():
    print("FW")
    a = torch.rand(10, requires_grad=True)
    b = a.mul(2)
    b = b.div(3)
    b = b.sum()
    print("BW")
    b.backward()
```

Gives
```
$ python foo.py
foo.py:15: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
  with MyMode(), detect_anomaly():
FW
Running aten.rand.default from within None
Running aten.mul.Tensor from within None
Running aten.div.Tensor from within None
Running aten.sum.default from within None
BW
Running aten.ones_like.default from within None
Running aten.expand.default from within <SumBackward0 object at 0x7fa40c0c6dc0>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.isnan.default from within <SumBackward0 object at 0x7fa40c0c6500>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.any.default from within <SumBackward0 object at 0x7fa32b23a780>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten._local_scalar_dense.default from within <SumBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.div.Tensor from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.isnan.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.any.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten._local_scalar_dense.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.mul.Tensor from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.isnan.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.any.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten._local_scalar_dense.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.detach.default from within <AccumulateGrad object at 0x7fa40c0c9730>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.detach.default from within <AccumulateGrad object at 0x7fa40c0c94b0>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90867
Approved by: https://github.com/soulitzer
2022-12-20 13:41:43 +00:00
3859aace20 [MPS] Skip tests broken on Ventura (#90843)
Also add `torch.backends.mps.is_macos13_or_newer`
See https://github.com/pytorch/pytorch/issues/85758

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90843
Approved by: https://github.com/kulinseth, https://github.com/albanD
2022-12-14 19:51:00 +00:00
4b1053497c [vmap] Prepend "legacy" to files for old vmap implementation (#90324)
We have an older torch.vmap implementation. It is no longer supported.
It still needs to exist somewhere for the sake of BC with
torch.autograd.functional.

This PR makes it clear what files are meant for implementing the old
vmap implementation. I've seen a couple of PRs recently adding support
for the old vmap implementation, so this will lessen the confusion.

Test Plan:
- CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90324
Approved by: https://github.com/samdow
2022-12-07 18:46:15 +00:00
4908a12542 Reland "SymIntify convolution backend calculation (#89069)"" (#89142)
This reverts commit 90db86be108184a6c86c73e1b01012352c72e66b.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89142
Approved by: https://github.com/albanD, https://github.com/malfet
2022-11-16 21:41:47 +00:00
90db86be10 Revert "SymIntify convolution backend calculation (#89069)"
This reverts commit 09ed8b67e24cfe29f3fa7b5dd28eaa7749229f12.

Reverted https://github.com/pytorch/pytorch/pull/89069 on behalf of https://github.com/DanilBaibak due to breaking internal builds
2022-11-16 16:36:27 +00:00
09ed8b67e2 SymIntify convolution backend calculation (#89069)
We will need this to implement a convolution meta function that
is SymInt aware.  I use templates so that regular convolution code
is not affected by the change.  No tests for symbolic ints directly; that will
come in a subsequent PR which also needs to refactor fake tensors.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89069
Approved by: https://github.com/SherlockNoMad
2022-11-16 14:02:43 +00:00
ba4d5aae06 Revert "rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218)"
This reverts commit 7f28be10e5e71efda37800384fa897785499bed1.

Reverted https://github.com/pytorch/pytorch/pull/88218 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41211901
2022-11-11 19:13:05 +00:00
4e5d7afe84 Revert "add DisableTorchFunction that matches DisableTorchDispatch (#88219)"
This reverts commit c0ecce15b5a54ff0185f9976e6bfb6f3a7de698d.

Reverted https://github.com/pytorch/pytorch/pull/88219 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41211901
2022-11-11 19:08:30 +00:00
c0ecce15b5 add DisableTorchFunction that matches DisableTorchDispatch (#88219)
Closes #87990. This implements a new disable guard that matches DisableTorchDispatch (disables all subclasses and modes)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88219
Approved by: https://github.com/ezyang
2022-11-10 14:51:13 +00:00
7f28be10e5 rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218)
First half of #87990. This doesn't change any of the behavior and is just a rename

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88218
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-11-10 14:51:13 +00:00
eb9b156019 [fix] MathBits: serialization (#88182)
Fixes #81690

TODO:

* [x] C++ Unpickler Fix (locally tested pickled in Python and unpickled in C++)
* [x] C++ Pickler Fix (locally tested pickled in C++ and unpickled in Python)
* [x] Do quant_tensor, sparse_tensor, etc require similar changes? (Sparse and Quant don't need this)
* [x] Add Comments
* [x] How to make sure C++ and Python are in sync? (Functions in `pickler.h` help in getting and setting Tensor Metadata (math-bits for now) on a tensor. They are the only place which should handle this.)

Notes:
Quantized tensors don't support complex dtypes, and for float dtypes they segfault with `_neg_view`: https://github.com/pytorch/pytorch/issues/88484

Sparse Tensor:
```python
>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88182
Approved by: https://github.com/ezyang, https://github.com/anjali411
2022-11-09 17:15:12 +00:00
e1c123d29a Add UBSAN to ASAN (#88055)
Add undefined behavior sanitizer to `USE_ASAN` option.
Added `torch._C._crash_if_vptr_ubsan()` that only fails if vptr belongs to a wrong class after typecast
Deleted all ubsan suppressions, but disabled `ProtoTest::Basic` as it fails the above-mentioned vptr check.

Fixes https://github.com/pytorch/pytorch/issues/88042
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88055
Approved by: https://github.com/ezyang
2022-11-01 17:59:35 +00:00
fc21b9db23 Use Eager Code To Determine Conv Layout (#87305)
The logic for determining the conv backend, and therefore the output striding, is very complex. It depends on build settings, input striding/contiguity, sizes, etc. Eventually we should port that logic to the meta impl for dynamic shapes but that will require a lot more work and keeping the implementations in sync. See https://github.com/pytorch/torchdynamo/issues/1701

This is a prerequisite to removing the inductor conv stride propagation and more general fake tensor for inductor propagation. In that PR, the meta impls for cpu conv give incorrect striding which led to test failures (https://github.com/pytorch/pytorch/pull/87083).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87305
Approved by: https://github.com/ezyang
2022-10-28 16:37:04 +00:00
35c611d30f Add mem efficient backend flag (#87946)
# Summary
Add a torch.backends.cuda flag and update the context manager to pick between the three implementations of scaled_dot_product_attention.

cc @cpuhrsch @jbschlosser @bhosmer @mikaylagawarecki
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87946
Approved by: https://github.com/cpuhrsch
2022-10-28 15:51:10 +00:00
adb76ef510 Expose API for backward execution order (#87507)
In this PR:
- graph_task stores graph roots on construction so that we can later traverse through the graph
- before the nodes are returned, they needed to be converted from raw_ptr to shared_ptr, and this should be OK because the graph is guaranteed to be alive

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87507
Approved by: https://github.com/albanD
2022-10-26 21:28:45 +00:00
ce0c6e828e Reland "add an API for external backends to register custom device names (#86992)" (#87453)
Re-land of https://github.com/pytorch/pytorch/pull/86992

This reverts commit a895af92506f206889610251624590798d0deabd.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87453
Approved by: https://github.com/ezyang, https://github.com/albanD
2022-10-21 16:51:36 +00:00
a895af9250 Revert "add an API for external backends to register custom device names (#86992)"
This reverts commit fb6826bfd82660aa905459f894c81d97d143dd2c.

Reverted https://github.com/pytorch/pytorch/pull/86992 on behalf of https://github.com/jeanschmidt due to breaking internal builds - D40534212 - arstudio-windows-tests-landcastle-0
2022-10-20 14:51:08 +00:00
fb6826bfd8 add an API for external backends to register custom device names (#86992)
This API adds some improvements for external backends that are built out of tree in C++ using the `PrivateUse1` dispatch key.

The docs and linked examples go over the API in more detail, but you should be able to use it like:
```
# This should probably be in the __init__.py file of an external backend's python package
> torch.register_privateuse1_backend("foo")
# And it will allow the user to do this:
> a = torch.ones(2, device="foo")
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86992
Approved by: https://github.com/albanD
2022-10-19 16:44:17 +00:00
1dbc8ad3b7 Add Warning class and refactor C++ warnings to use it (#84101)
Also adds `TORCH_WARN_WITH` and `TORCH_WARN_DEPRECATION` macros

Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84101
Approved by: https://github.com/albanD
2022-10-18 20:02:42 +00:00
f1fdb6efbd Manual changes for moving dynamo to core (#86621)
This is the subset of the changes in #86461 not auto-generated by `copy_to_core.sh`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86621
Approved by: https://github.com/albanD
2022-10-11 23:01:21 +00:00
d3f7c34cb3 Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-08 05:12:42 +00:00
7ec12a559c Revert "Enable aten-aten decomps (#85921)"
This reverts commit 62e4f51efdf98a3a91d29efa55e5665d5398b464.

Reverted https://github.com/pytorch/pytorch/pull/85921 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. I think it breaks a dynamo test in trunk 62e4f51efd
2022-10-08 01:59:54 +00:00
ba3fde6aa0 Add multi-grad hooks (#86260)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86260
Approved by: https://github.com/albanD
2022-10-07 21:16:45 +00:00
62e4f51efd Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-07 21:04:39 +00:00
936e93058b Delete torch::deploy from pytorch core (#85953)
As we have migrated torch::deploy over to https://github.com/pytorch/multipy, we can now delete it from pytorch core as ongoing development will happen there.

This PR was created due to syncing issues with https://github.com/pytorch/pytorch/pull/85443 which is where the review history can be found.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85953
Approved by: https://github.com/seemethere, https://github.com/malfet
2022-10-06 07:20:16 +00:00
9da5646cdb Add device logic handling for functions which allow scalar inputs as tensors (#86149)
Some functions allow scalars as tensor inputs. Add handling for them in device logic.

Fix for https://github.com/pytorch/torchdynamo/issues/1445
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86149
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-10-04 18:54:00 +00:00
cd6477617c Custom sdp implementations dense (#85984)
# Summary

- This code creates the runtime dispatch system for choosing a performant fused SDP kernel. The only choice of fused kernel is flash_attention. It also creates python flags and a context manager that can be used to turn off and on behavior for dispatch.
- This also adds support for flash_attention with dense tensors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85984
Approved by: https://github.com/cpuhrsch
2022-10-03 17:36:37 +00:00
f183a989a2 Fix fake tensor kernel nesting (#85920)
If you e.g. printed within a decomp which would call `in_kernel_invocation_manager`, on the exit from the manager it would unilaterally remove meta from the tls / set the tensor to return its real device. We should just restore what the existing state was.
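
The difference between resetting state unconditionally and restoring the previous state can be shown with two toy context managers. This is a minimal sketch, assuming a hypothetical `state` dict in place of the real TLS flag; it is not the code of `in_kernel_invocation_manager`.

```python
from contextlib import contextmanager

state = {"meta_in_tls": False}  # hypothetical stand-in for the TLS flag


@contextmanager
def kernel_invocation_buggy():
    state["meta_in_tls"] = True
    try:
        yield
    finally:
        state["meta_in_tls"] = False  # unconditional reset clobbers an outer scope


@contextmanager
def kernel_invocation_fixed():
    previous = state["meta_in_tls"]
    state["meta_in_tls"] = True
    try:
        yield
    finally:
        state["meta_in_tls"] = previous  # restore whatever was set before


def flag_after_inner_exit(manager):
    """Return the flag as seen by the outer scope after a nested scope exits."""
    state["meta_in_tls"] = False
    with manager():
        with manager():
            pass
        return state["meta_in_tls"]  # evaluated while still inside the outer scope


buggy_result = flag_after_inner_exit(kernel_invocation_buggy)
fixed_result = flag_after_inner_exit(kernel_invocation_fixed)
print(buggy_result, fixed_result)  # False True
```

With the unconditional reset, the outer scope loses its flag as soon as the nested scope exits; saving and restoring the previous value keeps nesting safe.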

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85920
Approved by: https://github.com/ezyang, https://github.com/bdhirsh, https://github.com/huydhn
2022-10-02 04:19:40 +00:00
b562987c28 Revert "Fix fake tensor kernel nesting (#85920)"
This reverts commit c2d9ea7f4b54c7d4332bc457fd76238c61f129de.

Reverted https://github.com/pytorch/pytorch/pull/85920 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but I suspect that it causes a flaky memory leak issue in TestFakeTensorCUDA.test_fake_crossref_backward_amp_linalg_lstsq_cuda_float32
2022-10-01 19:30:21 +00:00
c2d9ea7f4b Fix fake tensor kernel nesting (#85920)
If you e.g. printed within a decomp which would call `in_kernel_invocation_manager`, on the exit from the manager it would unilaterally remove meta from the tls / set the tensor to return its real device. We should just restore what the existing state was.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85920
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-09-30 23:11:20 +00:00
a876432aea Expose torch._will_engine_execute_node (#84773)
Addresses: https://github.com/pytorch/pytorch/issues/83617

This PR adds a way to query the TLS graph task's exec_info, which is a map from each Node to a bool indicating whether it will be executed in the current backward pass (as determined by the inputs= argument of .grad or .backward).
- this works with both custom Function nodes and normal codegened nodes
- to be able to verify whether the pyobject passed is an actual node, we now store pointers to PyTypeObjects into a set on registration.
- error out when .backward is called without inputs= to avoid silently returning True
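
What an exec_info map expresses can be sketched on a toy graph. This is a simplification under stated assumptions, not the engine's implementation: `Node`, `compute_exec_info` and `will_execute` are hypothetical names, and a node is marked as executed when it is one of the requested targets or leads to one.

```python
class Node:
    """Toy backward-graph node; next_nodes are the nodes it feeds during backward."""

    def __init__(self, name, next_nodes=()):
        self.name = name
        self.next_nodes = list(next_nodes)


def compute_exec_info(root, targets):
    """Map every node reachable from root to whether it is needed for targets."""
    targets = set(targets)
    exec_info = {}

    def needed(node):
        if node in exec_info:
            return exec_info[node]
        # Visit every child (no short-circuit) so each reachable node gets an entry.
        child_needed = [needed(child) for child in node.next_nodes]
        exec_info[node] = node in targets or any(child_needed)
        return exec_info[node]

    needed(root)
    return exec_info


def will_execute(exec_info, node):
    """Query the map; refuse to answer when no inputs= were given."""
    if exec_info is None:
        raise RuntimeError("no inputs= were given, so there is no exec_info to query")
    return exec_info.get(node, False)


# Backward graph of out = a * 2 + b, with only b requested as an input.
acc_a = Node("AccumulateGrad(a)")
acc_b = Node("AccumulateGrad(b)")
mul = Node("MulBackward0", [acc_a])
add = Node("AddBackward0", [mul, acc_b])
exec_info = compute_exec_info(add, targets=[acc_b])
print({n.name: will_execute(exec_info, n) for n in (add, mul, acc_a, acc_b)})
```

Raising when there is no map mirrors the last bullet: answering True for every node would hide the fact that nothing was filtered.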

Alternatives:
- not sure if it is possible to bind to Python from a raw pointer to Node. At least we wouldn't be able to use existing logic, and the Python object should only hold a weak reference to the Node.
- other solutions to the motivating issue seem to require more extensive modification to the engine

See the issue linked for an example of usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84773
Approved by: https://github.com/albanD
2022-09-28 20:13:52 +00:00