Commit Graph

413 Commits

d14d62b7aa [dynamo] add more refleak tests (#120657)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120657
Approved by: https://github.com/jansel
2024-03-07 22:25:43 +00:00
aa0b0944d5 [dynamo] Re-dispatch torch.Tensor.new into torch.Tensor.new_empty method. (#121075)
Fix: https://github.com/pytorch/xla/issues/6009

This PR adds another case to the `TensorVariable.method_new` special-casing, where it
re-dispatches `new` into `new_empty`.

Since we are using fake tensors, the `new` call doesn't actually get to the corresponding
backend (e.g. XLA). So, things like the following might happen:

```python
import torch
import torch_xla.core.xla_model as xm

@torch.compile(backend="openxla")
def foo(x):
    new_x = x.new(*x.size())

    # new_x.device == "xla"
    # x.device == "xla:0"

    return new_x + x

a = torch.arange(10)
foo(a.to(xm.xla_device()))
```

Resulting in the following error:

```python
Traceback (most recent call last):
  ...
  File "torch/_dynamo/utils.py", line 1654, in get_fake_value
    ret_val = wrap_fake_exception(
  File "torch/_dynamo/utils.py", line 1190, in wrap_fake_exception
    return fn()
  File "torch/_dynamo/utils.py", line 1655, in <lambda>
    lambda: run_node(tx.output, node, args, kwargs, nnmodule)
  File "torch/_dynamo/utils.py", line 1776, in run_node
    raise RuntimeError(make_error_message(e)).with_traceback(
  File "torch/_dynamo/utils.py", line 1758, in run_node
    return node.target(*args, **kwargs)
  File "torch/utils/_stats.py", line 20, in wrapper
    return fn(*args, **kwargs)
  File "torch/_subclasses/fake_tensor.py", line 885, in __torch_dispatch__
    return self.dispatch(func, types, args, kwargs)
  File "torch/_subclasses/fake_tensor.py", line 1224, in dispatch
    return self._cached_dispatch_impl(func, types, args, kwargs)
  File "torch/_subclasses/fake_tensor.py", line 955, in _cached_dispatch_impl
    output = self._dispatch_impl(func, types, args, kwargs)
  File "torch/_subclasses/fake_tensor.py", line 1445, in _dispatch_impl
    return self.wrap_meta_outputs_with_default_device_logic(
  File "torch/_subclasses/fake_tensor.py", line 1575, in wrap_meta_outputs_with_default_device_logic
    return tree_map(wrap, r)
  File "torch/utils/_pytree.py", line 900, in tree_map
    return treespec.unflatten(map(func, *flat_args))
  File "torch/utils/_pytree.py", line 736, in unflatten
    leaves = list(leaves)
  File "torch/_subclasses/fake_tensor.py", line 1550, in wrap
    ) = FakeTensor._find_common_device(func, flat_args)
  File "torch/_subclasses/fake_tensor.py", line 625, in _find_common_device
    merge_devices(arg)
  File "torch/_subclasses/fake_tensor.py", line 620, in merge_devices
    raise RuntimeError(
torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in function add>(*(FakeTensor(..., device='xla', size=(10,), dtype=torch.int64), FakeTensor(..., device='xla:0', size=(10,), dtype=torch.int64)), **{}):
Unhandled FakeTensor Device Propagation for aten.add.Tensor, found two different devices xla, xla:0
```

Using `new_empty` instead fixes this error, because it takes the device from the source
tensor instead of inferring it from the current dispatch key set.
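
A minimal sketch of why the re-dispatch is safe, using only public tensor APIs (the `new_like` helper below is hypothetical and not part of the PR): `new_empty` inherits dtype and device from the source tensor, so no device inference happens.

```python
import torch

def new_like(x: torch.Tensor, *sizes):
    # Hypothetical helper: instead of x.new(*sizes), which infers the device
    # from the current dispatch key set, use x.new_empty(sizes), which takes
    # both dtype and device from the source tensor `x`.
    return x.new_empty(sizes)

a = torch.arange(10)
b = new_like(a, *a.size())
assert b.device == a.device and b.shape == a.shape
```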

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121075
Approved by: https://github.com/jansel
2024-03-06 11:49:27 +00:00
35004b8ab4 [dynamo] Fix handling of invalid args (#121110)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121110
Approved by: https://github.com/yanboliang
ghstack dependencies: #121106
2024-03-05 17:16:04 +00:00
d534a49767 Reinplace auto_functionalized (#120829)
Fixes https://github.com/pytorch/pytorch/issues/120441

We follow how triton_kernel_wrapper_functional gets re-inplaced:
- If we see auto_functionalized, then first we compute which inputs we
  actually need to clone ("tensors_to_clone") and fix up the graph. This happens in
  `reinplace_and_refine_tensors_to_clone`, which I have refactored out
  of the triton_kernel_wrapper_functional reinplacing code.
- Later on, after the reinplacing pass, we have a decomposition pass for
  auto_functionalized. In that decomposition pass, we make use of the
  "tensors_to_clone" info and only clone those inputs in the
  decomposition (see the sketch after this list).
- We shepherd "tensors_to_clone" from the first step to the second step
  by setting the .meta field on the auto_functionalized node.
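
A self-contained sketch of the clone-only-what-is-needed step, assuming a plain dict stands in for the FX node's `.meta` and for the operator's tensor inputs (names other than "tensors_to_clone" are made up):

```python
import torch

def decompose_auto_functionalized(inputs: dict, node_meta: dict) -> dict:
    # The reinplacing pass records which inputs still need a defensive copy;
    # the decomposition clones only those and reuses the rest in place.
    tensors_to_clone = node_meta.get("tensors_to_clone", list(inputs))
    return {
        name: (t.clone() if name in tensors_to_clone else t)
        for name, t in inputs.items()
    }

inputs = {"a": torch.zeros(3), "b": torch.zeros(3)}
meta = {"tensors_to_clone": ["a"]}  # computed during reinplacing
out = decompose_auto_functionalized(inputs, meta)
assert out["a"] is not inputs["a"] and out["b"] is inputs["b"]
```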

Test Plan:
- existing tests
- tested locally by reading the output of TORCH_LOGS="post_grad_graphs"
- added assertExpectedInline tests for the post_grad_graphs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120829
Approved by: https://github.com/oulgen
2024-03-01 00:55:19 +00:00
e7039e3a0b [dynamo][easy] Dynamo test changes (#120927)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120927
Approved by: https://github.com/yanboliang
ghstack dependencies: #120864, #120730
2024-02-29 22:05:41 +00:00
f94933ed42 Refine value ranges on inequalities (#120800)
This is basically done the obvious way. For better or worse, I jammed this into what used to be `_maybe_guard_eq` but now is `_maybe_guard_rel`. I was careful to test all the off-by-one conditions, and each permutation. Let me know if you think I missed anything. Importantly, this now works for unbacked SymInts.

While testing, I noticed we are silently duck sizing all symbolic variables in `test_dynamic_shapes.py`. This may or may not be covering up bugs.

Along the way, I had to fix a bug in export constraints, where we weren't checking that the final var_to_range was consistent with what the user requested at top level.

After I implemented all this, I realized that applying this to non-unbacked SymInts was duplicative with @ysiraichi's previous work on https://github.com/pytorch/pytorch/pull/97963 . The upside is I now understand what Yukio was trying to do in the original PR, and I think my new logic is simpler and less error prone. In Yukio's earlier diff, Yukio tried very hard to avoid changing what guards we actually issue (since this would cause tests to wobble). Thus, when he refined a range, he also saved the guard that actually caused the range to refine. In this PR, I don't bother saving these guards; instead I just tighten var_to_range directly and rely on generating guards on this to be correct. The key insight is that if I assert `x < y`, it's always safe to emit (potentially) more restrictive range guards, because this won't invalidate our guards, it will just make them a little too strong (but actually, I think we are precise along the way.) If these guards make it unnecessary to test `x < y`, because now the ranges for x and y are disjoint, this is fine, we've subsumed the x < y guard and can just not bother testing it. If I've gotten it right, TV will agree with me.

In fact, I had a bug in this PR that TV (translation validation) didn't catch: when we have a recorded var_to_guards for a symbol, we unconditionally never generate the range guard for it, even if the var_to_guards is potentially inconsistent with var_to_range (because var_to_range was updated separately). With var_to_guards removed, I don't have to worry about this inconsistency.
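
A toy worked example of the subsumption argument (not the ShapeEnv code): asserting x < y tightens both ranges, and if the tightened ranges ever become disjoint, the relational guard no longer needs to be tested.

```python
from dataclasses import dataclass

@dataclass
class Range:
    lo: int
    hi: int

def refine_lt(x: Range, y: Range) -> None:
    # From x < y: x can be at most y.hi - 1, and y must be at least x.lo + 1.
    x.hi = min(x.hi, y.hi - 1)
    y.lo = max(y.lo, x.lo + 1)

def lt_is_subsumed(x: Range, y: Range) -> bool:
    # If every value of x is below every value of y, x < y needs no guard.
    return x.hi < y.lo

x, y = Range(0, 100), Range(50, 60)
refine_lt(x, y)
assert (x.lo, x.hi) == (0, 59) and (y.lo, y.hi) == (50, 60)
```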

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120800
Approved by: https://github.com/Skylion007, https://github.com/avikchaudhuri, https://github.com/ysiraichi
2024-02-29 19:41:51 +00:00
4b18ab869f [torch.export] Support is_compiling() flag for non-strict mode (#119602)
Summary: In non-strict mode of torch.export(), we didn't set the `is_compiling()` flags to `True`, which some models need.
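
A hedged sketch of the kind of model this matters for, assuming `torch.compiler.is_compiling()` is among the flags being set and that non-strict mode is selected with `strict=False`:

```python
import torch

class Gate(torch.nn.Module):
    def forward(self, x):
        # Some models branch on the compilation flag; non-strict export must
        # report True here so the "compiling" branch is the one that is traced.
        if torch.compiler.is_compiling():
            return x + 1
        return x - 1

ep = torch.export.export(Gate(), (torch.randn(2),), strict=False)
```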

Test Plan: Unit tests and manual testing.

Differential Revision: D53624452

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119602
Approved by: https://github.com/suo
2024-02-29 05:52:51 +00:00
01ec8df6d8 [Compiled Autograd] Introduce BackwardState capture (#120382)
This adds support for backwards hooks that are *both*:
1) Interior to the graph; and
2) Dynamically generated (e.g. lambdas)

We do this by creating a BackwardState object that is used to register the hooks in the forward, then populated by dynamo *after* the forward runs.
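
The BackwardState plumbing itself is internal to dynamo and compiled autograd; below is only a plain-eager illustration of the hook pattern it enables, i.e. a dynamically generated hook attached to an interior node of the graph:

```python
import torch

def make_hook(scale):
    # Dynamically generated hook (a closure), the case this PR targets.
    return lambda grad: grad * scale

x = torch.randn(3, requires_grad=True)
h = x * 2                      # interior node, not an input or output leaf
h.register_hook(make_hook(0.5))
h.sum().backward()
print(x.grad)                  # gradient flowed through the dynamic hook
```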

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120382
Approved by: https://github.com/xmfan
2024-02-28 20:36:47 +00:00
e3d64c4d5d [dynamo] Desugar accumulate_grad, fix .grad handling (#120590)
Fixes https://github.com/pytorch/pytorch/issues/118435
Fixes https://github.com/pytorch/pytorch/issues/119906

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120590
Approved by: https://github.com/ezyang, https://github.com/jansel
ghstack dependencies: #120520
2024-02-27 10:12:26 +00:00
ecb3f33a1a [dynamo] fix segfault in _debug_get_cache_entry_list (#120635)
Fix https://github.com/pytorch/pytorch/issues/120607.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120635
Approved by: https://github.com/jansel
2024-02-26 23:31:09 +00:00
0f20cc1e0e Don't use size on TensorVariable when doing out resize test (#120567)
Fixes https://github.com/pytorch/pytorch/issues/120482
Fixes https://github.com/pytorch/pytorch/issues/120511

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120567
Approved by: https://github.com/Skylion007
2024-02-25 02:21:34 +00:00
a17979faa6 [dynamo] add stronger test for dynamo memory leaks (#120459)
This was prompted by a regression of the fix for https://github.com/pytorch/pytorch/issues/112090, caused by https://github.com/pytorch/pytorch/pull/120147.

Make the memory leak test stronger by using weakref to check for model deletion instead of measuring CUDA memory allocation.
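
A minimal sketch of the weakref-based check (the real test lives in the dynamo test suite; the names here are illustrative):

```python
import gc
import weakref
import torch

def model_collected_after_compile() -> bool:
    model = torch.nn.Linear(4, 4)
    opt_model = torch.compile(model, backend="eager")
    opt_model(torch.randn(2, 4))

    ref = weakref.ref(model)
    del model, opt_model
    gc.collect()
    # A dead weakref is a direct deletion check, unlike CUDA memory counters,
    # which can under-report or flake.
    return ref() is None
```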

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120459
Approved by: https://github.com/jansel
2024-02-24 16:30:20 +00:00
edf1c4e552 [Dynamo] Handle guard_size_oblivious in user code (#120379)
Fixes https://github.com/pytorch/pytorch/issues/120083

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120379
Approved by: https://github.com/yanboliang
2024-02-23 05:38:57 +00:00
6665b96ebb Rewrite maybe_reduce more carefully for unbacked SymInt (#119562)
Fixes https://github.com/pytorch/pytorch/issues/119476

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119562
Approved by: https://github.com/albanD
ghstack dependencies: #119559
2024-02-13 21:40:06 +00:00
39c68efd85 [dynamo] Capture untyped_storage().resize_() (#119647)
This makes storage resizing work with `backend=eager`; the next two PRs make it work for inductor.
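
A hedged example of the pattern that now compiles with `backend="eager"` (whether it also works end to end under inductor depends on the follow-up PRs mentioned above):

```python
import torch

@torch.compile(backend="eager")
def release_storage(x):
    # Drop the tensor's backing storage; the tensor object survives but owns
    # zero bytes until the storage is resized or refilled.
    x.untyped_storage().resize_(0)
    return x

t = torch.randn(8)
release_storage(t)
print(t.untyped_storage().size())  # 0
```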

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119647
Approved by: https://github.com/yf225
2024-02-13 19:03:28 +00:00
c2522554dd Prevent DCE'ing unbacked SymInt for view outputs (#119552)
Fixes https://github.com/pytorch/pytorch/issues/119414

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119552
Approved by: https://github.com/Skylion007, https://github.com/eellison
2024-02-13 16:32:21 +00:00
52de407b6c Avoid performing replacements when it would unrefine ranges (#117356)
Fixes https://github.com/pytorch/pytorch/issues/117268; check this issue for background.

This PR does the following:

* Do not perform a replacement if the expression we're replacing the symbol with has a less refined value range than the original. There's a little bit of trickiness around the handling for values close to INT64_MAX; when checking if a range refines another, I *only* consider the range representable in 64-bit integers. This is enough to prevent us from doing a substitution like `i0 = 10 - i1`, but it appears to still let us do the other substitutions we like, such as `i0 = i1` or `i0 = 12 * i1`
* The test above is order dependent: if we assert an equality BEFORE we have refined a range, we might be willing to do the replacement because there isn't a meaningful range. This means that it's important to mark things as sizes, before you start doing other error checking. `split_with_sizes` is adjusted accordingly. It would be good to raise an error if you get the ordering wrong, but I leave this to future work.
* It turns out this is not enough to fix AOTAutograd, because we lose the size-ness of unbacked SymInts when AOTAutograd retraces the Dynamo graph. So update deferred runtime assert insertion to also insert size-ness and value ranges annotations. Note that, in principle, it shouldn't be necessary to explicitly do the latter; these should just show up as deferred runtime asserts. That's some extra refactoring for a later day.
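
A toy sketch of the first bullet's check (names hypothetical): a replacement is only accepted if the replacement expression's value range, clamped to what is representable in 64-bit integers, is no wider than the symbol's current range.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def clamp_int64(lo, hi):
    return max(lo, INT64_MIN), min(hi, INT64_MAX)

def range_refines(new, old) -> bool:
    # `new` refines `old` if, within the int64-representable window,
    # `new` is contained in `old`.
    nlo, nhi = clamp_int64(*new)
    olo, ohi = clamp_int64(*old)
    return olo <= nlo and nhi <= ohi

# i0 = 10 - i1 would widen i0's lower bound far below 0: refuse it.
assert not range_refines((10 - 2**63, 10), (0, 2**63 - 1))
# i0 = i1 with an identical range is fine.
assert range_refines((0, 2**63 - 1), (0, 2**63 - 1))
```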

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117356
Approved by: https://github.com/lezcano
2024-02-13 15:56:59 +00:00
472500e32a Revert "Avoid performing replacements when it would unrefine ranges (#117356)"
This reverts commit 0e6b314fc2e7c965717e939a4e457a9b9d7e133e.

Reverted https://github.com/pytorch/pytorch/pull/117356 on behalf of https://github.com/huydhn due to Sorry for reverting the change but it looks like the forward fix still needs more work https://github.com/pytorch/pytorch/pull/119712, so it would be cleaner to reland them ([comment](https://github.com/pytorch/pytorch/pull/117356#issuecomment-1940032407))
2024-02-13 01:16:58 +00:00
0e6b314fc2 Avoid performing replacements when it would unrefine ranges (#117356)
Fixes https://github.com/pytorch/pytorch/issues/117268; check this issue for background.

This PR does the following:

* Do not perform a replacement if the expression we're replacing the symbol with has a less refined value range than the original. There's a little bit of trickiness around the handling for values close to INT64_MAX; when checking if a range refines another, I *only* consider the range representable in 64-bit integers. This is enough to prevent us from doing a substitution like `i0 = 10 - i1`, but it appears to still let us do the other substitutions we like, such as `i0 = i1` or `i0 = 12 * i1`
* The test above is order dependent: if we assert an equality BEFORE we have refined a range, we might be willing to do the replacement because there isn't a meaningful range. This means that it's important to mark things as sizes, before you start doing other error checking. `split_with_sizes` is adjusted accordingly. It would be good to raise an error if you get the ordering wrong, but I leave this to future work.
* It turns out this is not enough to fix AOTAutograd, because we lose the size-ness of unbacked SymInts when AOTAutograd retraces the Dynamo graph. So update deferred runtime assert insertion to also insert size-ness and value ranges annotations. Note that, in principle, it shouldn't be necessary to explicitly do the latter; these should just show up as deferred runtime asserts. That's some extra refactoring for a later day.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117356
Approved by: https://github.com/lezcano
2024-02-09 14:43:58 +00:00
b251bca205 [dynamo] inlining into __iter__ of user defined object (#119243)
Fixes #119198.

This PR make dynamo inline `__iter__` of a user defined object instead of creating a graph break. Also added a new test, which shows:
1. the loop is unrolled
2. the length of the loop is guarded when inlining `__iter__`
```python
import torch

class Mod:
    def __init__(self):
        self.a = [torch.randn(2, 2), torch.randn(2, 2)]

    def __iter__(self):
        return iter(self.a)

def f(mod):
    ret = []
    for x in mod:
        ret.append(x + 1)
    return ret
```
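
Continuing the snippet above, a hedged usage line showing the compile call that exercises this (the backend choice is illustrative):

```python
compiled_f = torch.compile(f, backend="eager", fullgraph=True)
out = compiled_f(Mod())   # no graph break: __iter__ is inlined
assert len(out) == 2      # loop unrolled; its length is guarded
```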

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119243
Approved by: https://github.com/jansel
2024-02-08 17:07:30 +00:00
ee1c2449f7 [dynamo] delete dynamo cache entry when guard function is invalidated [attempt 2] (#119107)
Attempt #2 for https://github.com/pytorch/pytorch/pull/117875 to fix https://github.com/pytorch/pytorch/issues/112090.

Summary of changes:
- ~Changed CacheEntry linked list into a doubly-linked list structure to support deletion.~ (done by C++ refactor)
- Added CacheEntry and ExtraState borrowed references to GuardFn so that GuardFn can tell ExtraState to delete CacheEntry when the GuardFn is invalidated.
- ~Added ExtraState raw reference to CacheEntry so that we can get ExtraState to correctly point to the first CacheEntry if it gets deleted.~ (done by C++ refactor)
- CacheEntry destructor needs to reset GuardFn refs to ExtraState/CacheEntry in order to prevent use-after-free.
- code_context values that are nn.GraphModules need to be weakrefs in order to prevent circular references.
- Added tests that check for memory leaks and cache deletion operations.
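
A plain-Python illustration of the weakref point for code_context values (the names are illustrative, not dynamo's actual internals):

```python
import weakref
import torch

code_context: dict = {}

gm = torch.fx.symbolic_trace(torch.nn.Linear(2, 2))
# Store a weakref rather than the GraphModule itself, so the cache entry does
# not keep the module alive (a strong reference here would form a cycle).
code_context["orig_graphmodule"] = weakref.ref(gm)

# Consumers dereference on use and must handle the module being gone.
module = code_context["orig_graphmodule"]()
if module is not None:
    print(type(module).__name__)  # GraphModule
```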

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119107
Approved by: https://github.com/jansel
2024-02-07 03:32:42 +00:00
ae4e866bba [dynamo] refactor CacheEntry and ExtraState to eval_frame.c to C++ (#118438)
Part of implementing CacheEntry invalidation to fix https://github.com/pytorch/pytorch/issues/112090.

Changes:
- Move CacheEntry and ExtraState to C++
- Use pybind to control reference counting
- Use std::list instead of manually implementing a linked list

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118438
Approved by: https://github.com/jansel
2024-02-06 20:48:11 +00:00
3f0fd36835 Introduce size oblivious guards (#118579)
Fixes https://github.com/pytorch/pytorch/issues/117361

The implementation here slightly diverges from what was proposed in the issue, so I will recap what this PR is doing here. Today, when doing computations involving size-like unbacked SymInts, we assume for all operations that the compile time range of the integer is `[2, inf]`, even though at runtime we also accept zero and one.

This PR removes the carte blanche assumption, and instead does the analysis in a much more limited and controlled fashion: only for guards which we have designated as "size oblivious" are we willing to do the analysis under the assumption that the range of all size-like unbacked SymInts is `[2, inf]`; otherwise, we will faithfully only do analysis with `[0, inf]` (or whatever the user provided) bounds.

The infra pieces of this PR are:

* Remove runtime_var_to_range from torch/fx/experimental/symbolic_shapes.py; modify `_constrain_range_for_size` to refine the range without clamping min to 2, and instead add the symbol to a `size_like` set in the ShapeEnv
* When evaluating an expression, if the expression is requested to be evaluated in a `size_oblivious` way, we attempt to statically compute the value of the expression with the assumption that all symbols in `size_like` are updated to assume that they are `>= 2`.
* Add Python and C++ APIs for guarding on a SymBool in a size-oblivious way. In C++, I also need to add some helpers for performing symbolic comparisons, since the stock comparisons immediately specialize in the "normal" way.

The rest of the changes of the PR are marking various spots in PyTorch framework code as size oblivious, based on what our current test suite exercises.

As you review the places where we have marked things as size oblivious, it may become clear why I ended up not opting for the "designate a branch as the default branch when it's not statically obvious which way to go": for some of the conditions, this answer is rather non-obvious. I think potentially there is another refinement on top of this PR, which is something like "I don't care if you can't figure it out with ValueRange analysis, go down this path anyway if there are unbacked sizes involved." But even if we add this API, I think we are obligated to attempt the ValueRange analysis first, since it can lead to better outcomes sometimes (e.g., we are able to figure out that something is contiguous no matter what the unbacked size is.)

When is it permissible to mark something as size oblivious? Heuristically, it is OK anywhere in framework code if it gets you past a guard-on-unbacked-SymInt problem. It is somewhat difficult to provide a true semantic answer, however. In particular, these annotations don't have any observational equivalence guarantee; for example, if I have `torch.empty(u0, 1).squeeze()`, we will always produce a `[u0]` size tensor, even though if `u0 == 1` PyTorch will actually produce a `[]` size tensor. The argument that I gave to Lezcano is that we are in fact defining an alternate semantics for a "special" size = 0, 1, for which we have these alternate eager mode semantics. In particular, suppose that we have a constant `special1` which semantically denotes 1, but triggers alternate handling rules. We would define `torch.empty(special1, 1).squeeze()` to always produce a `[special1]` size tensor, making its semantics coincide with unbacked SymInt semantics. In this model, the decision to designate guards as size oblivious is simply a user API question: you put them wherever you need some handling for special1! As we conservatively error out whenever it is not obvious what `special1` semantics should be, it is always valid to expand these semantics to cover more cases (although you can always choose the wrong semantics!)
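
For reference, a hedged example of how a framework-side call site looks once marked size oblivious, using the Python API added here (`torch.fx.experimental.symbolic_shapes.guard_size_oblivious`); on plain backed sizes it simply evaluates the condition:

```python
import torch
from torch.fx.experimental.symbolic_shapes import guard_size_oblivious

def maybe_squeeze_dim0(x: torch.Tensor) -> torch.Tensor:
    # Under a size-oblivious guard, a size-like unbacked SymInt is analyzed
    # as if it were >= 2, so this branch can be resolved statically instead
    # of raising a data-dependent guard error.
    if guard_size_oblivious(x.size(0) == 1):
        return x.squeeze(0)
    return x

print(maybe_squeeze_dim0(torch.randn(1, 3)).shape)  # torch.Size([3])
print(maybe_squeeze_dim0(torch.randn(4, 3)).shape)  # torch.Size([4, 3])
```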

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118579
Approved by: https://github.com/eellison, https://github.com/lezcano
2024-02-06 19:45:32 +00:00
86d5d1650b [dynamo] support dict.clear() (#119197)
For code like following:
```python
import torch
def f():
    a = {"a": torch.randn(2, 2)}
    a.clear()
    return a
torch.compile(f, backend="eager", fullgraph=True)()
```

We have a graph break before the pr:
```
torch._dynamo.exc.Unsupported: call_method ConstDictVariable() clear [] {}
```

Test Plan:
Added new tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119197
Approved by: https://github.com/jansel, https://github.com/anijain2305
2024-02-06 01:17:55 +00:00
a3770bcf10 Add functools.partial and UserDefinedFunction to dict keys (#118199)
This is tested by `fullgraph=True` in the `test_getattr_dict` test.
I can write a one-off test for both if that's needed.
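
A hedged sketch of the pattern this enables (the functions and table below are made up; whether a given variant stays fullgraph depends on the rest of this stack):

```python
import functools
import torch

def scale(x, factor):
    return x * factor

double = functools.partial(scale, factor=2.0)

def user_fn(x):
    return x + 1

table = {double: "partial", user_fn: "user-defined"}

@torch.compile(backend="eager", fullgraph=True)
def f(x):
    # functools.partial objects and user-defined functions as dict keys no
    # longer force a graph break when the dict is looked up or queried.
    kind = table[double]              # "partial"
    bonus = 1.0 if user_fn in table else 0.0
    return x + len(kind) + bonus

print(f(torch.ones(2)))
```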

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118199
Approved by: https://github.com/peterbell10, https://github.com/jansel, https://github.com/anijain2305
ghstack dependencies: #117982, #118098, #117983, #117625, #118194, #118003, #118208
2024-02-02 14:42:35 +00:00
188628d99e [dynamo,easy] Add Typing variable to possible dict keys (#118003)
With this one, the only keys we are not tracing properly in the
(non-skipped) test suite are `OutDtypeHigherOrderVariable()` and a
couple of `UserDefinedObjectVariable`s.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118003
Approved by: https://github.com/anijain2305, https://github.com/Skylion007, https://github.com/jansel
ghstack dependencies: #117982, #118098, #117983, #117625, #118194
2024-02-02 14:40:21 +00:00
ecf7d0e8ac Make dict guards amenable to the CSE pass (#118194)
Supersedes https://github.com/pytorch/pytorch/pull/118096 as a much cleaner and simpler solution.

It is difficult to write a test for this one without exposing too much
of the internals. You can see empirically that it works by running
```
TORCHDYNAMO_PRINT_GUARDS=1 TORCH_LOGS=+guards  python test/test_optim.py -k test_can_load_older_state_dict_ASGD_cpu_float32
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118194
Approved by: https://github.com/jansel, https://github.com/peterbell10
ghstack dependencies: #117982, #118098, #117983, #117625
2024-02-02 14:38:48 +00:00
eb2bdfae88 Make variables in dict LazyTrackers (not lazily guarded yet) and avoid using DICT_KEYS guard (#117625)
Make variables in dict lazy and remove DICT_KEYS guard.

We build the keys of a dict depth-first and we rely on the guards of
each element in the dict to create the correct guards. This allows us to
remove the rather buggy DICT_KEYS guard and make the guard lazy.
The guards are not completely lazy yet, as we instantiate them in
`_HashableTracker._eq_impl` but it should be possible to make them
truly lazy.

Also, adding new types to the supported types within keys should be less
error prone.

This is marginally less efficient when we graph break, but in turn we
should graph break much less. It also makes the dicts code easier to maintain
(removes `is_hashable_python_var`).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117625
Approved by: https://github.com/jansel, https://github.com/peterbell10, https://github.com/anijain2305
ghstack dependencies: #117982, #118098, #117983
2024-02-02 14:38:08 +00:00
b1da929df9 Use SourcelessBuilder in BuiltinVariable (#118098)
This was failing when fetching a dictionary from a module

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118098
Approved by: https://github.com/peterbell10, https://github.com/anijain2305
ghstack dependencies: #117982
2024-02-02 14:37:23 +00:00
41dfdde9f5 Handle some numpy functions with out arguments correctly in dynamo (#118248)
Dynamo creates Tensors when tracing through numpy ufuncs like np.sin, np.minimum, etc., so np functions generally return Tensors when run under `torch.compile`. However, when normalizing `out` arguments we currently require that the input is an ndarray. This causes assertion errors when running torch.compile on any numpy function with an out argument:
```
    def test_numpy_ufunc_out(self):
        @torch.compile(backend="eager")
        def foo():
            x = np.arange(5)
            out = np.empty((x.shape[0], x.shape[0]))
            res_out = np.sin(x, out=out)
            assert res_out is out
        foo()
```
Failure with stack trace: https://gist.github.com/jamesjwu/68e217638d735678b3de968584dba23f

Instead, we can wrap tensors in an ndarray in normalize_outarray to handle the case correctly. Fixing this resolves ~220 tests under dynamo_test_failures, but also exposes a followup bug.

In the presence of a graph break, ndarrays don't preserve their id, which can affect assertions and `is` checks between numpy arrays:
```
     def test_x_and_out_broadcast(self, ufunc):
        x = self.get_x(ufunc)
        out = np.empty((x.shape[0], x.shape[0]))

        x_b = np.broadcast_to(x, out.shape)
        # ufunc is just np.sin here
        res_out = ufunc(x, out=out)
        res_bcast = ufunc(x_b)
        # passes
        assert res_out is out
        graph_break()
        # fails
        assert res_out is out
```
Regular tensors preserve their id because Dynamo caches their example tensor values across a graph break. However, with ndarrays, we only store their converted tensor values, and construct new ndarrays around those values:
eebe7e1d37/torch/_dynamo/variables/builder.py (L1083)
Added a test with expected failure to showcase this; we can then fix that issue separately.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118248
Approved by: https://github.com/lezcano
2024-01-29 09:09:21 +00:00
2ed1b1747a Fix Auto Functionalize to handle specified default values (#118331)
Summary: When there were optionals with specified default values, the code was improperly handling the number of parameters, causing `IndexError: tuple index out of range`.

Test Plan: New tests.

Reviewed By: zou3519

Differential Revision: D53095812

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118331
Approved by: https://github.com/zou3519
2024-01-26 20:31:38 +00:00
eb054cc012 Revert "Fix Auto Functionalize to handle specified default values (#118035)"
This reverts commit 2d7a360911fb7b27be82c51ca86b4b34b6f1b087.

Reverted https://github.com/pytorch/pytorch/pull/118035 on behalf of https://github.com/zou3519 due to needs internal changes, reverting so we can land via co-dev ([comment](https://github.com/pytorch/pytorch/pull/118035#issuecomment-1910706841))
2024-01-25 17:53:15 +00:00
2d7a360911 Fix Auto Functionalize to handle specified default values (#118035)
Summary: When there were optionals with specified default values, the code was improperly handling the number of parameters, causing `IndexError: tuple index out of range`.

Test Plan: new tests

Differential Revision: D52977644

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118035
Approved by: https://github.com/williamwen42
2024-01-25 01:22:12 +00:00
903e1913ff Rename unbacked SymInt prefix to u (#117859)
Currently, it conflicts with Inductor's naming convention for index
variables

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117859
Approved by: https://github.com/lezcano, https://github.com/jansel, https://github.com/avikchaudhuri
2024-01-22 20:53:47 +00:00
57ca455471 [dynamo] Add hasattr support for TupleVariable (#117694)
Summary:
This change adds `hasattr` support for TupleVariable in dynamo.

This fix is part of: https://github.com/pytorch/pytorch/issues/117670
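
A hedged minimal example of what this enables:

```python
import torch

@torch.compile(backend="eager", fullgraph=True)
def f(x):
    t = (x, x + 1)
    # hasattr on a traced tuple (a TupleVariable) is now answered at trace
    # time instead of causing a graph break.
    if hasattr(t, "index"):
        return t[0] + t[1]
    return x

print(f(torch.ones(2)))
```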

Test Plan: Unit test and CI

Differential Revision: D52850665

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117694
Approved by: https://github.com/yanboliang
2024-01-18 07:47:43 +00:00
bac0878780 Error if compiled nondeterministic backward called in deterministic mode (#114780)
Part of #113707

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114780
Approved by: https://github.com/ezyang, https://github.com/albanD
2024-01-15 22:45:40 +00:00
f2f47c6848 [dynamo] realize LazyVT's in DICT_MERGE (#117282)
Fixes https://github.com/pytorch/pytorch/issues/115029.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117282
Approved by: https://github.com/jansel, https://github.com/mlazos
2024-01-13 01:50:39 +00:00
cb42bc705b Make auto_functionalized HOP fallback in inductor (#117084)
It looks like the inductor fallback previously worked with HOPs but no longer
does, so I fixed that:
- all HOPs are exposed under torch.ops.higher_order, so I changed how
  inductor looks them up
- the inductor fallback assumed that an operator's signature was (*args,
  **kwargs). This is true for all the OpOverloads but not HOPs. I
  rewrote the code to not rely on this.
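
A hedged sketch of the lookup side of the first bullet (the helper is hypothetical; only the namespaces are real):

```python
import torch

def lookup_op(name: str):
    # Regular OpOverload packets live under namespaces like torch.ops.aten,
    # while higher-order ops (e.g. auto_functionalized) are exposed under
    # torch.ops.higher_order, so a fallback has to check there as well.
    try:
        return getattr(torch.ops.higher_order, name)
    except AttributeError:
        return getattr(torch.ops.aten, name)

print(lookup_op("add"))  # falls through to torch.ops.aten.add
```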

Test Plan:
- existing tests
- new test for auto_functionalized HOP.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117084
Approved by: https://github.com/williamwen42
2024-01-12 17:57:01 +00:00
04f788f925 Unflake test_auto_functionalize (#117076)
feat: better cleanup of the custom op.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117076
Approved by: https://github.com/bdhirsh
2024-01-10 14:37:52 +00:00
4f3d698cac Impl. call_hasattr for BaseUserFunctionVariable (#116049)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116049
Approved by: https://github.com/zou3519
2024-01-09 22:58:58 +00:00
4c0d63180a Support NNModules as dict keys (#116723)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116723
Approved by: https://github.com/lezcano
2024-01-09 03:32:47 +00:00
83e8a0721d Reland #111196 (take 4) "Support tensors as Dict keys" (#116934)
Fixes #ISSUE_NUMBER

See that PR

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116934
Approved by: https://github.com/ezyang, https://github.com/huydhn
2024-01-07 01:37:26 +00:00
2dca3e99eb Revert "Support tensors as Dict keys Re-PR of #111196 (#116785)"
This reverts commit 1badad9ce9694ef70f6a3dc01000f2cf310c4c11.

Reverted https://github.com/pytorch/pytorch/pull/116785 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/116785#issuecomment-1879592261))
2024-01-06 08:22:33 +00:00
1badad9ce9 Support tensors as Dict keys Re-PR of #111196 (#116785)
This prepares the PR where we implement sets in terms of dicts.
To do so, rather than storing internally a dictionary that maps literals
to VariableTrackers, it stores (pretty much) a dictionary from VTs to VTs.
To do so, keys are wrapped in an opaque internal class _Hashable.
The _Hashable class is opaque on purpose so that it fails hard
if it inadvertently leaks back into user code.
We also found and fixed a number of latent bugs and inconsistencies
in the way dynamo checked what can be a dict key. More generally, we
make much clearer what are the things that need to be modified to add
a new supported key type to Dicts.
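
A self-contained sketch of the wrapping idea (deliberately not dynamo's actual `_Hashable`): keys are opaque wrappers whose hash and equality are defined by the tracker they hold, so the raw user object never doubles as a key.

```python
class Hashable:
    # Opaque wrapper: hashing and equality are defined in terms of the
    # wrapped tracker, and the wrapper type never escapes into user code.
    def __init__(self, vt):
        self.vt = vt

    def __hash__(self):
        return hash(id(self.vt))

    def __eq__(self, other):
        return isinstance(other, Hashable) and self.vt is other.vt


class TensorVariable:  # stand-in for a VariableTracker
    pass

k = TensorVariable()
d = {Hashable(k): "value"}     # "a dictionary from VTs to VTs"
assert d[Hashable(k)] == "value"
assert Hashable(k) != k        # the raw tracker is never itself the key
```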

Fixes [#107595](https://www.internalfb.com/tasks?t=107595)
Fixes [#111603](https://www.internalfb.com/tasks?t=111603)

Re-PR of https://github.com/pytorch/pytorch/pull/111196 sadly due to reverts, we could not reuse @lezcano's original PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116785
Approved by: https://github.com/mlazos
2024-01-06 03:35:35 +00:00
7c8f38700a [dynamo] Fix np.issubdtype (#116459)
Fixes the issue described at https://github.com/pytorch/pytorch/issues/93697#issuecomment-1828346590

This doesn't fix the full issue yet; now we hit:
```python
  File "/home/lezcano/git/pytorch/pytorch/torch/_dynamo/symbolic_convert.py", line 744, in step
    getattr(self, inst.opname)(inst)
  File "/home/lezcano/git/pytorch/pytorch/torch/_dynamo/symbolic_convert.py", line 1366, in BUILD_MAP
    assert (
AssertionError
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116459
Approved by: https://github.com/peterbell10
2024-01-05 01:48:07 +00:00
75dae4f691 Revert "[dynamo] Fix np.issubdtype (#116459)"
This reverts commit b5c33ccdb3198a48a354e21a4fdace0ec6d04146.

Reverted https://github.com/pytorch/pytorch/pull/116459 on behalf of https://github.com/zou3519 due to Broke CI, seems to be a landrace ([comment](https://github.com/pytorch/pytorch/pull/116459#issuecomment-1877135999))
2024-01-04 14:00:11 +00:00
b5c33ccdb3 [dynamo] Fix np.issubdtype (#116459)
Fixes the issue described at https://github.com/pytorch/pytorch/issues/93697#issuecomment-1828346590

This doesn't fix the full issue yet; now we hit:
```python
  File "/home/lezcano/git/pytorch/pytorch/torch/_dynamo/symbolic_convert.py", line 744, in step
    getattr(self, inst.opname)(inst)
  File "/home/lezcano/git/pytorch/pytorch/torch/_dynamo/symbolic_convert.py", line 1366, in BUILD_MAP
    assert (
AssertionError
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116459
Approved by: https://github.com/peterbell10
2024-01-04 03:55:50 +00:00
33917150d3 Cleanup scope ref properly (#116169)
Fixes https://github.com/pytorch/pytorch/issues/116143

See test in PR for a case where this happens. Discovered while debugging optimizers.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116169
Approved by: https://github.com/janeyx99, https://github.com/williamwen42, https://github.com/jansel
2023-12-28 23:29:37 +00:00
7e12e722af [Dynamo][12/N] Remove allowed_functions.py (#116401)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116401
Approved by: https://github.com/angelayi
2023-12-28 21:26:06 +00:00
f1cdb39da3 [dynamo] Fix handling of one_hot (#116338)
Fixes #115817

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116338
Approved by: https://github.com/yanboliang
2023-12-24 04:55:35 +00:00