50 Commits

Author SHA1 Message Date
4ab847bbc7 Pyrefly suppressions 4/n (#164615)
Adds suppressions to pyrefly will typecheck clean: https://github.com/pytorch/pytorch/issues/163283

Test plan:
dmypy restart && python3 scripts/lintrunner.py -a
pyrefly check

step 1: uncomment lines in the pyrefly.toml file
step 2: run pyrefly check
step 3: add suppressions, clean up unused suppressions
before: https://gist.github.com/maggiemoss/356645cf8cfe33123d9a27f23b30f7b1

after:

0 errors (2,753 ignored)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164615
Approved by: https://github.com/oulgen
2025-10-06 16:14:36 +00:00
35c4130fd1 [2/N] Fix ruff warnings (#164460)
Apply ruff `SIM` rules.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164460
Approved by: https://github.com/ezyang
2025-10-04 03:40:32 +00:00
a43c4c3972 [5/N] Apply ruff UP035 rule (#164423)
Continued code migration to enable ruff `UP035`. Most changes are about moving `Callable` from `typing` to `from collections.abc`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164423
Approved by: https://github.com/ezyang
2025-10-02 07:31:11 +00:00
a293206bd5 Fix invalid f-strings (#164112)
Fixes invalid f-strings detected by `ruff`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164112
Approved by: https://github.com/Skylion007, https://github.com/mlazos
2025-09-30 04:17:13 +00:00
8f6dbc0ba8 [scan] create fw and bw graphs via partitioning (#162754)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162754
Approved by: https://github.com/zou3519
ghstack dependencies: #161557, #161664, #161808, #162025, #161732
2025-09-27 18:13:15 +00:00
3413490f53 [scan] materialize combine_fn in forward add more autograd tests (#161732)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161732
Approved by: https://github.com/zou3519
ghstack dependencies: #161557, #161664, #161808, #162025
2025-09-27 18:13:15 +00:00
b85bee3bbb [hop] refactor check input alias and mutation to be a graph pass (#162025)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162025
Approved by: https://github.com/zou3519
ghstack dependencies: #161557, #161664, #161808
2025-09-27 18:13:15 +00:00
f5d85874dd [scan][be] remove unnecessary tensor checks (#161664)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161664
Approved by: https://github.com/Skylion007, https://github.com/zou3519
ghstack dependencies: #161557
2025-09-27 18:13:14 +00:00
07f07309c6 [associative_scan] Autograd separated (#139939)
This PR implements the Autograd feature of the associative_scan.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139939
Approved by: https://github.com/huydhn
2025-09-08 23:30:11 +00:00
5d819f3faf Revert "[associative_scan] Autograd separated (#139939)"
This reverts commit 103f725afa8dbf0204a1be6a042ab93aa16d85d8.

Reverted https://github.com/pytorch/pytorch/pull/139939 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I am seeing a weird failure after this lands in trunk ([comment](https://github.com/pytorch/pytorch/pull/139939#issuecomment-3267945657))
2025-09-08 20:42:47 +00:00
103f725afa [associative_scan] Autograd separated (#139939)
This PR implements the Autograd feature of the associative_scan.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139939
Approved by: https://github.com/ydwu4
2025-09-08 03:21:17 +00:00
cb9e2092a8 [scan] support gen_schema for scan (#158864)
We don't want to allow scan's combine_fn to mutate its inputs. The semantic of the mutation can be confusing. For example:
```python
def combine_fn(init, x):
```
If combine_fn mutates init, only first iteration mutates init, the rest of the iterations mutates the previous carry, which is an intermediate result. This is kind of a weird semantic because the only observable mutation is for init, which can be done outside of the combine_fn.

If combine_fn mutates x, where x is a slice of scanned inputs (i.e. xs), this pattern is more meaningful but we've not seen any use case yet.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158864
Approved by: https://github.com/zou3519
ghstack dependencies: #154193, #158965, #158863
2025-08-15 17:28:44 +00:00
64dc30c213 [HOP, map] Rework of map autograd to the new interface (#153343)
This PR reworks the current autograd implementation of map to the new interface.

@pytorchbot label "topic: not user facing"

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153343
Approved by: https://github.com/ydwu4
2025-08-06 23:02:42 +00:00
5d93127c87 Revert "[HOP, map] Rework of map autograd to the new interface (#153343)"
This reverts commit 24b1f10ca13d682430725c511812e43a35fcd6a6.

Reverted https://github.com/pytorch/pytorch/pull/153343 on behalf of https://github.com/yangw-dev due to a older pr this pr dependes on needed to revert, rebase it after it's in ([comment](https://github.com/pytorch/pytorch/pull/153343#issuecomment-3133816812))
2025-07-29 19:46:42 +00:00
cyy
f6c89c1ef3 Detach tensor before clone in SGD optimiser and other code (#159204)
Reverse the pattern of tensor clone followed by detach in SGD and other code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159204
Approved by: https://github.com/Skylion007
2025-07-27 03:31:12 +00:00
24b1f10ca1 [HOP, map] Rework of map autograd to the new interface (#153343)
This PR reworks the current autograd implementation of map to the new interface.

@pytorchbot label "topic: not user facing"

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153343
Approved by: https://github.com/ydwu4
2025-07-25 21:17:06 +00:00
7f14b42adf [BE][2/16] fix typos in torch/ (torch/_*/) (#156312)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156312
Approved by: https://github.com/albanD
2025-07-12 05:47:06 +00:00
e15f4248ad Revert "[BE][2/16] fix typos in torch/ (torch/_*/) (#156312)"
This reverts commit 7a92b5119654c07d15f5c0818e6ae804b01e836c.

Reverted https://github.com/pytorch/pytorch/pull/156312 on behalf of https://github.com/XuehaiPan due to landrace ([comment](https://github.com/pytorch/pytorch/pull/156312#issuecomment-3064672250))
2025-07-12 04:40:52 +00:00
7a92b51196 [BE][2/16] fix typos in torch/ (torch/_*/) (#156312)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156312
Approved by: https://github.com/albanD
2025-07-12 01:47:22 +00:00
162ca185ff [BE][PYFMT] migrate PYFMT for torch/_[a-h]*/ to ruff format (#144551)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144551
Approved by: https://github.com/ezyang
ghstack dependencies: #148186
2025-06-25 06:16:06 +00:00
fac0cc16ef [scan] fix doc of scan and list the restrctions. (#155577)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155577
Approved by: https://github.com/zou3519
2025-06-12 18:22:28 +00:00
b084e1b81c [HOP] Rework Autograd DispatchKey for scan and map (#153336)
This PR introduces the `py_autograd_impl` instead of the `DispatchKey.Autograd` for some HOPs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153336
Approved by: https://github.com/ydwu4
2025-06-04 20:54:02 +00:00
faf973da5e [refactor] move materialize_as_graph to _higher_order_ops/utils.py (#154070)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154070
Approved by: https://github.com/zou3519
2025-05-31 00:06:44 +00:00
68034198e5 [HOP] Mutation and alias rework (#146658)
This PR reworks the way the input mutations and various aliases are checked

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146658
Approved by: https://github.com/ydwu4
2025-05-18 08:05:22 +00:00
ea12a38668 [associative_scan] Refactoring of input checking and dynamo invocation (#148657)
This PR is the counterpart of https://github.com/pytorch/pytorch/pull/142125 for the associative_scan operation. The way the input checks are performed and the combine_fn is not invoked in the frontend to check the output trees, but rather dynamo is used for that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148657
Approved by: https://github.com/ydwu4
2025-05-02 21:39:28 +00:00
a2632d5241 [HOP] Reworked DispatchKey.Autograd (#151107)
This PR intends to rework the dispatching of the autograd key.
I.e., currently the DispatchKey.Autograd of the HOPs was triggered, even if non of the operands of the HOP have `requires_grad=True`. With this rework, the autograd is bypassed if non of the operands require gradients and only invoked if any of the operands require gradients.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151107
Approved by: https://github.com/ydwu4
2025-04-15 19:55:46 +00:00
b0bdd76f2e [scan] Autograd with partial gradient support (#146285)
This PR introduces the Autograd feature for scan with partial gradient support. It is a combination of the already opened PRs: https://github.com/pytorch/pytorch/pull/135631 and https://github.com/bohnstingl/pytorch/pull/4

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146285
Approved by: https://github.com/ydwu4

Co-authored-by: Yidi Wu <yidi@meta.com>
2025-04-14 17:01:31 +00:00
cd5c13d8f0 [hop] Rework the check of Metadata in the functionalization key (#148789)
This PR is a more cosmetic rework of the metadata check performed by some HOPs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148789
Approved by: https://github.com/ydwu4
2025-03-18 20:30:59 +00:00
23441492f6 [scan] Refactoring of input checking and dynamo invocation (#142125)
This PR does a refactoring of the way dynamo is invoked and how the input shapes are checked for scan and for associative_scan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142125
Approved by: https://github.com/ydwu4
2025-03-06 01:06:54 +00:00
e4c558be1d [scan] Corrections for scan (#146110)
This PR resolves some minor issues with the scan HOP and unifies the handling of the additional_inputs in the same way as for associative_scan.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146110
Approved by: https://github.com/ydwu4
2025-03-04 20:29:08 +00:00
7c71ab1d40 [scan] User-facing reverse flag handling (#147886)
This PR removes the reverse flag from the backend implementation and resolves it via `torch.flip` in the frontend.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147886
Approved by: https://github.com/ydwu4
2025-02-26 20:04:57 +00:00
9f9904172d [scan] scan dim handling in user-facing scan() (#145179)
This PR introduces the capability that the scan dim is handled in the user facing scan() call. Internally, the scan dim is always shifted to dim 0 and then the scan is performed over that dim.

This is a follow-up PR from https://github.com/bohnstingl/pytorch/pull/3

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145179
Approved by: https://github.com/ydwu4
2025-01-30 21:09:07 +00:00
3c247ee8c4 [hop][be] add utils for more comprehensive input alias and mutation (#145298)
This PR implements  the idea of checking input mutations through tensor version and check aliasing via storage  from @zou3519. Previously, we rely on whether there's a in place op that takes placeholder input, which doesn't take views into account.

When writing the PR, I also noticed a bug in previous input mutation checking logic: we were checking the whether there are operators functionalized_f where all the mutating ops have been replaced so we won't be able to detect any thing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145298
Approved by: https://github.com/zou3519
2025-01-23 18:12:28 +00:00
805c4b597a PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145202
Approved by: https://github.com/bobrenjc93
2025-01-20 22:37:26 +00:00
dc23f1944a Remove unused Python variables in torch/[_-a]* (#133492)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-12 17:39:14 +00:00
5c97ac9721 Revert "Remove unused Python variables in torch/[_-a]* (#133492)"
This reverts commit fda975a7b3071a20dab8fc2c4e453479e1bb7cf2.

Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else.  The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))
2024-12-11 17:29:12 +00:00
fda975a7b3 Remove unused Python variables in torch/[_-a]* (#133492)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-10 21:48:44 +00:00
ab42967238 [hop free symbols] lift free symbols in example_value when create_graph_input (#138363)
There are 4 parts (they are hard to further break into smaller ones cause they're highly coupled) in this PR:
1. **Whenever we call create_graph_input, we try to bind the symbols in the graph input.**
We've enforced the invariant that all create_graph_inputs calls must provide an example value, we could intercept at the create_graph_input calls (This PR only handles free symbols in tensors).
2. **We cache the bound_symbols** to avoid lift the same symbol repeated.
3. For lifted symbols, we re-used  **lifted_freevars** i.e. the mapping between symbol proxy in parent graph to the lifted phs in current subgraph, which we handle lifted tensors. In this way, all hops that supports lifted tensors should be able to handle lifted_symints automatically (at least in dynamo part).
4. For **unbacked symbols** created during tracing, we need to also bound these symbols to its proxy. This is to support the tests cases where we want to lift unbacked symbols as input. We need the proxy of the unbacked symbol in parent graph in order to properly create the args to the hop.
5. We change all the tests after free symbols are lifted in subgraphs. And also supports the lifted symbols in existing higher order ops.

**The interaction of nested tracers:**
The previous design for lifting tensor closures is that: suppose we're in nested tracers, whenever we see a new proxy that's not created by create tracer, we recursively look for the proxy in parent tracer until we find the tracer that creates this proxy (either a placeholder or some intermediate results). More detail is in Note [Nested SubgraphTracer and free_variable handling].

Given the above design, the plan for lifting the free symbols is: whenever we lift a free tensor to be the inputs of current subgraph, we'll look at the symbols in it and bind the symbols at the same time.

For example, suppose we have the following function:
```python
def f(x: [s1, s2]):
  def true_f():
    def true_f_inner():
      return x.sin()
```
what will happen in time order:

1. we create a subtracer 1 and start to speculate the outer cond's true_f
2. we create a another subtracer 2 and start to speculate the inner cond's true_f_inner.
3. dynamo realize the tensor input x by calling wrap_tensor in top-level to create graph input x (tracer 0), we bind the symbol s1, s2 after ph for x is created. So the graph now looks like:
```python
def gm(s1, s2, x):
```
4. when seeing TensorVariable.call_method of x,  tracer2 wants to create a call_function(sin, proxy_of_x), but it finds that proxy_of_x is not created by current tracer. So it recursively look up its parent tracer1 and find parent tracer1 also doesn't track this proxy_of_x then it finds the root tracer0, who is the creator of it and tracks it as a ph. Then tracer 1 create_graph_input  to lift the closure to its input ph1 and add (proxy_of_x: ph1) k-v in **lifted_freevars**  of tracer 1.
Now the graph looks like:
```python
def gm(s1, s2, x):
  def true_gm(x):
```
5. Since there are free symbols inside this new tensor input, tracer 1 also binds the symbols (maybe_bind_symbol), which calls create_graph_input for s1 and s2. Now the graph looks like
```python
def gm(s1, s2, x):
  def true_gm(s1, s2, x):
```
6. then it goes back to tracer 2, and call create_graph_input for x and get ph2, tracer 2's **lifted_freevars** records (ph1, ph2). and tracer 2 also binds the symbols in this new tensor input. Now the graph looks like:
```python
def gm(s1, s2, x):
  def true_gm(s1, s2, x):
    def true_gm_inner(s1, s2, x):
```
7. Finally the sin call_function node is created by tracer 2.

**This PR also handles the following cases:**
- What if we lift two tensors share the same symbol? e.g. x1 [s1, s2], x2 [s2, s3]? Each subtracer maintains bound_symbols as a cache that maps a symbol.expr to its proxy in current tracer. So when we see x1, we'll track s1 and s2 as inputs and bound s1 to ph1, s2 to ph2. So when we try to bind symbols of x2, s2 will already be tracked so no graph input is created.
- what if a subgraph close over a symint? e.g.
```python
def f(x):
  def true_f():
    c = x.size(0)
   def true_fn_inner():
     return c
```
When we speculate true_fn_inner, we find proxy_of_c is not tracked by tracer 2, so it recursively looks up its parent. At this point, x and its symbols have been lifted as input of true_f (as a result of lifting x during tracing true_f in tracer 1. Specifically the graph looks like:
```python
def gm(s1, s2, x):
  def true_gm(s1, s2, x):
    def true_gm_inner():
```
So tracer 2 is able to find that s1 have been tracked as ph in tracer 1 so it returns back to gm and call create_graph_input on s1. The graph now looks like:
```python
def gm(s1, s2, x):
  def true_gm(s1, s2, x):
    def true_gm_inner(s1):
     return s1
```

-  What if subgraph close over an unbacked symint? e.g.
```python
def f(x):
  def true_f():
    c =  x.item()
    def true_f_inner():
      return c
```
When x.item() is called, proxy_of_c and its symnode variable is created for tracer 1, and we also call track_unbacked_symbols to record this relationship. So when tracer 2 finds proxy_of_c is not created by current tracer, it recursivelly looks up its parent tracer and finds that that expression u0 has been tracked as a result of track_unbacked_symbol in tracer 1. So it will stop the recursion and create_graph_input u0 in tracer 2. Graph looks like:
```python
def f(x):
  def true_f(s1, s2, x):
    c = x.item()
    def true_gm_inner(u0):
      return u0
    cond(pred, true_gm_inner, false_gm_inner, (c,))
```

- what if subgraph close over a tensor with unbacked symint shape?
```python
def f(x):
  def true_f():
    c = x.item()
    r = torch.randn((c,))
    def true_f_inner():
      return r + 1
```
This is the same as the case of closing over tensors with backed shapes. where we first lift r, then bind u0 in it, which recursively bind_symint of u0 in its parent and found u0 is tracked in parent tracer as a result of .item() call.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138363
Approved by: https://github.com/zou3519
2024-11-07 04:44:32 +00:00
1d28b8b6d5 Revert "Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)"
This reverts commit e84d1121ad66a453c8c24fcc098625e2e9764fca.

Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/ZainRizvi due to Sorry but this is breaking internally. More details in D65483292 ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2458381056))
2024-11-05 23:10:38 +00:00
e84d1121ad Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)
This PR is split from PR #126898.

- #126898

------

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690
Approved by: https://github.com/Skylion007, https://github.com/malfet
2024-11-05 10:44:56 +00:00
c6bb9b53f4 [scan] better error handling and remove redundant tests (#137967)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137967
Approved by: https://github.com/zou3519
2024-10-25 19:01:25 +00:00
0bfa1bf21d [scan] support closure (#135602)
This PR adds an additional_inputs argument to support closures similar to what we've done for while_loop.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135602
Approved by: https://github.com/zou3519
ghstack dependencies: #135600, #135601
2024-10-16 22:28:03 +00:00
819d6b139c [scan] flatten subgraph output and make subgraph inputs to be a slice (#135601)
This pr introduces two changes:
1. Before this pr, the subgraphs output is ([], []), in this pr, we change it to a flattened list for easier codegen and consistency with other control flow operators.

2. Before the PR, the combine_fn of scan takes a sliced input but keep the sliced dimension. For exmaple, suppose xs = torch.randn(3, 4, 5) and we scan over dim 0, the combine_fn looks like:
```
# x.shape = (1, 4, 5) instead of (4, 5)
def combine_fn(carry, x):
  ...
```

In this PR, we fixed this and also simplify some of the slicing logic.

3. this diff also make sure we always stack ys on fist dimension.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135601
Approved by: https://github.com/zou3519
ghstack dependencies: #135600
2024-10-16 22:28:03 +00:00
0437a22d43 [scan] fix typo in signature and remove wrapper (#135600)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135600
Approved by: https://github.com/zou3519
2024-10-16 22:27:59 +00:00
228760b945 [Dynamo] Use custom backend to reenter metadata tf mode when tracing while/cond (#134732)
For tracing cond/while in eager, we trace the HOP with the eager backend with metadata torchfunction mode enabled. HOPs disallow the mutation that occurs in this torch function mode, so it is not able to be traced. As a result, we use a custom backend which enters this mode for tracing these HOPs. Thanks to @ydwu4 for the help with implementing this

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134732
Approved by: https://github.com/ydwu4
2024-09-14 18:52:22 +00:00
23dec79cef Revert "[Dynamo] Use custom backend to reenter metadata tf mode when tracing while/cond (#134732)"
This reverts commit 731b178b56c83966d6e8cdfb0015d22d8f91b4d2.

Reverted https://github.com/pytorch/pytorch/pull/134732 on behalf of https://github.com/mlazos due to broke python test/quantization/pt2e/test_numeric_debugger.py TestNumericDebugger.test_re_export_preserve_handle modified yesterday ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2350937008))
2024-09-14 10:02:55 +00:00
731b178b56 [Dynamo] Use custom backend to reenter metadata tf mode when tracing while/cond (#134732)
For tracing cond/while in eager, we trace the HOP with the eager backend with metadata torchfunction mode enabled. HOPs disallow the mutation that occurs in this torch function mode, so it is not able to be traced. As a result, we use a custom backend which enters this mode for tracing these HOPs. Thanks to @ydwu4 for the help with implementing this

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134732
Approved by: https://github.com/ydwu4
2024-09-14 02:40:32 +00:00
7ed0563cad Revert "[Dynamo] Use custom backend to reenter metadata tf mode when tracing while/cond (#134732)"
This reverts commit e504fb70693d4a3741c3380b6a989d441e84f737.

Reverted https://github.com/pytorch/pytorch/pull/134732 on behalf of https://github.com/albanD due to Broke tests on main ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2348886378))
2024-09-13 12:52:58 +00:00
e504fb7069 [Dynamo] Use custom backend to reenter metadata tf mode when tracing while/cond (#134732)
For tracing cond/while in eager, we trace the HOP with the eager backend with metadata torchfunction mode enabled. HOPs disallow the mutation that occurs in this torch function mode, so it is not able to be traced. As a result, we use a custom backend which enters this mode for tracing these HOPs. Thanks to @ydwu4 for the help with implementing this

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134732
Approved by: https://github.com/ydwu4
2024-09-13 08:40:50 +00:00
e889252493 Implementation of scan (#134102)
This operation is supposed to be the pendant to the `associative_scan`, but can operate with non-associative functions.

@ydwu4

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134102
Approved by: https://github.com/ydwu4
2024-09-10 04:51:16 +00:00