Commit Graph

793 Commits

Author SHA1 Message Date
f0078941cf Revert "[RELAND] Always build USE_DISTRIBUTED (#160449) and Make distributed modules importable even when backend not built (#159889) (#162594)"
This reverts commit 6c334885d48725197b5d35e2c1543efc0f4198d0.

Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/wdvr due to reverted internally - @ezyang see D82281294 ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3317017530))
2025-09-22 05:39:07 +00:00
0b59492853 [export] Fix wrap_with_set_grad_enabled retracing (#163295)
Fixes https://github.com/pytorch/pytorch/issues/163294

The code `with torch.set_grad_enabled(enable_grad)` calls `torch._C._set_grad_enabled` three times -- (1) when [initializing set_grad_enabled](bb7c9a2d41/torch/autograd/grad_mode.py (L187C9-L187C35)), (2) when [entering the context](bb7c9a2d41/torch/autograd/grad_mode.py (L194)), and (3) when [exiting the context](bb7c9a2d41/torch/autograd/grad_mode.py (L197)).

This results in the the retraced export module to have a duplicate `torch._C._set_grad_enabled` like:
```
def forward(self, arg0_1):
    add = torch.ops.aten.add.Tensor(arg0_1, 1);  arg0_1 = None
    _set_grad_enabled = torch._C._set_grad_enabled(False);  _set_grad_enabled = None
    _set_grad_enabled = torch._C._set_grad_enabled(False);  _set_grad_enabled = None
    add_1 = torch.ops.aten.add.Tensor(add, 2);  add = None
    _set_grad_enabled_1 = torch._C._set_grad_enabled(True);  _set_grad_enabled_1 = None
    add_2 = torch.ops.aten.add.Tensor(add_1, 3);  add_1 = None
    return (add_2,)
```

When export runs the `replace_set_grad_with_hop_pass`, it will look through the graph for `torch._C._set_grad_enabled` and create subgraphs. The duplicate `torch._C._set_grad_enabled` results in an empty submod in the graph, which resulted in an error in [this post](https://fb.workplace.com/groups/1028545332188949/posts/1844720036398281/?comment_id=1862175381319413).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163295
Approved by: https://github.com/yushangdi
2025-09-21 22:54:40 +00:00
d70c0babf5 minimize graph capture output (#162211)
Currently OutputGraphGuardsState is separated out as a serializable interface for OutputGraph, but some of the typing around it is incorrect in dynamo's guards.py and output_graph.py: more fields are used by code than claimed by OutputGraphGuardsState, and it works because either the full OutputGraph is passed in or the parts that use those fields are dead when OutputGraphGuardsState is passed in.
In this PR we try to further separate the necessary fields of OutputGraph that should be retained by a full graph capture mechanism, not just limited to dynamo (as it is currently) but also something like make_fx (in the future). Since these fields do not need to be serialized, the result is an intermediate "common" data structure that is between OutputGraphGuardsState and OutputGraph in the inheritance hierarchy.

Differential Revision: D81718791

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162211
Approved by: https://github.com/zhxchen17
2025-09-20 15:52:28 +00:00
56893ca1f6 Don't register wrong overload to prim decomp (#163138)
These decompositions take precedence before CIA decomps in fake tensor prop, as a result, we would hit this implementation for all where overloads which is wrong in some cases. For the overloads that can't be implemented by this decomp, we just run the default CIA impl. Previously this doesn't matter because in post-dispatch IR, aten.where would have decomposed but when user tries to preserve aten.where this issue will surface because fake tensor will start seeing aten.where.

Differential Revision: [D82604702](https://our.internmc.facebook.com/intern/diff/D82604702)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163138
Approved by: https://github.com/henryoier, https://github.com/ezyang
2025-09-18 17:01:19 +00:00
0e9f9c3a61 Fix inconsistent test and add new tracer as config (#162558)
It is better to have the new tracer as global config that can be manipulated easily. Also I believe dynamo-like config infra is useful instead of relying on custom way of patching stuff.

Differential Revision: [D82478649](https://our.internmc.facebook.com/intern/diff/D82478649)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162558
Approved by: https://github.com/zhxchen17
ghstack dependencies: #162557
2025-09-17 17:01:48 +00:00
e28983be76 Add decomp rule to assert_tensor_metadata for BatchedTensors (#163008)
Whenever there is device move, export introduces assert_tensor_metadata aten operator to make sure to guard for device specialization. This aten op didn't work with Vmap because we didn't register explicit decomp rule saying we just skip BatchedTensor and call it on underlying tensor

Differential Revision: [D82483979](https://our.internmc.facebook.com/intern/diff/D82483979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163008
Approved by: https://github.com/huydhn
2025-09-17 03:49:41 +00:00
489860f3c2 Prefer_deferred_runtime_asserts should be propagated to new tracer (#162556)
Differential Revision: [D82478650](https://our.internmc.facebook.com/intern/diff/D82478650)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162556
Approved by: https://github.com/avikchaudhuri
ghstack dependencies: #162487
2025-09-16 21:25:00 +00:00
a4e74f416b Fix error message (#162487)
More proper fix here should be that we directly replace shape_env with correct sources but it is bit involved as we have to manually construct dynamo sources by hand (need to handle list/dict etc) but it is quite easy if we are operating on a string so i do this as post-processing step for now.

Differential Revision: [D82478647](https://our.internmc.facebook.com/intern/diff/D82478647)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162487
Approved by: https://github.com/zhxchen17
2025-09-16 19:06:30 +00:00
48dbd60df4 are_strides_like_channels_last_or_false (#162354)
Note this could change suggest_memory_format behaviour for unbacked

we used to return True for are_strides_like_channels_last sometimes even when results undecided
now when its not decided we return False.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162354
Approved by: https://github.com/aorenste
2025-09-16 00:49:05 +00:00
543d50db2b Fix torch export with dict input nested in args (#162618)
Investigated together with @pyemma and @taotaohuang001

## Problem
when calling exported module with dict nested in the args tuple, it will make following complaits
```
Traceback (most recent call last):
  File "/home/chzhu/infinitrain/test_torch_export.py", line 32, in <module>
    print(exported_model({"a2": torch.randn(10), "a1": torch.randn(10)}))
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/fx/graph_module.py", line 848, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/fx/graph_module.py", line 424, in __call__
    raise e
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/fx/graph_module.py", line 411, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1879, in _call_impl
    return inner()
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1806, in inner
    args_kwargs_result = hook(self, args, kwargs)  # type: ignore[misc]
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 929, in _fn
    return fn(*args, **kwargs)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/export/_unlift.py", line 81, in _check_input_constraints_pre_hook
    flat_args_with_path = _check_inputs_match(args, kwargs, self._in_spec)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/export/_unlift.py", line 64, in _check_inputs_match
    raise ValueError(  # noqa: B904
ValueError: Trying to flatten user inputs with exported input tree spec:
TreeSpec(tuple, None, [TreeSpec(tuple, None, [TreeSpec(dict, ['a1', 'a2'], [*,
      *])]),
  TreeSpec(dict, [], [])])
but actually got inputs with tree spec of:
TreeSpec(tuple, None, [TreeSpec(tuple, None, [TreeSpec(dict, ['a2', 'a1'], [*,
      *])]),
  TreeSpec(dict, [], [])]).
Please check that the inputs have the same number and type of args and kwargs as the ones you used when tracing.

```

## How to reproduce the issue
```python
import torch

# create a nn.Module with data_batch as input and output as output
class MyModel(torch.nn.Module):
   def __init__(self):
       super(MyModel, self).__init__()
       self.linear = torch.nn.Linear(10, 1)

   def forward(self, data_batch):
       h1 = self.linear(data_batch["a1"])
       h2 = self.linear(data_batch["a2"])
       return h1 + h2

# torch export this module
model = MyModel()
example_args_forward = (
   {
       "a1": torch.randn(10),
       "a2": torch.randn(10),
   },
)
exported_model = torch.export.export(model, example_args_forward, strict=True)

# save the exported model
torch.export.save(exported_model, "exported_model.pt2")

# load the exported model
exported_model = torch.export.load("exported_model.pt2").module()

# run the exported model
print(exported_model({"a2": torch.randn(10), "a1": torch.randn(10)}))

```

## Root Cause
Input spec is encoded as [TreeSpec](582d278983/torch/utils/_pytree.py (L1059)) in torch export. With (args, kwargs) at the top level. When we call the exported model, it has a pre-execution [hook](582d278983/torch/export/_unlift.py (L66)) to check the input TreeSpec matches the received TreeSpec, where in Treespec, the dict key order is preserved. Something like

TreeSpec(dict, ['a2', 'a1'], [*,*])

To workaround this, the input check reorders [kwargs](582d278983/torch/export/_unlift.py (L67)), that is why kwargs can be out of order. But the dict nested in the args is not re-ordered, so any re-ordering of the keys will throw errors.

## Solution
Update eq_spec to handle the dict case, where we only guarantee that key set is the same without ordering constraints.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162618
Approved by: https://github.com/angelayi
2025-09-13 03:24:30 +00:00
6c334885d4 [RELAND] Always build USE_DISTRIBUTED (#160449) and Make distributed modules importable even when backend not built (#159889) (#162594)
Summary:
Original: D81957844 and D81957923

Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well

#buildall

Test Plan:
sandcastle and oss ci

Rollback Plan:

Reviewed By: H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594
Approved by: https://github.com/H-Huang, https://github.com/dcci
2025-09-12 10:54:42 +00:00
6b59a19242 Revert "[RELAND] Always build USE_DISTRIBUTED (#160449) and Make distributed modules importable even when backend not built (#159889) (#162594)"
This reverts commit 6e8f17c58029e5fa6bc222b2445ebbc0cbdc17c7.

Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3283985880))
2025-09-12 06:52:03 +00:00
6e8f17c580 [RELAND] Always build USE_DISTRIBUTED (#160449) and Make distributed modules importable even when backend not built (#159889) (#162594)
Summary:
Original: D81957844 and D81957923

Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well

#buildall

Test Plan:
sandcastle and oss ci

Rollback Plan:

Reviewed By: H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594
Approved by: https://github.com/H-Huang, https://github.com/dcci
2025-09-12 03:56:18 +00:00
501e19137a fix var args for shape guards (#162633)
Summary: Fixes #162599

Test Plan:
added test based on repro

Rollback Plan:

Differential Revision: D82144520

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162633
Approved by: https://github.com/tugsbayasgalan
2025-09-12 00:33:35 +00:00
463fbc8ca0 Support vmap + custom autograd function/improve DTensor constructor inefficiency (#162240)
This makes gemma3 exportable on transformers=4.55.4

In HF, there is a torch funciton mode called TransformGetItemToIndex which internally calls custom autograd function. When this custom autograd function is called under vmap, It triggers CustomFunctionHigherOrderOP which error-ed because there was no pre-dispatch proxy mode implementation.

Since there are number of requests lately to add various operators in pre-dispatch IR, I introduce a decorator in export that works similar to `allow_in_graph`. Basically:
1) We intercept custom_autograd_function.apply at pre-dispatch mode when this decorator is applied
2) We apply `flat_apply` HOP to hide the pytree spec for this autograd function. Note that this adds restriction that this custom autograd function needs to take in fx-able types.
3) subclass constructor decorator is implemented similarly, so we just refactor it to use similar implementation as this new decorator. eventually we should delete the subclass constructor decorator.
4) Move some code in subclass constructor decorator to exit early in non-export environment which should shave off some inefficiency (around 1% according to @swolchok 's benchmark)

Fixes: https://github.com/pytorch/pytorch/issues/161563#issuecomment-3246309758

Differential Revision: [D82141316](https://our.internmc.facebook.com/intern/diff/D82141316)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162240
Approved by: https://github.com/ydwu4
2025-09-11 17:42:41 +00:00
fccddf02b6 repro 161902 (#162416)
Summary:
Sometimes `ShapeEnv.create_symbol` can return a `sympy.Integer`. This messes up our phantom symbol infra for derived dims.

Fixes #161902

Test Plan:
added test based on repro

Rollback Plan:

Differential Revision: D81960709

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162416
Approved by: https://github.com/tugsbayasgalan
2025-09-11 16:35:23 +00:00
c924c675d0 Fix persistent buffer bug (#162190)
For non-persistent buffers, we should properly register them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162190
Approved by: https://github.com/zhxchen17
2025-09-11 14:56:26 +00:00
6d65737aee testing infra and some fixes (#162183)
This PR is quite large in that it covers most of rough edges in the new strict export flow:

1. Handle nn_module_stack correctly now that we are tracing wrapper module
2. module_call_spec needs to get queried from source directly because we are not running the bytecode anymore.
3. Correct input and output handling.

@diff-train-skip-merge

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162183
Approved by: https://github.com/zhxchen17
2025-09-10 20:48:12 +00:00
de05dbc39c Replace export_for_training with export (#162396)
Summary: replace export_for_training with epxort

Test Plan:
CI

Rollback Plan:

Differential Revision: D81935792

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162396
Approved by: https://github.com/angelayi, https://github.com/jerryzh168
2025-09-10 14:19:34 +00:00
dda071587f Revert "Make distributed modules importable even when backend not built (#159889)" (#162568)
This reverts commit a0d026688cd69583d5a4e0c6f3e5fda141a7f4a9.

Revert "Always build USE_DISTRIBUTED. (#160449)"

This reverts commit d80297a6846f1f2c36fd4f19e22919f2abe8fcea.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162568
Approved by: https://github.com/huydhn
2025-09-10 04:29:42 +00:00
60d009267e Revert "testing infra and some fixes (#162183)"
This reverts commit d8b6622bb6a3879d3832ab6cdc26ff4188ea4a2d.

Reverted https://github.com/pytorch/pytorch/pull/162183 on behalf of https://github.com/huydhn due to Failing a test on macos ([comment](https://github.com/pytorch/pytorch/pull/162183#issuecomment-3268922096))
2025-09-09 05:26:32 +00:00
d8b6622bb6 testing infra and some fixes (#162183)
This PR is quite large in that it covers most of rough edges in the new strict export flow:

1. Handle nn_module_stack correctly now that we are tracing wrapper module
2. module_call_spec needs to get queried from source directly because we are not running the bytecode anymore.
3. Correct input and output handling.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162183
Approved by: https://github.com/zhxchen17
ghstack dependencies: #162167
2025-09-09 02:42:11 +00:00
711c8c821e shape guards (#161178)
Summary: This PR introduces shape guards to export. Previously only value ranges,  equalities, and specializations would be tracked for symbolic expressions, and we had a forward hook to check them. Instead now we create a function to check shape guards and call it in the exported program.

Test Plan:
updated several tests

Rollback Plan:

Differential Revision: D80713603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161178
Approved by: https://github.com/tugsbayasgalan
2025-09-08 22:44:09 +00:00
d80297a684 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-08 19:10:36 +00:00
1e0656f063 Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit de893e96c775023aa3be895060848fac3296772c.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to internal changes breaks import checks, see [D81845053](https://www.internalfb.com/diff/D81845053) ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3264887002))
2025-09-08 07:04:36 +00:00
de893e96c7 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-05 20:15:11 +00:00
adae7f66aa Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit c37103234afc832dcad307e9016230810957c9d5.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to Breaking internal build rules, see D81756619 ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3259430011))
2025-09-05 18:58:47 +00:00
5c67426d68 [dynamo] Add support for const prop on .item (#162204)
Fixes some of the errors in https://fb.workplace.com/groups/1028545332188949/permalink/1303030824740397/

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162204
Approved by: https://github.com/williamwen42
2025-09-05 00:28:49 +00:00
c37103234a Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-04 19:43:17 +00:00
d1a15abfdc export: add explicit decomposition for aten.expand_copy and unit test (#161688)
Fixes #161080
torch.export.export fails with TypeError: expand() got an unexpected keyword argument 'implicit' when calling torch.expand_copy(..., implicit=True). This happened because expand_copy = _make_copy_from_view(aten.expand) register aten. expand as the decomposition path for aten.expand_copy, which doesn’t accept the implicit argument.

I have added an explicit a decomposition for aten.expand_copy in torch/_decomp/decompositions.py to ignore the implicit argument, and a simple unit test to demonstrate the bug being fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161688
Approved by: https://github.com/angelayi, https://github.com/can-gaa-hou
2025-09-04 18:16:56 +00:00
248355faf5 Don't require FakeStore to be passed into fake backend (#162164)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162164
Approved by: https://github.com/bdhirsh, https://github.com/albanD, https://github.com/wconstab
2025-09-04 16:43:49 +00:00
b7dad7dd49 Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit 90b08643c3a6eb1f3265b7d1388bd76660759f46.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to Already discussed with @ezyang about the internal quirks and errors ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3254219358))
2025-09-04 15:25:07 +00:00
3c45af079a kill allow_complex_guards_as_runtime_asserts (#161794)
Summary:
[reland]
Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was a export-only concept).

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D81334984

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161794
Approved by: https://github.com/zhxchen17
2025-09-04 00:17:01 +00:00
92576a594b Prototype for building non-strict leak detector (#160456)
Summary:
Our strategy for detecting fake tensor leakage in non-strict for outside scope (side effects happening outside of model.forward) is:
1. We do gc.collect() before export and get the alive fake tensors
2. We dump the proxy to fake tensor map from make_fx tracer
3. We query gc again to get alive fake tensors
4. We take the delta between (1) and (3)
5. Filter out fake tensors that are:
    1. Associated with `TrackedFake` (input tracking thing in symbolic_shapes)
    2. Associated with `gm.meta`
6. Do ID match with the proxies and emit their stacktraces.

We rely on (https://github.com/pytorch/pytorch/pull/159923) for other sources of leakages such as:
1. We failed to proxy an operator (like param.data)
2. We cache some tensor in model.forward (https://github.com/pytorch/pytorch/issues/155114)

In general, we notice `gc.collect()` and query-ing gc for live objects are kinda slow. So we turn on this feature under env variable. We should document on export public facing documents that if you run into weird errors regarding fake tensors, they should look into turning on this env variable for further analysis.

Test Plan:
Test plan

Rollback Plan:

Differential Revision: D80003204

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160456
Approved by: https://github.com/pianpwk
2025-09-03 19:21:27 +00:00
90b08643c3 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-03 07:33:55 +00:00
4e42aa8ffc Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit b7034e9c924412bfbe8ee25a22d7e95239b5ca65.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to Breaking internal builds, can't be landed with forward fix due to internal tooling problems ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3246689684))
2025-09-02 20:28:42 +00:00
b7034e9c92 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-01 23:00:21 +00:00
9b67d8e344 Revert "[RELAND] Close some sources of fake tensor leakage (#161589)"
This reverts commit 5790b009751e6ebba35d3e6d05e7c1b135553eee.

Reverted https://github.com/pytorch/pytorch/pull/161589 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/17305150611/job/49128381649) [HUD commit link](5790b00975) ([comment](https://github.com/pytorch/pytorch/pull/161589#issuecomment-3235224249))
2025-08-28 23:19:36 +00:00
47742081c9 Revert "kill allow_complex_guards_as_runtime_asserts (#160198)"
This reverts commit 69d91b94ba5366f4444d8cb8fd3dab4de4f04d3d.

Reverted https://github.com/pytorch/pytorch/pull/160198 on behalf of https://github.com/jeffdaily due to let's revert again instead of waiting for forward fix, see earlier comments ([comment](https://github.com/pytorch/pytorch/pull/160198#issuecomment-3235165462))
2025-08-28 22:50:37 +00:00
69d91b94ba kill allow_complex_guards_as_runtime_asserts (#160198)
Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was a export-only concept).

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D79903317

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
2025-08-28 19:36:19 +00:00
a8270dd124 Revert "kill allow_complex_guards_as_runtime_asserts (#160198)"
This reverts commit 196232bb935cb346f143d5c39e9a73c44121a033.

Reverted https://github.com/pytorch/pytorch/pull/160198 on behalf of https://github.com/atalman due to dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_triton_kernel_cuda [GH job link](https://github.com/pytorch/pytorch/actions/runs/17289619543/job/49074475338) [HUD commit link](196232bb93) ([comment](https://github.com/pytorch/pytorch/pull/160198#issuecomment-3234013520))
2025-08-28 15:40:37 +00:00
5790b00975 [RELAND] Close some sources of fake tensor leakage (#161589)
Reland of https://github.com/pytorch/pytorch/pull/159923

Couple of fixes:
1. When we run into an operation we didn't proxy, we end up emitting fake constants. We detect this and warn using the FQN of the lifted constant. We warn because some internal users complained it was regressing their exportability.

2. Previous attribute mutation detection logic in non-strict didn't account for nested module structure. This fixes silent incorrectness issue of exporting esm and qwen in non-strict

3. We modify yolov3 to fix the previous silent incorrect behaviour
4. We use strict export for levit_128 because it errors in non-strict due to more strict side effect checking

When upgrading torchbench pin, opacus_cifar10 seems to not run on eager anymore. I verified this by pushing a temporary PR on master with new pin. So i added it to expect_fail list.

Differential Revision: [D81133908](https://our.internmc.facebook.com/intern/diff/D81133908)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161589
Approved by: https://github.com/avikchaudhuri
2025-08-28 09:46:42 +00:00
196232bb93 kill allow_complex_guards_as_runtime_asserts (#160198)
Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was a export-only concept).

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D79903317

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
2025-08-28 07:59:29 +00:00
12c0cf3fab switch prefer_deferred_runtime_asserts_over_guards in export (#160111)
Summary:
In preparation for checking shape guards in export, this PR effectively switches `prefer_deferred_runtime_asserts_over_guards` to `False`, matching Dynamo.

Actually that's a lie: we switch it to `allow_complex_guards_as_runtime_asserts`, which is `False` by default but can be controlled via an internally API to be `True`. This makes the two flags synchronized, so we should be able to kill `allow_complex_guards_as_runtime_asserts` at this point.

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D79734206

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160111
Approved by: https://github.com/tugsbayasgalan
2025-08-27 22:51:10 +00:00
4d078cfc4e [fx] Add is_fx_symbolic_tracing flag (#161385)
Fixes https://github.com/pytorch/pytorch/issues/135276

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161385
Approved by: https://github.com/pianpwk
2025-08-26 22:26:27 +00:00
ffa1ce7650 Fix the parity of original and exported module parameters (#160600)
## Problem
Fixing parameter mismatch issue during torch.export with strict mode (see "How to reproduce the issue" section below):

When there are two attribute mapping to the same tensor, the strict mode will
1. Have a standard param buffer table to standardize the name (bug happens [here](f861dc1826/torch/export/_trace.py (L356))! when 2 parameter have same id(param), the latter name will overwrite the previous name)
2. [Update](f861dc1826/torch/export/_trace.py (L1481)) exported signature with updated standard FQN (problematic)
3. When getting exported_program.module(), it will call [_unlift_exported_program_lifted_states](f861dc1826/torch/export/exported_program.py (L1297)) to recover attribute from exported signature where the parameter name is defined and standardized
Then the named_parameter of this module will have overwritten name instead of original name

## How to reproduce the issue?
reproduce issue shared by @taotaohuang001

torch version: 2.8.0
```python
import torch
from torch import nn

# ---- Toy model with embedding weight sharing (aliasing) ----
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding_layers = nn.ModuleDict()
        tbl = nn.Embedding(100, 8)
        self.embedding_layers["ActorId"] = tbl
        # Alias: reuse the SAME module instance for another feature
        self.embedding_layers["RootActorId"] = self.embedding_layers["ActorId"]
        self.proj = nn.Linear(16, 1)

    def forward(self, feats: dict[str, torch.Tensor]):
        e1 = self.embedding_layers["ActorId"](feats["ActorId"])
        e2 = self.embedding_layers["RootActorId"](feats["RootActorId"])
        return self.proj(torch.cat([e1, e2], dim=-1))

torch.manual_seed(0)

m = Toy().eval()

# Show pre-export parameter names (canonicalized; shared weight appears once)
print("PRE-EXPORT named_parameters:")
print([name for name, _ in m.named_parameters()])

# Sanity: the two feature names point to the same weight object
w1 = m.embedding_layers["ActorId"].weight
w2 = m.embedding_layers["RootActorId"].weight
print("PRE-EXPORT alias -> same object:", w1 is w2, "| same storage:", w1.data_ptr() == w2.data_ptr())

# Example inputs (dict structure will be captured by export)
ex_in = {
    "ActorId":     torch.randint(0, 100, (4,)),
    "RootActorId": torch.randint(0, 100, (4,)),
}

# ---- Export (in memory) and materialize the runnable module ----
ep = torch.export.export(m, (ex_in,), strict=True)
gm = ep.module()  # GraphModule with new (canonical) parameter names

print("\nPOST-EXPORT named_parameters (GraphModule):")
post_names = [name for name, _ in gm.named_parameters()]
print(post_names)

# Prove alias persists after export: run fwd/bwd and check a single grad tensor exists
out = gm(ex_in).sum()
out.backward()

# Find the embedding weight in the exported module by shape (100, 8)
emb_names = [name for name, p in gm.named_parameters() if p.shape == torch.Size([100, 8])]
print("\nEmbedding param (post-export) canonical name:", emb_names[0] if emb_names else "<not found>")

# Show that only one grad exists for the shared table
for name, p in gm.named_parameters():
    if p.grad is not None and p.shape == torch.Size([100, 8]):
        print("Grad present on shared embedding weight:", name, "| grad shape:", tuple(p.grad.shape))
        break

```

And you will see parameters are different before and after export
```
PRE-EXPORT named_parameters:
['embedding_layers.ActorId.weight', 'proj.weight', 'proj.bias']
PRE-EXPORT alias -> same object: True | same storage: True

POST-EXPORT named_parameters (GraphModule):
['embedding_layers.RootActorId.weight', 'proj.weight', 'proj.bias']

Embedding param (post-export) canonical name: embedding_layers.RootActorId.weight
Grad present on shared embedding weight: embedding_layers.RootActorId.weight | grad shape: (100, 8)

```
## Solution
Fixing this issue by making sure latter named parameter will not overwrite the `param_buffer_table` when original model's named parameter already maps to certain parameter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160600
Approved by: https://github.com/angelayi
2025-08-25 19:40:06 +00:00
3f1a97a99c Revert "[dynamic shapes] unbacked-safe slicing (#157944)"
This reverts commit 44549c7146bd6c4166f97e856037babe1b7f4f49.

Reverted https://github.com/pytorch/pytorch/pull/157944 on behalf of https://github.com/pianpwk due to this PR & internal diff landed out of sync, just reverted internal with D80720654, will revert this & reland as codev ([comment](https://github.com/pytorch/pytorch/pull/157944#issuecomment-3215610135))
2025-08-22 20:48:46 +00:00
981ac533c6 Revert "Close some sources of fake tensor leakages (#159923)"
This reverts commit 5afa4187dfe1e99278f8e372ec09102d5b937572.

Reverted https://github.com/pytorch/pytorch/pull/159923 on behalf of https://github.com/zou3519 due to broke aoti test in inductor periodic ([comment](https://github.com/pytorch/pytorch/pull/159923#issuecomment-3215580688))
2025-08-22 20:42:50 +00:00
44549c7146 [dynamic shapes] unbacked-safe slicing (#157944)
Generates new unbacked symbols for slice output size & storage offset, when appropriate semantics are unclear. Teaches inductor to codegen the slice with flexible semantics.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157944
Approved by: https://github.com/laithsakka
2025-08-20 22:52:56 +00:00
5afa4187df Close some sources of fake tensor leakages (#159923)
Differential Revision: D79694055

Couple of fixes:
1. When we run into an operation we didn't proxy, we end up emitting fake constants. We detect this and error using the FQN of the lifted constant
2. Previous attribute mutation detection logic in non-strict didn't account for nested module structure. This fixes silent incorrectness issue of exporting esm and qwen in non-strict
3. We modify yolov3 to fix the previous silent incorrect behaviour

When upgrading torchbench pin, opacus_cifar10 seems to not run on eager anymore. I verified this by pushing a temporary PR on master with new pin. So i added it to expect_fail list.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159923
Approved by: https://github.com/avikchaudhuri
2025-08-20 22:24:23 +00:00