Commit Graph

710 Commits

Author SHA1 Message Date
649ceda8a5 [export] handling NamedTuple inputs (#162959)
Fixes #160547
### Summary:
bug
```
    def test_namedtuple(self):
        from collections import namedtuple
        Point = namedtuple('Point', 'x y')

        class M(torch.nn.Module):
            def forward(self, x, y):
                return x + y

        inp = Point(torch.ones(3), torch.ones(3))
        print(M()(*inp))

        # errors
        ep = torch.export.export(M(), inp, strict=False)
        print(ep)

        # succeeds
        ep = torch.export.export(M(), inp, strict=True)
        print(ep)

        # workaround could be to convert namedtuple to a kwarg
        inp_kwargs =  {field: getattr(inp, field) for field in inp._fields}
        ep = torch.export.export(M(), (), inp_kwargs)
        print(ep)
```
FIx :
namedtuple is subclass of tuple
but namedtuple is not expected
So, this change handles named tuple case

I have added 🧪 test case for this as well
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162959
Approved by: https://github.com/angelayi

Co-authored-by: Angela Yi <angelayi@meta.com>
2025-09-23 17:43:50 +00:00
720a7b2887 [export] Remove .contiguous() when saving weights to raw bytes (#163587)
Summary: `.contiguous()` will discard the original storage size of the tensor, and could lead to issues during loading.

Test Plan:
buck2 run mode/dev-nosan caffe2/test:test_export -- -r test_1D_tensor_slicing
buck2 run mode/dev-nosan caffe2/test:test_export -- -r test_2D_tensor_slicing

Differential Revision: D83016250

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163587
Approved by: https://github.com/angelayi
2025-09-23 15:44:56 +00:00
b756b580fb Improve fake tensor leakage detection in export by not relying on gc too much (#163516)
Previously we relied on gc to get the snapshot of fake tensors before and after export to get list of fake tensors that are created during export. This caused some flakiness in our test suite (https://github.com/pytorch/pytorch/issues/162232). it seems super hard to make gc deterministic, so we just instrument fake tensor creation which seems lot better. In addition, it is also quite faster than previous approach becuase we are no longer manually triggering garbage collector.

Differential Revision: [D82966648](https://our.internmc.facebook.com/intern/diff/D82966648)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163516
Approved by: https://github.com/ezyang
2025-09-22 22:04:24 +00:00
0e9f9c3a61 Fix inconsistent test and add new tracer as config (#162558)
It is better to have the new tracer as global config that can be manipulated easily. Also I believe dynamo-like config infra is useful instead of relying on custom way of patching stuff.

Differential Revision: [D82478649](https://our.internmc.facebook.com/intern/diff/D82478649)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162558
Approved by: https://github.com/zhxchen17
ghstack dependencies: #162557
2025-09-17 17:01:48 +00:00
489860f3c2 Prefer_deferred_runtime_asserts should be propagated to new tracer (#162556)
Differential Revision: [D82478650](https://our.internmc.facebook.com/intern/diff/D82478650)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162556
Approved by: https://github.com/avikchaudhuri
ghstack dependencies: #162487
2025-09-16 21:25:00 +00:00
a4e74f416b Fix error message (#162487)
More proper fix here should be that we directly replace shape_env with correct sources but it is bit involved as we have to manually construct dynamo sources by hand (need to handle list/dict etc) but it is quite easy if we are operating on a string so i do this as post-processing step for now.

Differential Revision: [D82478647](https://our.internmc.facebook.com/intern/diff/D82478647)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162487
Approved by: https://github.com/zhxchen17
2025-09-16 19:06:30 +00:00
543d50db2b Fix torch export with dict input nested in args (#162618)
Investigated together with @pyemma and @taotaohuang001

## Problem
when calling exported module with dict nested in the args tuple, it will make following complaits
```
Traceback (most recent call last):
  File "/home/chzhu/infinitrain/test_torch_export.py", line 32, in <module>
    print(exported_model({"a2": torch.randn(10), "a1": torch.randn(10)}))
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/fx/graph_module.py", line 848, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/fx/graph_module.py", line 424, in __call__
    raise e
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/fx/graph_module.py", line 411, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1879, in _call_impl
    return inner()
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1806, in inner
    args_kwargs_result = hook(self, args, kwargs)  # type: ignore[misc]
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 929, in _fn
    return fn(*args, **kwargs)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/export/_unlift.py", line 81, in _check_input_constraints_pre_hook
    flat_args_with_path = _check_inputs_match(args, kwargs, self._in_spec)
  File "/home/chzhu/infinitrain/build/infinitrain/environments/development-venv/lib/python3.10/site-packages/torch/export/_unlift.py", line 64, in _check_inputs_match
    raise ValueError(  # noqa: B904
ValueError: Trying to flatten user inputs with exported input tree spec:
TreeSpec(tuple, None, [TreeSpec(tuple, None, [TreeSpec(dict, ['a1', 'a2'], [*,
      *])]),
  TreeSpec(dict, [], [])])
but actually got inputs with tree spec of:
TreeSpec(tuple, None, [TreeSpec(tuple, None, [TreeSpec(dict, ['a2', 'a1'], [*,
      *])]),
  TreeSpec(dict, [], [])]).
Please check that the inputs have the same number and type of args and kwargs as the ones you used when tracing.

```

## How to reproduce the issue
```python
import torch

# create a nn.Module with data_batch as input and output as output
class MyModel(torch.nn.Module):
   def __init__(self):
       super(MyModel, self).__init__()
       self.linear = torch.nn.Linear(10, 1)

   def forward(self, data_batch):
       h1 = self.linear(data_batch["a1"])
       h2 = self.linear(data_batch["a2"])
       return h1 + h2

# torch export this module
model = MyModel()
example_args_forward = (
   {
       "a1": torch.randn(10),
       "a2": torch.randn(10),
   },
)
exported_model = torch.export.export(model, example_args_forward, strict=True)

# save the exported model
torch.export.save(exported_model, "exported_model.pt2")

# load the exported model
exported_model = torch.export.load("exported_model.pt2").module()

# run the exported model
print(exported_model({"a2": torch.randn(10), "a1": torch.randn(10)}))

```

## Root Cause
Input spec is encoded as [TreeSpec](582d278983/torch/utils/_pytree.py (L1059)) in torch export. With (args, kwargs) at the top level. When we call the exported model, it has a pre-execution [hook](582d278983/torch/export/_unlift.py (L66)) to check the input TreeSpec matches the received TreeSpec, where in Treespec, the dict key order is preserved. Something like

TreeSpec(dict, ['a2', 'a1'], [*,*])

To workaround this, the input check reorders [kwargs](582d278983/torch/export/_unlift.py (L67)), that is why kwargs can be out of order. But the dict nested in the args is not re-ordered, so any re-ordering of the keys will throw errors.

## Solution
Update eq_spec to handle the dict case, where we only guarantee that key set is the same without ordering constraints.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162618
Approved by: https://github.com/angelayi
2025-09-13 03:24:30 +00:00
501e19137a fix var args for shape guards (#162633)
Summary: Fixes #162599

Test Plan:
added test based on repro

Rollback Plan:

Differential Revision: D82144520

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162633
Approved by: https://github.com/tugsbayasgalan
2025-09-12 00:33:35 +00:00
463fbc8ca0 Support vmap + custom autograd function/improve DTensor constructor inefficiency (#162240)
This makes gemma3 exportable on transformers=4.55.4

In HF, there is a torch funciton mode called TransformGetItemToIndex which internally calls custom autograd function. When this custom autograd function is called under vmap, It triggers CustomFunctionHigherOrderOP which error-ed because there was no pre-dispatch proxy mode implementation.

Since there are number of requests lately to add various operators in pre-dispatch IR, I introduce a decorator in export that works similar to `allow_in_graph`. Basically:
1) We intercept custom_autograd_function.apply at pre-dispatch mode when this decorator is applied
2) We apply `flat_apply` HOP to hide the pytree spec for this autograd function. Note that this adds restriction that this custom autograd function needs to take in fx-able types.
3) subclass constructor decorator is implemented similarly, so we just refactor it to use similar implementation as this new decorator. eventually we should delete the subclass constructor decorator.
4) Move some code in subclass constructor decorator to exit early in non-export environment which should shave off some inefficiency (around 1% according to @swolchok 's benchmark)

Fixes: https://github.com/pytorch/pytorch/issues/161563#issuecomment-3246309758

Differential Revision: [D82141316](https://our.internmc.facebook.com/intern/diff/D82141316)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162240
Approved by: https://github.com/ydwu4
2025-09-11 17:42:41 +00:00
3f6d88f04c paths to exclude shape guards (#162684)
Summary: Easier to land than https://www.internalfb.com/diff/D82030581

Test Plan:
everything blamed by https://www.internalfb.com/diff/D80713603 (except some old exir tests)

Rollback Plan:

Differential Revision: D82180349

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162684
Approved by: https://github.com/tugsbayasgalan
2025-09-11 15:34:06 +00:00
22df9332da [serialization] Add pte file to archive (#162520)
Summary:
Add _package_executorch_files to archive apis. Allow us to package a PTE file into the archive.

I don't think there's a use-case to have more than one PTE file at the moment, but left it as `EXECUTORCH_FILES` just in case.

Test Plan:
Tested in D81992612

Rollback Plan:

Differential Revision: D81977483

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162520
Approved by: https://github.com/angelayi
2025-09-11 07:59:11 +00:00
6d65737aee testing infra and some fixes (#162183)
This PR is quite large in that it covers most of rough edges in the new strict export flow:

1. Handle nn_module_stack correctly now that we are tracing wrapper module
2. module_call_spec needs to get queried from source directly because we are not running the bytecode anymore.
3. Correct input and output handling.

@diff-train-skip-merge

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162183
Approved by: https://github.com/zhxchen17
2025-09-10 20:48:12 +00:00
60d009267e Revert "testing infra and some fixes (#162183)"
This reverts commit d8b6622bb6a3879d3832ab6cdc26ff4188ea4a2d.

Reverted https://github.com/pytorch/pytorch/pull/162183 on behalf of https://github.com/huydhn due to Failing a test on macos ([comment](https://github.com/pytorch/pytorch/pull/162183#issuecomment-3268922096))
2025-09-09 05:26:32 +00:00
d8b6622bb6 testing infra and some fixes (#162183)
This PR is quite large in that it covers most of rough edges in the new strict export flow:

1. Handle nn_module_stack correctly now that we are tracing wrapper module
2. module_call_spec needs to get queried from source directly because we are not running the bytecode anymore.
3. Correct input and output handling.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162183
Approved by: https://github.com/zhxchen17
ghstack dependencies: #162167
2025-09-09 02:42:11 +00:00
711c8c821e shape guards (#161178)
Summary: This PR introduces shape guards to export. Previously only value ranges,  equalities, and specializations would be tracked for symbolic expressions, and we had a forward hook to check them. Instead now we create a function to check shape guards and call it in the exported program.

Test Plan:
updated several tests

Rollback Plan:

Differential Revision: D80713603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161178
Approved by: https://github.com/tugsbayasgalan
2025-09-08 22:44:09 +00:00
5211f1f908 [export] Move example inputs in move_to_device_pass (#162301)
Summary:
If i have a EP that's exported on CPU and want to AOTI compile it for CUDA. I need to use `move_to_device_pass`.

But in `torch._inductor.aoti_compile_and_package()`, it directly uses the `example_inputs` attached to the EP, so we should move the example inputs as well if applicable.

Test Plan:
buck2 run mode/dev-nosan caffe2/test:test_export -- -r test_move_device_example_inputs

Rollback Plan:

Differential Revision: D81812366

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162301
Approved by: https://github.com/angelayi
2025-09-06 23:54:54 +00:00
09be1890d7 [export] Fix torch.export.load with storage offset (#162172)
Summary: As titled

Test Plan:
CI

Rollback Plan:

Differential Revision: D81687701

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162172
Approved by: https://github.com/angelayi
2025-09-04 22:50:33 +00:00
3c45af079a kill allow_complex_guards_as_runtime_asserts (#161794)
Summary:
[reland]
Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was a export-only concept).

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D81334984

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161794
Approved by: https://github.com/zhxchen17
2025-09-04 00:17:01 +00:00
c465b3d52c [2/n][export] Refactor PT2 Archive weight saving and loading (#161520)
Summary:
The saving (serialization) part of PT2 archive weight refactoring.
The loading (deserialization part) has been landed in D80035490

Test Plan:
CI

Rollback Plan:

bifferential Revision: D80970931

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161520
Approved by: https://github.com/SherlockNoMad
2025-09-03 20:12:49 +00:00
92576a594b Prototype for building non-strict leak detector (#160456)
Summary:
Our strategy for detecting fake tensor leakage in non-strict for outside scope (side effects happening outside of model.forward) is:
1. We do gc.collect() before export and get the alive fake tensors
2. We dump the proxy to fake tensor map from make_fx tracer
3. We query gc again to get alive fake tensors
4. We take the delta between (1) and (3)
5. Filter out fake tensors that are:
    1. Associated with `TrackedFake` (input tracking thing in symbolic_shapes)
    2. Associated with `gm.meta`
6. Do ID match with the proxies and emit their stacktraces.

We rely on (https://github.com/pytorch/pytorch/pull/159923) for other sources of leakages such as:
1. We failed to proxy an operator (like param.data)
2. We cache some tensor in model.forward (https://github.com/pytorch/pytorch/issues/155114)

In general, we notice `gc.collect()` and query-ing gc for live objects are kinda slow. So we turn on this feature under env variable. We should document on export public facing documents that if you run into weird errors regarding fake tensors, they should look into turning on this env variable for further analysis.

Test Plan:
Test plan

Rollback Plan:

Differential Revision: D80003204

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160456
Approved by: https://github.com/pianpwk
2025-09-03 19:21:27 +00:00
d153af713e [ez] Improve formatting in error messages for dynamic shapes (#161573)
Show the repr of `dim` to make the message more clear. Example: before `but got batch instead`, after `but got "batch" instead`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161573
Approved by: https://github.com/angelayi
2025-08-28 23:52:58 +00:00
9b67d8e344 Revert "[RELAND] Close some sources of fake tensor leakage (#161589)"
This reverts commit 5790b009751e6ebba35d3e6d05e7c1b135553eee.

Reverted https://github.com/pytorch/pytorch/pull/161589 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/17305150611/job/49128381649) [HUD commit link](5790b00975) ([comment](https://github.com/pytorch/pytorch/pull/161589#issuecomment-3235224249))
2025-08-28 23:19:36 +00:00
47742081c9 Revert "kill allow_complex_guards_as_runtime_asserts (#160198)"
This reverts commit 69d91b94ba5366f4444d8cb8fd3dab4de4f04d3d.

Reverted https://github.com/pytorch/pytorch/pull/160198 on behalf of https://github.com/jeffdaily due to let's revert again instead of waiting for forward fix, see earlier comments ([comment](https://github.com/pytorch/pytorch/pull/160198#issuecomment-3235165462))
2025-08-28 22:50:37 +00:00
69d91b94ba kill allow_complex_guards_as_runtime_asserts (#160198)
Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was a export-only concept).

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D79903317

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
2025-08-28 19:36:19 +00:00
a8270dd124 Revert "kill allow_complex_guards_as_runtime_asserts (#160198)"
This reverts commit 196232bb935cb346f143d5c39e9a73c44121a033.

Reverted https://github.com/pytorch/pytorch/pull/160198 on behalf of https://github.com/atalman due to dynamo/test_activation_checkpointing.py::ActivationCheckpointingViaTagsTestsCUDA::test_compile_selective_checkpoint_triton_kernel_cuda [GH job link](https://github.com/pytorch/pytorch/actions/runs/17289619543/job/49074475338) [HUD commit link](196232bb93) ([comment](https://github.com/pytorch/pytorch/pull/160198#issuecomment-3234013520))
2025-08-28 15:40:37 +00:00
5790b00975 [RELAND] Close some sources of fake tensor leakage (#161589)
Reland of https://github.com/pytorch/pytorch/pull/159923

Couple of fixes:
1. When we run into an operation we didn't proxy, we end up emitting fake constants. We detect this and warn using the FQN of the lifted constant. We warn because some internal users complained it was regressing their exportability.

2. Previous attribute mutation detection logic in non-strict didn't account for nested module structure. This fixes silent incorrectness issue of exporting esm and qwen in non-strict

3. We modify yolov3 to fix the previous silent incorrect behaviour
4. We use strict export for levit_128 because it errors in non-strict due to more strict side effect checking

When upgrading torchbench pin, opacus_cifar10 seems to not run on eager anymore. I verified this by pushing a temporary PR on master with new pin. So i added it to expect_fail list.

Differential Revision: [D81133908](https://our.internmc.facebook.com/intern/diff/D81133908)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161589
Approved by: https://github.com/avikchaudhuri
2025-08-28 09:46:42 +00:00
196232bb93 kill allow_complex_guards_as_runtime_asserts (#160198)
Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was a export-only concept).

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D79903317

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
2025-08-28 07:59:29 +00:00
12c0cf3fab switch prefer_deferred_runtime_asserts_over_guards in export (#160111)
Summary:
In preparation for checking shape guards in export, this PR effectively switches `prefer_deferred_runtime_asserts_over_guards` to `False`, matching Dynamo.

Actually that's a lie: we switch it to `allow_complex_guards_as_runtime_asserts`, which is `False` by default but can be controlled via an internally API to be `True`. This makes the two flags synchronized, so we should be able to kill `allow_complex_guards_as_runtime_asserts` at this point.

Test Plan:
updated tests

Rollback Plan:

Differential Revision: D79734206

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160111
Approved by: https://github.com/tugsbayasgalan
2025-08-27 22:51:10 +00:00
8ff9485815 [export] Update unflattening dynamo.disable (#161306)
Summary:
Doing inline disabling causes recompiles with the reason "Cache line
invalidated because L['___stack0'] got deallocated"

Test Plan:
CI

Rollback Plan:

Differential Revision: D80816956

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161306
Approved by: https://github.com/pianpwk
2025-08-27 00:27:16 +00:00
dbc903a94a [APS IR] Minfor fix - use GetAttrKey in get_keystr to match with flat args path in unflatten (#161453)
Summary: While passing path info to [_check_input_constraints_for_graph](https://www.internalfb.com/code/fbsource/[6b5b2dc35902a26ce265e3c0ae5189a3faba1d38]/fbcode/caffe2/torch/export/unflatten.py?lines=594), GetAttrKey is used to specify path str. To match with that get_keystr should also use GetAttrKey.

Test Plan:
Existing tests

```
buck run mode/opt caffe2/test:test_export -- -r unflatten
```

```
Ran 413 tests in 204.533s

OK (skipped=1, expected failures=13)
```

Rollback Plan:

Differential Revision: D80984083

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161453
Approved by: https://github.com/tugsbayasgalan
2025-08-27 00:05:20 +00:00
4d078cfc4e [fx] Add is_fx_symbolic_tracing flag (#161385)
Fixes https://github.com/pytorch/pytorch/issues/135276

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161385
Approved by: https://github.com/pianpwk
2025-08-26 22:26:27 +00:00
089ad1d88b [1/n][export] Refactor PT2 Archive weight saving and loading (#160394)
Summary:

We split the refactoring in two parts for forward compatibility concerns
First, we land the deserialization (loading part)
Then, we land the serialization (saving part)

Save weights and constants as individual files in PT2 archive. Each weight/constant will be saved as raw bytes, unless it is a custom object (TorchBind object) or a non-fake tensor subclass, for these two special cases we still save them using pickle.

The metadata of saved tensors along with the file name will be saved as `PayloadMeta`.
The mapping from FQN to `PayloadMeta` will be saved as `PayloadConfig` under `WEIGHTS_CONFIG_FORMAT` and `CONTANTS_CONFIG_FORMAT`

This changes the serialization in python side when calling `torch.export.save()`.

For deserialization in python `torch.export.load()`, we make it BC-safe by allowing loading legacy format weights/constants.

For deserialization in C++ `torch/nativert/ModelRunner.cpp`, we make this a BC breaking change as currently the OSS ModelRunner API is not being used.

The file structure

```
├── archive_format
├── archive_version
├── byteorder
├── .data
│   ├── serialization_id
│   └── version
├── data
│   ├── sample_inputs
│   │   └── model.pt
│   ├── constants
│   │   ├── tensor_0
│   │   ├── tensor_1
│   │   └── model_constants_config.json
│   └── weights
│       ├── weight_0
│       ├── weight_1
│       ├── weight_2
│       ├── weight_3
│       └── model_weights_config.json
└── models
    └── model.json
```

Test Plan:
CI

Rollback Plan:

Differential Revision: D80035490

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160394
Approved by: https://github.com/SherlockNoMad
2025-08-26 01:15:42 +00:00
ffa1ce7650 Fix the parity of original and exported module parameters (#160600)
## Problem
Fixing parameter mismatch issue during torch.export with strict mode (see "How to reproduce the issue" section below):

When there are two attribute mapping to the same tensor, the strict mode will
1. Have a standard param buffer table to standardize the name (bug happens [here](f861dc1826/torch/export/_trace.py (L356))! when 2 parameter have same id(param), the latter name will overwrite the previous name)
2. [Update](f861dc1826/torch/export/_trace.py (L1481)) exported signature with updated standard FQN (problematic)
3. When getting exported_program.module(), it will call [_unlift_exported_program_lifted_states](f861dc1826/torch/export/exported_program.py (L1297)) to recover attribute from exported signature where the parameter name is defined and standardized
Then the named_parameter of this module will have overwritten name instead of original name

## How to reproduce the issue?
reproduce issue shared by @taotaohuang001

torch version: 2.8.0
```python
import torch
from torch import nn

# ---- Toy model with embedding weight sharing (aliasing) ----
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding_layers = nn.ModuleDict()
        tbl = nn.Embedding(100, 8)
        self.embedding_layers["ActorId"] = tbl
        # Alias: reuse the SAME module instance for another feature
        self.embedding_layers["RootActorId"] = self.embedding_layers["ActorId"]
        self.proj = nn.Linear(16, 1)

    def forward(self, feats: dict[str, torch.Tensor]):
        e1 = self.embedding_layers["ActorId"](feats["ActorId"])
        e2 = self.embedding_layers["RootActorId"](feats["RootActorId"])
        return self.proj(torch.cat([e1, e2], dim=-1))

torch.manual_seed(0)

m = Toy().eval()

# Show pre-export parameter names (canonicalized; shared weight appears once)
print("PRE-EXPORT named_parameters:")
print([name for name, _ in m.named_parameters()])

# Sanity: the two feature names point to the same weight object
w1 = m.embedding_layers["ActorId"].weight
w2 = m.embedding_layers["RootActorId"].weight
print("PRE-EXPORT alias -> same object:", w1 is w2, "| same storage:", w1.data_ptr() == w2.data_ptr())

# Example inputs (dict structure will be captured by export)
ex_in = {
    "ActorId":     torch.randint(0, 100, (4,)),
    "RootActorId": torch.randint(0, 100, (4,)),
}

# ---- Export (in memory) and materialize the runnable module ----
ep = torch.export.export(m, (ex_in,), strict=True)
gm = ep.module()  # GraphModule with new (canonical) parameter names

print("\nPOST-EXPORT named_parameters (GraphModule):")
post_names = [name for name, _ in gm.named_parameters()]
print(post_names)

# Prove alias persists after export: run fwd/bwd and check a single grad tensor exists
out = gm(ex_in).sum()
out.backward()

# Find the embedding weight in the exported module by shape (100, 8)
emb_names = [name for name, p in gm.named_parameters() if p.shape == torch.Size([100, 8])]
print("\nEmbedding param (post-export) canonical name:", emb_names[0] if emb_names else "<not found>")

# Show that only one grad exists for the shared table
for name, p in gm.named_parameters():
    if p.grad is not None and p.shape == torch.Size([100, 8]):
        print("Grad present on shared embedding weight:", name, "| grad shape:", tuple(p.grad.shape))
        break

```

And you will see parameters are different before and after export
```
PRE-EXPORT named_parameters:
['embedding_layers.ActorId.weight', 'proj.weight', 'proj.bias']
PRE-EXPORT alias -> same object: True | same storage: True

POST-EXPORT named_parameters (GraphModule):
['embedding_layers.RootActorId.weight', 'proj.weight', 'proj.bias']

Embedding param (post-export) canonical name: embedding_layers.RootActorId.weight
Grad present on shared embedding weight: embedding_layers.RootActorId.weight | grad shape: (100, 8)

```
## Solution
Fixing this issue by making sure latter named parameter will not overwrite the `param_buffer_table` when original model's named parameter already maps to certain parameter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160600
Approved by: https://github.com/angelayi
2025-08-25 19:40:06 +00:00
cd31be28ec Reland D80238201: [Torch.Export] Add flat arg paths in error message (#160919)
Summary:
[The diff was reverted due to CLA error, in the process of retrieving account]
Previous error message
```
RuntimeError: Expected input at *args.<unknown location>.shape[0] to be equal to 4096, but got 7680. If you meant for this dimension to be dynamic, please re-export and specify dynamic_shapes (e.g. with Dim.DYNAMIC)
```
New error message
```
RuntimeError: Expected input at *args.[0].supervision_input.weight.shape[0] to be equal to 4096, but got 7680. If you meant for this dimension to be dynamic, please re-export and specify dynamic_shapes (e.g. with Dim.DYNAMIC)
```

Test Plan:
```
buck test mode/opt apf/rec/ir/tests:ir_export_deserialize_test
```
https://www.internalfb.com/intern/testinfra/testrun/4785074906254375

```
buck run mode/opt caffe2/test:test_export -- -r unflatten
```

```
Ran 413 tests in 208.414s

OK (skipped=1, expected failures=13)
```

Rollback Plan:

Differential Revision: D80487367

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160919
Approved by: https://github.com/angelayi
2025-08-23 07:20:58 +00:00
981ac533c6 Revert "Close some sources of fake tensor leakages (#159923)"
This reverts commit 5afa4187dfe1e99278f8e372ec09102d5b937572.

Reverted https://github.com/pytorch/pytorch/pull/159923 on behalf of https://github.com/zou3519 due to broke aoti test in inductor periodic ([comment](https://github.com/pytorch/pytorch/pull/159923#issuecomment-3215580688))
2025-08-22 20:42:50 +00:00
c60dea5261 [export] Allow tempfile._TemporaryFileWrapper in package_pt2 (#161203)
Summary:
We use tempfile.NamedTemporaryFile to create a temporary pt2 file in `test_nativert.py`

However, it is not recognized as an allowed file format and a warning will be thrown.

Test Plan:
CI

Rollback Plan:

Differential Revision: D80740916

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161203
Approved by: https://github.com/angelayi
2025-08-22 04:10:35 +00:00
5afa4187df Close some sources of fake tensor leakages (#159923)
Differential Revision: D79694055

Couple of fixes:
1. When we run into an operation we didn't proxy, we end up emitting fake constants. We detect this and error using the FQN of the lifted constant
2. Previous attribute mutation detection logic in non-strict didn't account for nested module structure. This fixes silent incorrectness issue of exporting esm and qwen in non-strict
3. We modify yolov3 to fix the previous silent incorrect behaviour

When upgrading torchbench pin, opacus_cifar10 seems to not run on eager anymore. I verified this by pushing a temporary PR on master with new pin. So i added it to expect_fail list.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159923
Approved by: https://github.com/avikchaudhuri
2025-08-20 22:24:23 +00:00
1091165826 [export] Update move_to_device_pass for to.device (#160528)
Differential Revision: D80135455

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160528
Approved by: https://github.com/yushangdi
2025-08-18 15:41:48 +00:00
4a773e1e86 Warn when there is side effect in strict mode (#160060)
Differential Revision: [D79784354](https://our.internmc.facebook.com/intern/diff/D79784354)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160060
Approved by: https://github.com/zhxchen17, https://github.com/StrongerXi
2025-08-14 14:59:44 +00:00
194fcfcfbd Add support for param mutation under inference mode (#159661)
Summary:
In HF model rwkv, we have parameter mutation under inference mode which should be safe. This PR does multiple things to make sure it works:
1. We execute global autograd mutation while tracing so that we can actually trace through parameter inplace mutation
2. Add support for parameter mutation under inference mode in AOTAutograd
3. Add support for parameter mutation under inference mode in export.

Test Plan:
test

Rollback Plan:

Differential Revision: D79460136

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159661
Approved by: https://github.com/ydwu4
2025-08-14 03:34:04 +00:00
11a3565f18 [Torch Native] Add test for packaging weight (#158750)
Add test that require weights to be packaged for torch native

For now, we need `package_weights_in_so=True` for compile standalone. The constants are in a `.o` file and will be added as a source to the CMakeLists.txt of the model.

After we added weight deduping, we should be able to let this config be False.

```
python test/inductor/test_aot_inductor_package.py  -k test_compile_with_exporter_weights
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158750
Approved by: https://github.com/desertfire
2025-08-09 01:04:21 +00:00
e619c6bb90 [export] Apply move_to_device_pass to all submodules (#159992)
Previously we only applied this move_to_device_pass to the toplevel graph. However if we have HOO, this pass will not be applied on the HOO submodules. This PR modifies the pass to run on all submodules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159992
Approved by: https://github.com/yiming0416
2025-08-07 18:51:15 +00:00
a4b07fe8f6 [AOTI] Add more default options to compile_standalone (#158560)
Summary: When compiling for standalone, make embed_kernel_binary and emit_multi_arch_kernel default to True, and add a default name for model_name_for_generated_files to make the generated cpp project easier to understand. Also improved the weights object file naming to be more readable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158560
Approved by: https://github.com/yushangdi
2025-08-06 15:59:27 +00:00
fb35a9ea4a [export] Improve error messages (#159881)
Originally, if the PT2 errored when loading, we would try to load using the old loader to fit BC issues. However this hides the error messages for if an up-to-date PT2 is erroring when loading due to some other reason.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159881
Approved by: https://github.com/yushangdi
2025-08-05 22:26:48 +00:00
dd95900cec [AOTI] normalize_path_separator file path for Windows. (#159726)
`normalize_path_separator` file path for Windows.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159726
Approved by: https://github.com/angelayi, https://github.com/jansel
2025-08-04 15:57:19 +00:00
e57a92734d [export] Fix nn_module_stack of assert_tensor_metadata nodes (#159625)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159625
Approved by: https://github.com/yushangdi
2025-08-02 02:52:42 +00:00
b1fb552974 Revert "Fix ep deepcopy when there is python builitin name (#159478)"
This reverts commit de7376537f2a11783169fee2b3bc276d266898bf.

Reverted https://github.com/pytorch/pytorch/pull/159478 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/159478#issuecomment-3141228423))
2025-07-31 20:20:53 +00:00
5b2ad9279c [draft export] logging (#159004)
Summary: adds logging for draft export

Test Plan:
loggercli stage actualize-stage TorchDraftExportUsageLoggerConfig

Rollback Plan:

Differential Revision: D78308105

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159004
Approved by: https://github.com/angelayi
2025-07-31 05:52:13 +00:00
de7376537f Fix ep deepcopy when there is python builitin name (#159478)
Summary: title

Test Plan:
CI

Rollback Plan:

Differential Revision: D79261007

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159478
Approved by: https://github.com/pianpwk
2025-07-30 23:14:31 +00:00
2b1ae29960 [Dynamo][Better Engineering] Add typing annotations to guard and source (#158397) (#159491)
Summary:
X-link: https://github.com/pytorch/executorch/pull/12986

As part of better engineering week, we would like to improve out type support to improve dev experience in dynamo

This PR adds strict typing support to a critical set of files for dynamo, `source.py` and the base `_guards.py`

Running
```
mypy torch/_dynamo/source.py torch/_guards.py --linecount-report /tmp/coverage_log
```

| -------- | Lines Unannotated | Lines Total | % lines covered | Funcs Unannotated | Funcs Total | % funcs covered |
| -------- | ------- | -------- | ------- | ------- | ------- | ------- |
| Main  |  1227 | 2208 | 55.57% | 207 | 362 | 57.18% |
| This PR | 2217 | 2217 | 100.00% | 362 | 362 | 100.00% |
| Delta    | +990 | +9 | +44.43% | +155 | 0 | +42.82% |

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 jerryzh168 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Rollback Plan:

Reviewed By: JacobSzwejbka, yangw-dev

Differential Revision: D79199389

Pulled By: Lucaskabela

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159491
Approved by: https://github.com/anijain2305, https://github.com/yangw-dev
2025-07-30 22:57:50 +00:00