Commit Graph

308 Commits

Author SHA1 Message Date
5ed4270440 remove more no longer needed torch._check_is_size calls 1 (#164630)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164630
Approved by: https://github.com/Skylion007
ghstack dependencies: #164627
2025-10-04 22:06:04 +00:00
02715d0876 [BE][5/6] fix typos in test/ (test/dynamo/) (#157639)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157639
Approved by: https://github.com/yewentao256, https://github.com/jansel
ghstack dependencies: #157638
2025-07-06 06:34:25 +00:00
48e7b62d3a [dynamo] Add immutable pytree to trace_rules (#156772)
Fixes https://github.com/pytorch/pytorch/issues/155426

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156772
Approved by: https://github.com/williamwen42
2025-06-25 20:08:47 +00:00
640f5a7090 [dynamo] Support builtin bool on non-constant VTs (#155863)
In practice `bool(...)` is either constant folded by Dynamo or used for
branching (so most of its emulation logic lived in
`InstructionTranslator.generic_jump`).

This patch adds a dedicated `bool` handler (only for symbolic
bool/int/float for now), and fixes #136075.
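
A minimal sketch of the newly supported pattern, assuming `torch.compile(dynamic=True)` so that sizes are symbolic:

```python
import torch

@torch.compile(fullgraph=True, dynamic=True)
def parity_offset(x):
    n = x.size(0)        # SymInt under dynamic=True
    flag = bool(n % 2)   # builtin bool on a symbolic int, now handled directly
    return x + 1 if flag else x - 1

print(parity_offset(torch.randn(5)))
```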

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155863
Approved by: https://github.com/williamwen42
2025-06-23 15:53:15 +00:00
463fe36532 fix error message on specialization with Dim.DYNAMIC (#155738)
Previously specialization error messages would render sources that were pretty far from source-code names. E.g., given args named `x, y, zs`, the source for `y.size()[0]` would be rendered as `args[0][1].size()[0]`.

This is because we created artificial local names following `(args, kwargs)` structure instead of reusing signatures. This PR fixes that situation.

Basically we map prefixes of key paths that correspond to original arg names to root sources corresponding to those names; the rest of the key paths hang from these root sources.
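
A hedged repro sketch of the kind of message this affects (assuming `Dim.DYNAMIC` raises when the marked dim specializes):

```python
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x, y):
        # reshape(4) forces y.size(0) to specialize to 4
        return x + y.reshape(4)

try:
    export(M(), (torch.randn(4), torch.randn(4)),
           dynamic_shapes={"x": {0: Dim.DYNAMIC}, "y": {0: Dim.DYNAMIC}})
except Exception as e:
    # with this fix the message should reference "y.size()[0]" rather than an
    # artificial source like "args[0][1].size()[0]"
    print(e)
```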

Differential Revision: [D76461391](https://our.internmc.facebook.com/intern/diff/D76461391/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155738
Approved by: https://github.com/bobrenjc93
2025-06-13 10:33:46 +00:00
68034198e5 [HOP] Mutation and alias rework (#146658)
This PR reworks the way the input mutations and various aliases are checked

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146658
Approved by: https://github.com/ydwu4
2025-05-18 08:05:22 +00:00
3fe42d4d5d [export] Dynamo symint support (#152677)
Basically adds native _IntWrapper support to dynamo. Here's my process of trying to make symint input support work on dynamo, and how I ended up with this approach [(doc)](https://docs.google.com/document/d/1GvNRQd8BnxlMay_hrEVgEta6VUeUW_hcFeRuB7q1nDY/edit?tab=t.0).

What I did was, before passing inputs to dynamo.export, I first wrap them with a class, `_IntWrapper`. When processing dynamic shapes, I will then add the corresponding dynamic shape specification to the `dynamism` field stored on the `_IntWrapper`. If there is no dynamism specified, then this will get unwrapped back to an integer. During dynamo tracing, when we encounter an `_IntWrapper`, we will convert this to a symint if the dynamism was specified as `Dim.DYNAMIC/AUTO`. Dynamo will then trace a graph that contains symint inputs, which will get passed to AOTAutograd and so on.
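
A minimal sketch of the wrapping idea described above (hypothetical class and field names, not the actual internals):

```python
import torch

class _IntWrapperSketch:
    """Carries a plain int plus the dynamism requested for it in dynamic_shapes."""
    def __init__(self, val: int):
        self.val = val
        self.dynamism = None  # e.g. Dim.DYNAMIC / Dim.AUTO, filled in from the spec

def wrap_int_inputs(args):
    # wrap plain ints before handing inputs to dynamo.export; tensors pass through
    return tuple(_IntWrapperSketch(a) if isinstance(a, int) else a for a in args)

print(wrap_int_inputs((torch.randn(2), 4)))
```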

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152677
Approved by: https://github.com/pianpwk
2025-05-16 07:51:50 +00:00
ceb009baee [map] always turn on dynamo for map (#152041)
Summary:
X-link: https://github.com/pytorch/executorch/pull/10409

Reland D72896450

Make map consistent with other control flow ops. After the change, map is able to support accessing closures in the map fn.
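
A small usage sketch of the behavior described, assuming the experimental import path `functorch.experimental.control_flow` for `map`:

```python
import torch
from functorch.experimental import control_flow  # assumed import path for map

def shifted_rows(xs, y):
    def body(x):
        # `y` is captured as a closure by the map body; with dynamo always on
        # for map, this is now supported
        return x + y
    return control_flow.map(body, xs)

print(shifted_rows(torch.randn(3, 4), torch.ones(4)))
```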

Test Plan: See existing tests.

Reviewed By: zou3519

Differential Revision: D73138427

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152041
Approved by: https://github.com/zou3519
2025-05-12 02:10:08 +00:00
e5ea7911ea [ez] Make relaxed constraint error message more user friendly (#151407)
Fixes #151356

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151407
Approved by: https://github.com/Skylion007
2025-04-30 03:51:50 +00:00
4504910843 Revert "[ez] Make relaxed constraint error message more user friendly (#151407)"
This reverts commit e0f05229e9ff84aa6138df2bd51f5044bc743afb.

Reverted https://github.com/pytorch/pytorch/pull/151407 on behalf of https://github.com/ZainRizvi due to Sorry but this is breaking internally (see D73198095). To validate your fixes internally, you can follow the instructions here: https://fburl.com/fixing-ghfirst-reverts. ([comment](https://github.com/pytorch/pytorch/pull/151407#issuecomment-2821819654))
2025-04-22 16:12:42 +00:00
e0f05229e9 [ez] Make relaxed constraint error message more user friendly (#151407)
Fixes #151356

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151407
Approved by: https://github.com/Skylion007
2025-04-17 06:43:10 +00:00
a582f04608 Revert "[ez] Make relaxed constraint error message more user friendly (#151407)"
This reverts commit bc934f57d7c14b07e7497eb72a90d893270bc662.

Reverted https://github.com/pytorch/pytorch/pull/151407 on behalf of https://github.com/izaitsevfb due to breaks export tests ([comment](https://github.com/pytorch/pytorch/pull/151407#issuecomment-2810716135))
2025-04-16 20:40:22 +00:00
bc934f57d7 [ez] Make relaxed constraint error message more user friendly (#151407)
Fixes #151356

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151407
Approved by: https://github.com/Skylion007
2025-04-16 17:00:06 +00:00
4a47dd9b3f Revert "[map] always turn on dynamo for map (#150962)"
This reverts commit a72d56cb6be8c6ded5678b0b98003c90fd1b5a71.

Reverted https://github.com/pytorch/pytorch/pull/150962 on behalf of https://github.com/Camyll due to breaking internal builds {SHORT_REASON} ([comment](https://github.com/pytorch/pytorch/pull/150962#issuecomment-2803006282))
2025-04-14 21:09:22 +00:00
a72d56cb6b [map] always turn on dynamo for map (#150962)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150962
Approved by: https://github.com/zou3519
2025-04-11 23:28:06 +00:00
f649ee73ce Use source hashing to generate consistent symbolic ids (#149665)
This PR was inspired by internal models that were cache missing due to PGO. At a high level, the problem looks as follows:

Run 1, Invocation 1: We do static compile, save some example values in PGO/automatic dynamic

Run 1, Invocation 2: We detect varying inputs, do dynamic compile, get a dynamic graph and save to PGO. Crucially, what we save to PGO is a superset of what is actually dynamic: if we notice an input was varying, we mark it as dynamic in PGO even if that value later gets specialized. When a value gets specialized, we remove its symbol from the graph. This results in an interesting conundrum where, although we produce the same isomorphic graph, PGO makes the second run cache miss. Let's see how.

Run 2, Invocation 1: We fetch the PGO, over-mark things as dynamic, get an fx graph, look it up in the cache and... whoops! cache miss! This is because of the aforementioned behavior where the PGO profile causes us to over-allocate symbols. In practice this means we end up saving a graph in the cache with symbols x:s1, y:s3, and on the second attempt we cache miss with x:s1, y:s6, where symbols s3, s4, s5 were all optimistically marked dynamic by PGO and subsequently specialized.

We solve this problem by hashing the source names, which gives a reasonably stable assignment. To prevent catastrophic symbol collisions, we use linear probing so that no two sources share a symbol.
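
A hedged, self-contained sketch of the assignment scheme described above (hypothetical helper, not the real implementation):

```python
from zlib import crc32

def assign_symbol_index(source_name: str, taken: set, table_size: int = 1 << 20) -> int:
    # crc32 rather than hash() so the starting index is stable across interpreter
    # runs; then linearly probe past indices already in use so sources never collide
    idx = crc32(source_name.encode()) % table_size
    while idx in taken:
        idx = (idx + 1) % table_size
    taken.add(idx)
    return idx

taken: set = set()
print(assign_symbol_index("L['x'].size()[0]", taken))
print(assign_symbol_index("L['y'].size()[0]", taken))
```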

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149665
Approved by: https://github.com/Mingming-Ding, https://github.com/laithsakka
2025-03-28 05:36:32 +00:00
af7719a2fa Revert "Use source hashing to generate consistent symbolic ids (#149665)"
This reverts commit 1f92348dc6c60e3020a723b37ecb8226cf2480c0.

Reverted https://github.com/pytorch/pytorch/pull/149665 on behalf of https://github.com/malfet due to Broke trunk, see 6eb3c2e282/1 ([comment](https://github.com/pytorch/pytorch/pull/149665#issuecomment-2758578187))
2025-03-27 16:02:27 +00:00
1f92348dc6 Use source hashing to generate consistent symbolic ids (#149665)
This PR was inspired by internal models that were cache missing due to PGO. At a high level, the problem looks as follows:

Run 1, Invocation 1: We do static compile, save some example values in PGO/automatic dynamic

Run 1, Invocation 2: We detect varying inputs, do dynamic compile, get a dynamic graph and save to PGO. Crucially, what we save to PGO is a superset of what is actually dynamic: if we notice an input was varying, we mark it as dynamic in PGO even if that value later gets specialized. When a value gets specialized, we remove its symbol from the graph. This results in an interesting conundrum where, although we produce the same isomorphic graph, PGO makes the second run cache miss. Let's see how.

Run 2, Invocation 1: We fetch the PGO, over-mark things as dynamic, get an fx graph, look it up in the cache and... whoops! cache miss! This is because of the aforementioned behavior where the PGO profile causes us to over-allocate symbols. In practice this means we end up saving a graph in the cache with symbols x:s1, y:s3, and on the second attempt we cache miss with x:s1, y:s6, where symbols s3, s4, s5 were all optimistically marked dynamic by PGO and subsequently specialized.

We solve this problem by hashing the source names, which gives a reasonably stable assignment. To prevent catastrophic symbol collisions, we use linear probing so that no two sources share a symbol.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149665
Approved by: https://github.com/Mingming-Ding, https://github.com/laithsakka
2025-03-27 03:39:27 +00:00
0a0a73a9a9 [cond] don't trace fw and bw graph in autograd key (#148930)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148930
Approved by: https://github.com/zou3519
2025-03-24 17:07:29 +00:00
34d726011f [dynamo] update data-dependent branching graph break messages (#147912)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147912
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: #147494, #147872
2025-02-28 12:30:06 +00:00
4caeede799 [dynamo] more better error messages [3/N] (#147494)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147494
Approved by: https://github.com/jansel, https://github.com/zou3519
2025-02-28 06:23:28 +00:00
824474cb35 [cond] support output sizes mismatch in front end (#147130)
This PR finishes https://github.com/pytorch/pytorch/pull/137615 by addressing the TODOs and comments left there.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147130
Approved by: https://github.com/zou3519
2025-02-25 20:28:41 +00:00
8af31e30d7 [Codemod][AddExplicitStrictExportArg] caffe2/torch (#146439)
Differential Revision: D69068432

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146439
Approved by: https://github.com/avikchaudhuri
2025-02-05 22:56:54 +00:00
99dbc5b0e2 PEP585 update - test (#145176)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145176
Approved by: https://github.com/bobrenjc93
2025-01-22 04:48:28 +00:00
972d4a154d Add facility to run dynamo UTs for non-cuda devices (#140929)
This is in line with the changes introduced in https://github.com/pytorch/pytorch/pull/130714; additional files are included to support non-CUDA devices.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140929
Approved by: https://github.com/kwen2501, https://github.com/EikanWang, https://github.com/guangyey
2025-01-20 05:56:38 +00:00
df458be4e5 [4/N] Apply py39 ruff and pyupgrade fixes (#143257)
`torch/fx/passes/annotate_getitem_nodes.py` was changed to support the new type hinting annotations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257
Approved by: https://github.com/justinchuby, https://github.com/albanD
2025-01-04 10:47:51 +00:00
5660709856 [hop][BE] unify meta checking with check_meta_consistency (#143545)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143545
Approved by: https://github.com/zou3519
ghstack dependencies: #143105
2025-01-03 19:01:07 +00:00
ba5cacbc17 [Codemod][AddExplicitStrictExportArg] caffe2/test (#143688)
Reviewed By: avikchaudhuri

Differential Revision: D67530154

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143688
Approved by: https://github.com/tugsbayasgalan
2024-12-27 07:58:44 +00:00
d25e6e623f Fix unused Python variables in test/[a-d]* (#134665)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134665
Approved by: https://github.com/albanD
2024-12-13 22:13:12 +00:00
b37cfddeb3 Refactor ShapeGuardPrinter for future C++ addition (#140968)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140968
Approved by: https://github.com/anijain2305
ghstack dependencies: #140597
2024-11-27 20:09:58 +00:00
44186a0a4e Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-26 18:11:00 +00:00
f23621ec56 Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597)"
This reverts commit c25b201583fc28243b87c460a2f18e2531a676e7.

Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Trunk is sad again after this lands, this looks like a landrace this time, so please do a rebase ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2494052978))
2024-11-22 15:43:39 +00:00
c25b201583 Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-22 02:04:36 +00:00
225d3f4495 [subclasses] Subclass parameterization to not wrap-unwrap on forward (#140632)
One of the common use cases of tensor subclasses is to replace all model Parameters with a subclass that provides an alternative implementation of common ops. E.g. quantization replaces weights with a QuantizedSubclass.

AOTAutograd lifts Parameters to graph arguments and, at runtime, wraps graph execution with wrapping/unwrapping of those subclasses.

Even though a single unwrap is not critically expensive (~14us), unwrapping/wrapping all linear weights can add substantially to runtime, potentially more than the compiled region's execution time. E.g. 20 weights * 14us ≈ 0.3ms.

This is the parametrization to unwrap tensor subclasses that is used in torch.ao: https://github.com/pytorch/ao/blob/main/torchao/utils.py#L294

It adds a parametrization that unwraps tensor subclasses into plain tensors. As a result, the registered parameters change (they all become plain tensors) and the state_dict is not compatible before/after the transformation.

This transformation runs before dynamo and is a breaking change, so we leave it for the user to apply explicitly.
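
Below is a toy sketch of the parametrization mechanism with a trivial single-tensor wrapper subclass (hypothetical names; the real helper in torchao additionally handles `__tensor_flatten__`/`__tensor_unflatten__` and multi-tensor subclasses):

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

class PlainWrapper(torch.Tensor):
    # trivial stand-in for a real tensor subclass
    pass

class UnwrapSubclass(nn.Module):
    def forward(self, inner: torch.Tensor) -> torch.Tensor:
        # rebuild the subclass from the plain tensor that is actually registered
        return inner.as_subclass(PlainWrapper)

    def right_inverse(self, value: torch.Tensor) -> torch.Tensor:
        # store only the plain tensor, so the state_dict holds plain tensors
        return value.as_subclass(torch.Tensor)

lin = nn.Linear(4, 4)
parametrize.register_parametrization(lin, "weight", UnwrapSubclass())
print(type(lin.weight))  # PlainWrapper at use sites
print(type(lin.state_dict()["parametrizations.weight.original"]))  # plain Parameter, not PlainWrapper
```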

Testing:
```
TORCH_LOGS="graph_code,aot" python test/functorch/test_aotdispatch.py -k test_subclass_parameters
```
```
TORCH_LOGS="graph_code,aot,export" python test/dynamo/test_export.py -k test_subclass_parameters
```

```
TRACED GRAPH
  ===== pre insert_deferred_runtime_asserts __compiled_fn_1 =====
  <eval_with_key>.0 class GraphModule(torch.nn.Module):
     def forward(self, L_self_modules_parametrizations_modules_p1_parameters_original0_: "f32[3, 4]", L_x_: "f32[3, 4]", L_self_modules_parametrizations_modules_p2_parameters_original0_: "f32[3, 4]", L_self_modules_parametrizations_modules_p2_parameters_original1_: "f32[3, 4]"):
         l_self_modules_parametrizations_modules_p1_parameters_original0_ = L_self_modules_parametrizations_modules_p1_parameters_original0_
         l_x_ = L_x_
         l_self_modules_parametrizations_modules_p2_parameters_original0_ = L_self_modules_parametrizations_modules_p2_parameters_original0_
         l_self_modules_parametrizations_modules_p2_parameters_original1_ = L_self_modules_parametrizations_modules_p2_parameters_original1_

          # File: /data/users/ivankobzarev/a/pytorch/torch/testing/_internal/subclasses.py:42 in __tensor_unflatten__, code: return WrapperSubclass(a, outer_size, outer_stride)
         rebuilt: "f32[3, 4]" = torch.testing._internal.subclasses.WrapperSubclass(l_self_modules_parametrizations_modules_p1_parameters_original0_, None, None);  l_self_modules_parametrizations_modules_p1_parameters_original0_ = None

          # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
         mul: "f32[3, 4]" = 2 * rebuilt;  rebuilt = None
         add: "f32[3, 4]" = l_x_ + mul;  l_x_ = mul = None

          # File: /data/users/ivankobzarev/a/pytorch/torch/testing/_internal/two_tensor.py:58 in __tensor_unflatten__, code: return TwoTensor(a, b, outer_size, outer_stride)
         rebuilt_1: "f32[3, 4]" = torch.testing._internal.two_tensor.TwoTensor(l_self_modules_parametrizations_modules_p2_parameters_original0_, l_self_modules_parametrizations_modules_p2_parameters_original1_, None, None);  l_self_modules_parametrizations_modules_p2_parameters_original0_ = l_self_modules_parametrizations_modules_p2_parameters_original1_ = None

          # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
         add_1: "f32[3, 4]" = add + rebuilt_1;  add = rebuilt_1 = None
         return (add_1,)

TRACED GRAPH
 ===== __compiled_fn_1 =====
 /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
  def forward(self, L_self_modules_parametrizations_modules_p1_parameters_original0_: "f32[3, 4][4, 1]cpu", L_x_: "f32[3, 4][4, 1]cpu", L_self_modules_parametrizations_modules_p2_parameters_original0_: "f32[3, 4][4, 1]cpu", L_self_modules_parametrizations_modules_p2_parameters_original1_: "f32[3, 4][4, 1]cpu"):
      l_self_modules_parametrizations_modules_p1_parameters_original0_ = L_self_modules_parametrizations_modules_p1_parameters_original0_
      l_x_ = L_x_
      l_self_modules_parametrizations_modules_p2_parameters_original0_ = L_self_modules_parametrizations_modules_p2_parameters_original0_
      l_self_modules_parametrizations_modules_p2_parameters_original1_ = L_self_modules_parametrizations_modules_p2_parameters_original1_

       # File: /data/users/ivankobzarev/a/pytorch/torch/testing/_internal/subclasses.py:42 in __tensor_unflatten__, code: return WrapperSubclass(a, outer_size, outer_stride)
      rebuilt: "f32[3, 4][4, 1]cpu" = torch.testing._internal.subclasses.WrapperSubclass(l_self_modules_parametrizations_modules_p1_parameters_original0_, None, None);  l_self_modules_parametrizations_modules_p1_parameters_original0_ = None

       # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
      mul: "f32[3, 4][4, 1]cpu" = 2 * rebuilt;  rebuilt = None
      add: "f32[3, 4][4, 1]cpu" = l_x_ + mul;  l_x_ = mul = None

       # File: /data/users/ivankobzarev/a/pytorch/torch/testing/_internal/two_tensor.py:58 in __tensor_unflatten__, code: return TwoTensor(a, b, outer_size, outer_stride)
      rebuilt_1: "f32[3, 4][4, 1]cpu" = torch.testing._internal.two_tensor.TwoTensor(l_self_modules_parametrizations_modules_p2_parameters_original0_, l_self_modules_parametrizations_modules_p2_parameters_original1_, None, None);  l_self_modules_parametrizations_modules_p2_parameters_original0_ = l_self_modules_parametrizations_modules_p2_parameters_original1_ = None

       # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
      add_1: "f32[3, 4][4, 1]cpu" = add + rebuilt_1;  add = rebuilt_1 = None
      return (add_1,)

.py:381] [0/0] [__aot_joint_graph] TRACED GRAPH
.py:381] [0/0] [__aot_joint_graph]  ===== Joint graph 0 =====
.py:381] [0/0] [__aot_joint_graph]  /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class joint_fn(torch.nn.Module):
.py:381] [0/0] [__aot_joint_graph]     def forward(self, primals, tangents):
.py:381] [0/0] [__aot_joint_graph]         primals_1: "f32[3, 4][4, 1]cpu"; primals_2: "f32[3, 4][4, 1]cpu"; primals_3: "f32[3, 4][4, 1]cpu"; primals_4: "f32[3, 4][4, 1]cpu"; tangents_1: "f32[3, 4][4, 1]cpu"; tangents_2: "f32[3, 4][4, 1]cpu";
.py:381] [0/0] [__aot_joint_graph]
.py:381] [0/0] [__aot_joint_graph]         primals_1, primals_2, primals_3, primals_4, tangents_1, tangents_2, = fx_pytree.tree_flatten_spec([primals, tangents], self._in_spec)
.py:381] [0/0] [__aot_joint_graph]          # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
.py:381] [0/0] [__aot_joint_graph]         mul: "f32[3, 4][4, 1]cpu" = torch.ops.aten.mul.Tensor(primals_1, 2);  primals_1 = None
.py:381] [0/0] [__aot_joint_graph]         add: "f32[3, 4][4, 1]cpu" = torch.ops.aten.add.Tensor(primals_2, mul);  primals_2 = mul = None
.py:381] [0/0] [__aot_joint_graph]         add_1: "f32[3, 4][4, 1]cpu" = torch.ops.aten.add.Tensor(add, primals_3);  primals_3 = None
.py:381] [0/0] [__aot_joint_graph]         add_2: "f32[3, 4][4, 1]cpu" = torch.ops.aten.add.Tensor(add, primals_4);  add = primals_4 = None
.py:381] [0/0] [__aot_joint_graph]         return pytree.tree_unflatten([add_1, add_2, None, None, None, None], self._out_spec)
.py:381] [0/0] [__aot_joint_graph]
.py:381] [0/0] [__aot_joint_graph]
graph_code] TRACED GRAPH
graph_code]  ===== tensorify_python_scalars =====
graph_code]  /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class joint_fn(torch.nn.Module):
graph_code]     def forward(self, primals, tangents):
graph_code]         primals_1: "f32[3, 4]"; primals_2: "f32[3, 4]"; primals_3: "f32[3, 4]"; primals_4: "f32[3, 4]"; tangents_1: "f32[3, 4]"; tangents_2: "f32[3, 4]";
graph_code]
graph_code]         primals_1, primals_2, primals_3, primals_4, tangents_1, tangents_2, = fx_pytree.tree_flatten_spec([primals, tangents], self._in_spec)
graph_code]          # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
graph_code]         mul: "f32[3, 4]" = torch.ops.aten.mul.Tensor(primals_1, 2);  primals_1 = None
graph_code]         add: "f32[3, 4]" = torch.ops.aten.add.Tensor(primals_2, mul);  primals_2 = mul = None
graph_code]         add_1: "f32[3, 4]" = torch.ops.aten.add.Tensor(add, primals_3);  primals_3 = None
graph_code]         add_2: "f32[3, 4]" = torch.ops.aten.add.Tensor(add, primals_4);  add = primals_4 = None
graph_code]         return pytree.tree_unflatten([add_1, add_2, None, None, None, None], self._out_spec)
graph_code]
graph_code]
.py:463] [0/0] [__aot_graphs] aot_config id: 0, fw_metadata=ViewAndMutationMeta(input_info=[InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=True, keep_input_mutations=True), InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=False, keep_input_mutations=True), InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=True, keep_input_mutations=True), InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=True, keep_input_mutations=True)], output_info=[OutputAliasInfo(output_type=<OutputType.non_alias: 1>, raw_type=<class 'torch.testing._internal.subclasses.WrapperSubclass'>, base_idx=None, dynamic_dims=set(), requires_grad=True, functional_tensor=None)], num_intermediate_bases=0, keep_input_mutations=True, traced_tangents=[WrapperSubclass(TwoTensor(FakeTensor(..., size=(3, 4)), FakeTensor(..., size=(3, 4))))], subclass_inp_meta=[PlainTensorMeta(unwrapped_idx=0, memory_format=None), PlainTensorMeta(unwrapped_idx=1, memory_format=None), PlainTensorMeta(unwrapped_idx=2, memory_format=None), PlainTensorMeta(unwrapped_idx=3, memory_format=None)], subclass_fw_graph_out_meta=[SubclassCreationMeta(flat_tensor_start_idx=0, arg_count=2, included_subclass_symints=True, attrs={'a': SubclassCreationMeta(flat_tensor_start_idx=0, arg_count=2, included_subclass_symints=True, attrs={'a': PlainTensorMeta(unwrapped_idx=1, memory_format=None), 'b': PlainTensorMeta(unwrapped_idx=2, memory_format=None)}, outer_size=torch.Size([3, 4]), outer_stride=(4, 1), meta=None, original_subclass=TwoTensor(FakeTensor(..., size=(3, 4)), FakeTensor(..., size=(3, 4))), original_subclass_type=None, memory_format=None)}, outer_size=torch.Size([3, 4]), outer_stride=(4, 1), meta=None, original_subclass=WrapperSubclass(TwoTensor(FakeTensor(..., size=(3, 4)), FakeTensor(..., size=(3, 4)))), original_subclass_type=None, memory_format=None)], subclass_tangent_meta=[SubclassCreationMeta(flat_tensor_start_idx=0, arg_count=2, included_subclass_symints=False, attrs={'a': SubclassCreationMeta(flat_tensor_start_idx=0, arg_count=2, included_subclass_symints=False, attrs={'a': PlainTensorMeta(unwrapped_idx=1, memory_format=torch.contiguous_format), 'b': PlainTensorMeta(unwrapped_idx=2, memory_format=torch.contiguous_format)}, outer_size=torch.Size([3, 4]), outer_stride=(4, 1), meta=None, original_subclass=TwoTensor(FakeTensor(..., size=(3, 4)), FakeTensor(..., size=(3, 4))), original_subclass_type=None, memory_format=torch.contiguous_format)}, outer_size=torch.Size([3, 4]), outer_stride=(4, 1), meta=None, original_subclass=WrapperSubclass(TwoTensor(FakeTensor(..., size=(3, 4)), FakeTensor(..., size=(3, 4)))), original_subclass_type=None, memory_format=torch.contiguous_format)], is_train=True, traced_tangent_metas=None, num_symints_saved_for_bw=0, grad_enabled_mutation=None, 
deterministic=False, static_input_indices=[0, 2, 3], tokens={}, indices_of_inputs_that_requires_grad_with_mutations_in_bw=[], bw_donated_idxs=[], num_backward_tokens=0), inner_meta=ViewAndMutationMeta(input_info=[InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=True, keep_input_mutations=True), InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=False, keep_input_mutations=True), InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=True, keep_input_mutations=True), InputAliasInfo(is_leaf=True, mutates_data=False, mutates_metadata=False, mutations_hidden_from_autograd=True, mutations_under_no_grad_or_inference_mode=False, mutation_inductor_storage_resize=False, mutates_storage_metadata=False, requires_grad=True, keep_input_mutations=True)], output_info=[OutputAliasInfo(output_type=<OutputType.non_alias: 1>, raw_type=<class 'torch._subclasses.functional_tensor.FunctionalTensor'>, base_idx=None, dynamic_dims=set(), requires_grad=False, functional_tensor=None), OutputAliasInfo(output_type=<OutputType.non_alias: 1>, raw_type=<class 'torch._subclasses.functional_tensor.FunctionalTensor'>, base_idx=None, dynamic_dims=set(), requires_grad=False, functional_tensor=None)], num_intermediate_bases=0, keep_input_mutations=True, traced_tangents=[], subclass_inp_meta=[PlainTensorMeta(unwrapped_idx=0, memory_format=None), PlainTensorMeta(unwrapped_idx=1, memory_format=None), PlainTensorMeta(unwrapped_idx=2, memory_format=None), PlainTensorMeta(unwrapped_idx=3, memory_format=None)], subclass_fw_graph_out_meta=[PlainTensorMeta(unwrapped_idx=0, memory_format=None), PlainTensorMeta(unwrapped_idx=1, memory_format=None)], subclass_tangent_meta=[], is_train=True, traced_tangent_metas=None, num_symints_saved_for_bw=0, grad_enabled_mutation=None, deterministic=None, static_input_indices=[0], tokens={}, indices_of_inputs_that_requires_grad_with_mutations_in_bw=[], bw_donated_idxs=[], num_backward_tokens=0)
.py:569] [0/0] [__aot_graphs] TRACED GRAPH
.py:569] [0/0] [__aot_graphs]  ===== Forward graph 0 =====
.py:569] [0/0] [__aot_graphs]  /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
.py:569] [0/0] [__aot_graphs]     def forward(self, primals_1: "f32[3, 4][4, 1]cpu", primals_2: "f32[3, 4][4, 1]cpu", primals_3: "f32[3, 4][4, 1]cpu", primals_4: "f32[3, 4][4, 1]cpu"):
.py:569] [0/0] [__aot_graphs]          # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6301 in forward, code: return x + 2 * self.p1 + self.p2
.py:569] [0/0] [__aot_graphs]         mul: "f32[3, 4][4, 1]cpu" = torch.ops.aten.mul.Tensor(primals_1, 2);  primals_1 = None
.py:569] [0/0] [__aot_graphs]         add: "f32[3, 4][4, 1]cpu" = torch.ops.aten.add.Tensor(primals_2, mul);  primals_2 = mul = None
.py:569] [0/0] [__aot_graphs]         add_1: "f32[3, 4][4, 1]cpu" = torch.ops.aten.add.Tensor(add, primals_3);  primals_3 = None
.py:569] [0/0] [__aot_graphs]         add_2: "f32[3, 4][4, 1]cpu" = torch.ops.aten.add.Tensor(add, primals_4);  add = primals_4 = None
.py:569] [0/0] [__aot_graphs]         return (add_1, add_2)
.py:569] [0/0] [__aot_graphs]
.py:569] [0/0] [__aot_graphs]
.py:580] [0/0] [__aot_graphs] TRACED GRAPH
.py:580] [0/0] [__aot_graphs]  ===== Backward graph 0 =====
.py:580] [0/0] [__aot_graphs]  /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
.py:580] [0/0] [__aot_graphs]     def forward(self, tangents_1: "f32[3, 4][4, 1]cpu", tangents_2: "f32[3, 4][4, 1]cpu"):
.py:580] [0/0] [__aot_graphs]         return (None, None, None, None)
.py:580] [0/0] [__aot_graphs]
.py:580] [0/0] [__aot_graphs]
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140632
Approved by: https://github.com/bdhirsh
2024-11-21 01:09:33 +00:00
701e06b643 Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597)"
This reverts commit aefcdb3c9fa787f9d43864f6f99a3590c914324a.

Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think it fails inductor/test_padding in trunk. This is a target determination miss and that failed test was not run in your PR ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2489641453))
2024-11-20 22:13:57 +00:00
aefcdb3c9f Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-20 20:26:49 +00:00
a1327fac45 [Dynamo] Replace torch._dynamo.optimize() with torch.compile() [5/N] (#140663)
related commits:

- #139706
- #140238
- #140247
- #140253
- #140663
- #140688

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140663
Approved by: https://github.com/williamwen42
2024-11-18 04:11:56 +00:00
1a2dc89f17 [Dynamo] Allow torch.cond() to handle empty arguments (#138190)
Fixes #138150

```python
import torch

@torch.compile(fullgraph=True)
def foo(x, y, z):
    def f():
        return y + 2

    def g():
        return z + 1

    return torch.cond(x, f, g)

print(foo(torch.zeros(1), torch.ones(1), torch.ones(1))) # tensor([2.])
print(foo(torch.ones(1), torch.ones(1), torch.ones(1))) # tensor([3.])
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138190
Approved by: https://github.com/ezyang, https://github.com/zou3519
2024-10-26 15:26:21 +00:00
3b2b5486ea Fixes issue with torch._dynamo.assume_constant_result with global functions (#132431)
This PR fixes an issue with `torch._dynamo.assume_constant_result` causing global values to be overwritten.
Currently, `torch._dynamo.assume_constant_result` saves the constant result into a global variable whose name is derived from the function's name, which can overwrite the function itself in the global scope. This PR ensures the generated name is unique in the global scope, avoiding overwriting the function.
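
A hedged illustration of the usage this protects (the decorator and call pattern are the public API; the exact failure depended on the generated global name):

```python
import torch

@torch._dynamo.assume_constant_result
def get_scale():
    return 2

@torch.compile(fullgraph=True)
def scaled(x):
    # the constant result is stashed in a generated global; the fix makes that
    # name unique so it can no longer shadow `get_scale` itself
    return x * get_scale()

print(scaled(torch.ones(3)))
print(get_scale())  # still the original function, not an overwritten constant
```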

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132431
Approved by: https://github.com/jansel
2024-10-22 21:14:26 +00:00
a7f49de485 Fixes issue with enums in a tuple for dynamo (#133123)
Currently, when tuple values are encountered in dynamo, they are encoded using `repr(arg)`. This causes an issue if one of the values inside the tuple cannot be properly encoded; in particular, if an enum is contained inside a tuple, invalid Python code is generated.
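
A hedged sketch of the input shape involved (an enum nested in a tuple passed to a compiled function); names here are illustrative:

```python
import enum
import torch

class Mode(enum.Enum):
    ADD = 0
    SUB = 1

@torch.compile
def apply(x, cfg):
    # cfg is a tuple containing an enum; encoding such tuples with repr() could
    # previously generate invalid Python during codegen
    mode, amount = cfg
    return x + amount if mode is Mode.ADD else x - amount

print(apply(torch.zeros(2), (Mode.ADD, 3)))
```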

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133123
Approved by: https://github.com/jansel
2024-10-21 23:45:11 +00:00
080f02ac7a [dynamo] do not raise an unimplemented error with boolean masking setitem (#134902)
Cudagraphs breaks on boolean masking setitem, but the code runs fine. There is no need to raise an unimplemented error here, since it already warns that it's an incompatible op.
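
A hedged sketch of the op in question (cudagraphs itself requires CUDA and `mode="reduce-overhead"`; this just shows the masking setitem shape):

```python
import torch

@torch.compile
def zero_negatives(x):
    mask = x < 0
    # boolean masking setitem: warned as cudagraph-incompatible,
    # but no longer raises an unimplemented error
    x[mask] = 0.0
    return x

print(zero_negatives(torch.randn(8)))
```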

Fixes #134241

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134902
Approved by: https://github.com/jansel, https://github.com/henrylhtsang
2024-10-10 19:11:40 +00:00
b897ab0540 [export] ignore mark_dynamic() in export (#135536)
Previously we were accommodating `torch._dynamo.mark_dynamic()` for export's dynamic shapes. Here we clean things up and ignore it, requiring users to specify dynamism through export's `dynamic_shapes` argument.

Note: there are 4 decorators relevant to export: `mark_dynamic`, `maybe_mark_dynamic`, `mark_static`, `mark_unbacked`. User calls that involve export have only been `mark_dynamic()`, and we use `maybe_mark_dynamic` under the hood for `Dim.AUTO`, but we could start using the others. One reason I decided not to warn and to just silently ignore is that these decorators cause the tensors to carry dynamic info, and it would be hard to tell whether the markers come from export or from user calls when re-exporting with the same inputs.
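
A short sketch of the resulting workflow (standard `torch.export` API):

```python
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x):
        return x * 2

x = torch.randn(4, 8)
torch._dynamo.mark_dynamic(x, 0)  # now ignored by export rather than honored
# dynamism must instead be declared through the dynamic_shapes argument
ep = export(M(), (x,), dynamic_shapes={"x": {0: Dim("batch")}})
print(ep)
```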

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135536
Approved by: https://github.com/avikchaudhuri
2024-09-12 21:22:19 +00:00
8ff3a5be1b [export] basic auto dynamic shapes (#133620)
Starter version of automatic dynamic shapes for export.

Creates enums `DIM.AUTO`, `DIM.STATIC`, allowing the user to specify `AUTO` for dims in `dynamic_shapes` specs, meaning that the corresponding dims are treated as dynamic, and the relevant guards will do what's necessary (e.g. refine ValueRanges, set replacements based on equality, or even set static) without raising ConstraintViolationErrors. Basically this allows the user to say, "a bunch of these dims can be dynamic; let export do model analysis and return the program with maximum possible dynamism, without complaining".

The usage for specifying `dynamic_shapes` is now:
```
AUTO -> dynamic by default, return whatever produce_guards() says, even if it's static
None/int/STATIC -> static
Dim/DerivedDim -> same as before - will complain if the min/max range is invalid, or if dims related to this are unspecified.
```

Caveat 1: specifying `AUTO` for a dim won't guarantee it'll be dynamic:

- specifying `AUTO` for a dim will return the maximum possible dynamism given your program and other specified constraints, but this can still mean you'll get a static program. For example, with the program below, x is specified dynamic, but it's equal to y, which is specified static, and with how we currently do things we won't promote y to dynamic, but will demote(?) x to static. So this can be surprising if you don't fully know your model, and/or missed one of your other inputs when specifying auto-dynamic shapes.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y
inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": None})
```

Caveat 2: specifying `AUTO` and Dims in the same spec is still problematic:

- The way Dims/DerivedDims are currently handled is very strict. A Dim represents a symbol, and we require a user to specify the symbol for all dims governed by the symbol - that's why we've seen errors in the past like `The values of x must always be related to y by ...`, asking the user to specify the exact relation as in the program. We also require the specified min/max range to be a subset of the valid range from model analysis. All this doesn't compose well with specifying `AUTO` just yet - for example in the program below, ideal behavior could be to return a dynamic program, where `dx = x.size(0) = y.size(0)` has range (3,6). Unfortunately this crashes, and correct behavior is to specify `dx` for both inputs. So currently we raise a UserError and crash if both Dims + `AUTO` are present in the spec.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y
inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": {0: Dim("dx", min=3, max=6)}})  # this doesn't work, because x & y are related
```

Implementation details:

This is done by setting `assume_static_by_default=False`, and doing a transform on the `dynamic_shapes` spec to preserve semantics. `assume_static_by_default=False` will treat unspecified dims or Nones as dynamic. This is the opposite of what `export.export()` currently does - unspecified Dims/Nones are treated as static. Historically this static-by-default behavior, where the user deals with fewer guards, has been desirable, and we would like to respect that in this implementation. So an internal spec transformation, `_transform_shapes_for_default_dynamic()`, is added to do the spec conversion necessary to be compatible with dynamic by default. Specifically, AUTOs are converted into Nones, and Nones/unspecified dims are filled in with explicitly static constraints.

For example, this would look like, for a 3-d tensor: `{0: DIM.AUTO, 1: None, 2: Dim("dx")} -> {0: None, 1: 32, 2: Dim("dx")}`
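
A hedged, self-contained sketch of that conversion (hypothetical helper name, not the actual `_transform_shapes_for_default_dynamic()`):

```python
import enum
from torch.export import Dim

class DIM(enum.Enum):  # stand-in for the enums introduced in this PR
    AUTO = enum.auto()
    STATIC = enum.auto()

def to_dynamic_by_default(spec: dict, shape: tuple) -> dict:
    out = {}
    for i, size in enumerate(shape):
        v = spec.get(i)
        if v is DIM.AUTO:                    # AUTO -> None, i.e. dynamic by default
            out[i] = None
        elif v is None or v is DIM.STATIC:   # unspecified/static -> pin the concrete size
            out[i] = size
        else:                                # Dim / DerivedDim pass through unchanged
            out[i] = v
    return out

# mirrors the example above: AUTO -> None, None -> 32, Dim("dx") unchanged
print(to_dynamic_by_default({0: DIM.AUTO, 1: None, 2: Dim("dx")}, (8, 32, 16)))
```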

This does seem overly complicated, but it's done to preserve dynamic shapes semantics for `torch._dynamo.export()`, which already uses `assume_static_by_default=False`, and follows the same process for generating shape constraints, via `_process_dynamic_shapes`. There the semantics are:
```
None/unspecified: dynamic by default
Dim/DerivedDim: also a strict assertion
```

If we don't care about BC for `_dynamo.export(dynamic_shapes)`, then we can just modify semantics for `_process_dynamic_shapes()` and change all the relevant tests in `test/dynamo/test_export.py`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133620
Approved by: https://github.com/avikchaudhuri
2024-08-23 22:56:39 +00:00
b454c51060 remove dynamic_dim (#134211)
Summary: As promised in https://github.com/pytorch/pytorch/pull/134045.

Test Plan: existing

Differential Revision: D61646937

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134211
Approved by: https://github.com/angelayi
2024-08-23 04:13:03 +00:00
0d7ac1966a kill sharing of constraints (#134045)
Summary:
Previously, reuse of the same `Dim` was encoded by "sharing" internal constraints among constraint targets. This kind of sharing, implemented using `shared` fields between `_Constraint`s, was originally motivated by `dynamic_dim`, specifically to support `==` between `dynamic_dim`s, but we no longer need to maintain this overcomplicated structure: we can simply use names of `Dims` to directly encode sharing information.

Thus this PR vastly simplifies the structure of `_Constraint` by removing `shared` fields. As a result, both `_Constraint` and its moral subclass, `_DerivedConstraint`, are 1-1 with `Dim` and its moral subclass, `DerivedDim`.
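
A short example of the simplification from the user's side (standard export API): reusing a `Dim` by name now directly encodes equality, with no shared-constraint bookkeeping and no `dynamic_dim(...) == dynamic_dim(...)`:

```python
import torch
from torch.export import Dim, export

class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y

d = Dim("d")  # one Dim reused for both inputs encodes x.size(0) == y.size(0)
ep = export(Add(), (torch.randn(4), torch.randn(4)),
            dynamic_shapes={"x": {0: d}, "y": {0: d}})
print(ep)
```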

Note that this will break `==` over `dynamic_dim`, so an immediate follow-up will be to remove `dynamic_dim` entirely from our public API. (It's been more than 6 months since the deprecation warning anyway.) I just didn't want to deal with that process in the same PR.

Test Plan: existing

Differential Revision: D61559413

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134045
Approved by: https://github.com/pianpwk
2024-08-22 04:40:47 +00:00
0d4eacb9d2 [fake tensor] unbacked symint support for binary op fast path (#133584)
Addresses https://github.com/pytorch/pytorch/issues/133525

We have an unbacked symint in `final_shape` and it's a tuple... So, add `guard_size_oblivious` to do size oblivious checks + `sym_eq` for list equality.

```
op.shape
> torch.Size([1])
final_shape
> (u0 + 1,)
```
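
A minimal sketch of the check this enables (both helpers are `torch.fx.experimental.symbolic_shapes` utilities; the wrapper function is illustrative):

```python
import torch
from torch.fx.experimental.symbolic_shapes import guard_size_oblivious, sym_eq

def shapes_match(a, b) -> bool:
    # compare element-wise with sym_eq and guard size-obliviously, so an
    # unbacked size like u0 + 1 does not need a concrete-value guard
    return guard_size_oblivious(sym_eq(tuple(a), tuple(b)))

print(shapes_match(torch.Size([1]), (1,)))
```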

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133584
Approved by: https://github.com/ezyang
2024-08-19 20:03:05 +00:00
a75248528f [export] refactor _process_dynamic_shapes (#133391)
Sorryyyyy for another refactor. This splits `_process_dynamic_shapes` into 3 parts:
1. `_combine_args` - mostly the same thing
2. `_check_dynamic_shapes`, which is responsible for raising 99% of UserErrors if the dynamic shapes spec is invalid (minus 1 UserError with DerivedDims)
3.  `_process_dynamic_shapes`, which for now, is the same thing, minus the stuff in 2.

This refactor is helpful for incoming automatic dynamic shapes work, because we're switching to `assume_static_by_default=False`, which is what `_dynamo.export` currently does. This means any unspecified dims are allocated a symbol, in contrast to export today, which keeps unspecified dims static. Historically this has been desirable - export users don't want too much dynamism. So we want to change how the spec is translated into constraints.

This means when we switch over to automatic dynamic shapes, we want to plug in something in between steps 2. and 3. which patches up the spec for `assume_static_by_default=False`, filling in static shapes for any unspecified dims, and potentially clearing out the auto-dynamic dims (since they're no-ops). We would do this in-between 2. and 3. to keep `_process_dynamic_shapes` semantically the same, since it's used with `_dynamo.export`.

We could do this without a refactor, plugging in this transform before `_process_dynamic_shapes`, but since that function is responsible for both spec checking and constraint production, moving spec checking to before we transform the specs helps guarantee we're raising errors on what the user specified, and not on an internal export bug.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133391
Approved by: https://github.com/avikchaudhuri
2024-08-15 16:21:21 +00:00
aec6332356 Only thunkify proxies in some situations (#132421)
The goal of this PR is to avoid stack overflow when we create extremely long chains of thunks, and then evaluate them (e.g., as occurs if you sum(long list of symint)). The basic idea behind this PR is to only thunkify proxies if they're being created in places where they may or may not be used--crucially, symint operations that occur in user code we are tracing are eagerly placed into the graph, even if they may eventually be dead.
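
A small repro-style sketch of the pattern described (list length and shapes are illustrative):

```python
import torch

@torch.compile(dynamic=True)
def total_rows(tensors):
    # summing many SymInts used to build a long chain of lazy thunks that could
    # blow the stack on evaluation; the size ops now go into the graph eagerly
    total = 0
    for t in tensors:
        total = total + t.size(0)
    return torch.zeros(total)

out = total_rows([torch.randn(i + 1, 3) for i in range(32)])
print(out.shape)
```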

I annotated the PR with explanation of changes.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132421
Approved by: https://github.com/Skylion007, https://github.com/zou3519
ghstack dependencies: #132674, #132675
2024-08-08 12:03:06 +00:00
780310fed7 Revert "Only thunkify proxies in some situations (#132421)"
This reverts commit bb99008c9e7c357b88047bcd6971dc2078341484.

Reverted https://github.com/pytorch/pytorch/pull/132421 on behalf of https://github.com/clee2000 due to I think this broke dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input [GH job link](https://github.com/pytorch/pytorch/actions/runs/10283744685/job/28459340678) [HUD commit link](bb99008c9e).  Test got added in f50621989b which is before your merge base ([comment](https://github.com/pytorch/pytorch/pull/132421#issuecomment-2273742960))
2024-08-07 15:29:54 +00:00
bb99008c9e Only thunkify proxies in some situations (#132421)
The goal of this PR is to avoid stack overflow when we create extremely long chains of thunks, and then evaluate them (e.g., as occurs if you sum(long list of symint)). The basic idea behind this PR is to only thunkify proxies if they're being created in places where they may or may not be used--crucially, symint operations that occur in user code we are tracing are eagerly placed into the graph, even if they may eventually be dead.

I annotated the PR with explanation of changes.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132421
Approved by: https://github.com/Skylion007, https://github.com/zou3519
ghstack dependencies: #132674, #132675
2024-08-07 11:51:17 +00:00