Builds on top of https://github.com/pytorch/pytorch/pull/163673 and https://github.com/pytorch/pytorch/pull/164174. This will be used in follow-up PRs to apply regional Inductor compilation.
The existing implementation let Dynamo trace into `torch.fx.traceback.annotate`, but that's not what we want. We want Dynamo to essentially run the `torch.fx.traceback.annotate` function eagerly, so that every FX node created in the Dynamo FX graph carries the custom metadata.
What does not work?
* We still have to set the context manager `torch.fx.traceback.preserve_node_meta()` in user code because CI was unhappy otherwise. This can be fixed, but it will take some perseverance.
* This does not work with graph breaks yet. But we can solve that problem in a separate PR, if needed.
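A minimal usage sketch of the intended flow, based on the description above; the exact `annotate` payload (assumed here to be a dict of custom metadata) is illustrative:
```python
import torch
import torch.fx.traceback as fx_traceback

def fn(x):
    # Nodes created while this annotation is active should carry the custom
    # metadata in their node.meta (the dict payload is an assumed example).
    with fx_traceback.annotate({"stage": 0}):
        x = x + 1
    return x.sin()

# Per the note above, preserve_node_meta() still has to be set in user code.
with fx_traceback.preserve_node_meta():
    out = torch.compile(fn, backend="eager")(torch.randn(4))
```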
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164678
Approved by: https://github.com/SherlockNoMad, https://github.com/jansel, https://github.com/xmfan
Ideally we would like to avoid graph breaking even in the Dynamo case. But there is a strange, unsolved bug with Kineto + Dynamo in distributed jobs that leads to NCCL timeouts. This bug is a rare edge case, but we have not been able to root-cause it yet.
For export, however, we do not anticipate JIT tracing during distributed training, so this PR is safe for export.
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164418
Approved by: https://github.com/StrongerXi, https://github.com/williamwen42
As title. This is a follow-up to the previous patch, with the goal of supporting a new pattern that showed up in ComfyUI:
644b23ac0b/comfy/ops.py (L44)
Effectively, the semantics of calling a function decorated with a context manager are:
```python
@ctx_manager(args)
def f(x):
...
f(x)
# ----->
with ctx_manager(args):
f.__wrapped__(x)
```
Yes, that means a fresh context manager instance per invocation; see the CPython source:
https://github.com/python/cpython/blob/3.12/Lib/contextlib.py#L119-L122
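As a plain-Python illustration of that point (independent of Dynamo), a generator-based context manager used as a decorator enters and exits a fresh instance on every call:
```python
import contextlib

@contextlib.contextmanager
def ctx_manager(tag):
    print(f"enter {tag}")
    try:
        yield
    finally:
        print(f"exit {tag}")

@ctx_manager("decorated")
def f(x):
    return x + 1

f(1)  # enter/exit once
f(2)  # enter/exit again: a fresh context manager instance per invocation
```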
So Dynamo already
1. knows how to handle the `with ctx_manager(args)` syntax, and has special handling for a few torch-native context managers, like `sdpa_kernel` in this patch;
2. can trace through a good chunk of contextlib (at least the parts that matter in this case).
This patch just lets Dynamo trace a bit more of contextlib, and keeps the torch-native special cases by moving their handling a bit down the stack, so that no additional logic is introduced -- it's only refactored.
This also allows us to get rid of some `_sdpa_kernel_variadic` special handling, since we now trace through its code, and it boils down to `sdpa_kernel` anyway.
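Concretely, the kind of pattern this enables looks roughly like the sketch below (the real ComfyUI code differs; the backend choice here is illustrative):
```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

# A torch-native context manager (sdpa_kernel) used as a decorator; Dynamo now
# traces through the contextlib machinery down to the sdpa_kernel handling.
@sdpa_kernel(backends=[SDPBackend.MATH])
def attn(q, k, v):
    return F.scaled_dot_product_attention(q, k, v)

q = k = v = torch.randn(1, 1, 8, 16)
out = torch.compile(attn, backend="eager")(q, k, v)
```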
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160703
Approved by: https://github.com/guilhermeleobas, https://github.com/mlazos
ghstack dependencies: #160684
This patch fixes 2 issues, illustrated by the test cases added:
1. using `sdpa_kernel(backends=..., set_priority=...)` failed due to an internal assert that was not updated after #147768.
2. the `set_priority` VariableTracker was not converted back to a Python constant, so its value was not properly used by `sdpa_kernel`; this also dates back to #147768.
I ran into (1) because a recent ComfyUI update actually uses this pattern:
644b23ac0b/comfy/ops.py (L44),
and then noticed (2) and fixed it along the way.
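A hedged repro sketch of (1); the added test cases are the authoritative version, and backend availability depends on the build:
```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

@torch.compile(backend="eager")
def attn(q, k, v):
    # backends= combined with set_priority= previously tripped an internal
    # assert under Dynamo; set_priority treats the list as a priority order.
    with sdpa_kernel(backends=[SDPBackend.MATH, SDPBackend.EFFICIENT_ATTENTION],
                     set_priority=True):
        return F.scaled_dot_product_attention(q, k, v)

q = k = v = torch.randn(1, 1, 8, 16)
out = attn(q, k, v)
```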
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160684
Approved by: https://github.com/mlazos
Fixes #157452
Test with
```
python test/dynamo/test_repros.py ReproTests.test_nn_parameter_ctor_graph_breaks
```
### Release Notes
Change to nn.Parameter Constructor Behavior in Dynamo
This PR introduces a semantic change in how Dynamo handles the `nn.Parameter` constructor. Previously, if the constructor lacked a clean source, Dynamo would attempt to infer the arguments, construct a clone, and lift this synthetic proxy into the computation graph. That approach had many potential edge cases and was difficult to reason about. The new behavior defaults to graph breaking when the `nn.Parameter` constructor does not have a clean source; users are now advised to manually move the constructor out of the compiled region in such cases. This change improves clarity and reduces complexity in graph construction and debugging. If the constructor cannot be moved, users can fall back to the old semantics with `torch._dynamo.config.graph_break_on_nn_param_ctor = False`.
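A hedged sketch of the new default behavior (the test above is the authoritative repro):
```python
import torch
import torch.nn as nn

def fn(x):
    # An nn.Parameter constructed from an intermediate tensor has no clean
    # source, so Dynamo now graph-breaks here by default instead of lifting
    # a synthetic clone into the graph.
    p = nn.Parameter(x * 2)
    return p.sum()

out = torch.compile(fn, backend="eager")(torch.randn(3))

# Escape hatch back to the old lifting semantics, per the note above:
# torch._dynamo.config.graph_break_on_nn_param_ctor = False
```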
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158800
Approved by: https://github.com/anijain2305
This PR fixes a minor typo in a comment in `torch/_dynamo/variables/torch.py`, changing 'paramter' to the correct spelling 'parameter'.
This small but meaningful change helps improve code readability and maintain the overall quality of the codebase.
Thanks for your time and review!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157651
Approved by: https://github.com/Skylion007
Previously, when processing `sym_and(a, b, c)`, symbolic shapes wouldn't process a, b, and c individually and store their implications. This could lead to data-dependent errors on individual checks: e.g., we stored `u0 >= 0 & u0 <= 10` but then couldn't figure out `u0 <= 10` on its own.
This patch handles that, and also makes `sym_and`/`sym_or` user-code friendly, for testing.
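A hedged sketch of the user-code pattern this enables, assuming `sym_and` is importable from `torch.fx.experimental.symbolic_shapes`:
```python
import torch
from torch.fx.experimental.symbolic_shapes import sym_and

torch._dynamo.config.capture_scalar_outputs = True  # needed to trace .item()

@torch.compile(fullgraph=True)
def fn(x):
    u0 = x.item()
    # The combined check now also records the individual implications
    # u0 >= 0 and u0 <= 10, so a later query about a single conjunct
    # (e.g. u0 <= 10) no longer hits a data-dependent error.
    torch._check(sym_and(u0 >= 0, u0 <= 10))
    return torch.zeros(u0)

out = fn(torch.tensor(5))
```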
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154737
Approved by: https://github.com/laithsakka
Fixes #129673
### Summary:
Modifying a tensor with an in-place reshape (such as `unsqueeze_`) should cause a graph break; however, when the method was accessed through the `torch.Tensor` API rather than as an attribute on the tensor itself, the code crashed with an error instead (see the attached issue).
The traced paths differ because of the stack variable that gets popped:
* `self.unsqueeze_` pops a `LazyVariableTracker` that gets resolved to a `TensorVariable`, so the method lookup triggers `var_getattr` in `_dynamo/variables/tensor.py`; since this is an in-place view (metadata mutation) on a graph input, it is not well supported, so it should fall back (see [L446](1017927c83/torch/_dynamo/variables/tensor.py (L446)) in that file).
* `torch.Tensor.unsqueeze_` pops a `UserDefinedClassVariable`, so the method lookup triggers `var_getattr` in `_dynamo/variables/user_defined.py` on [L273](a8f6b40e36/torch/_dynamo/variables/user_defined.py (L273)). This path tries to build a variable tracker from the popped object, which resolves to a trace rule and, being a Tensor method, to a `TorchInGraphFunctionVariable` on [L3767](a8f6b40e36/torch/_dynamo/trace_rules.py (L3767)).
So one straightforward option is to check whether the function is an in-place view on an input tensor in `torch.py` when we resolve `__call__` for the `TorchInGraphFunctionVariable` instead, which resolves the bug by producing a graph break.
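A hedged repro sketch of the two access paths described above; with this change both now graph break cleanly instead of crashing:
```python
import torch

def via_attribute(x):
    x.unsqueeze_(0)                  # in-place view via attribute access on the tensor
    return x + 1

def via_tensor_api(x):
    torch.Tensor.unsqueeze_(x, 0)    # same metadata mutation via the torch.Tensor API
    return x + 1

torch.compile(via_attribute, backend="eager")(torch.randn(3))
torch.compile(via_tensor_api, backend="eager")(torch.randn(3))
```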
### Test
```
pytest test/dynamo/test_functions.py::FunctionTests::test_unsqueeze_inplace
```
Results in
```
Running 1 items in this shard
test/dynamo/test_functions.py . [100%]
=========================================================================================== 1 passed in 9.16s ==========================================================================================
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150573
Approved by: https://github.com/anijain2305
This patch effectively ignores traceable_tensor_subclasses, allowing Dynamo to always try tracing into the `__torch_function__` of a tensor subclass. This helps us with 2 things:
1. allowing users to directly benefit from better compilation of tensor subclasses by just upgrading PyTorch, without having to change legacy library code (see earlier patches in the stack for examples, and the sketch below);
2. potentially exposing more issues in compiling tensor subclasses, so we can get signals and improve support.
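A hedged sketch of point 1: a legacy-style subclass with a `__torch_function__` override compiles without being registered in traceable_tensor_subclasses (the subclass here is illustrative):
```python
import torch

class MyTensor(torch.Tensor):
    # Legacy-style subclass: overrides __torch_function__ but was never added
    # to traceable_tensor_subclasses; Dynamo now tries to trace into it anyway.
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        return super().__torch_function__(func, types, args, kwargs)

def fn(x):
    return torch.sin(x) + 1

x = torch.randn(4).as_subclass(MyTensor)
out = torch.compile(fn, backend="eager")(x)
```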
As a consequence, it exposed and fixes 2 subtle bugs:
1. In `build_torch_function_fn`, we could get `torch._C._disabled_torch_function_impl`, either because we have a `Parameter` subclass without a `__torch_function__` override or because we have a tensor subclass with a `__torch_dispatch__` override. We graph break on this for now and plan to add support later -- the logic for simulating `torch._C._disabled_torch_function_impl` is already in `SuperVariable`; we just need to reuse it.
2. Sometimes we create a `SyntheticLocalSource` and need to remove all the guards installed on it, but we only removed the ones whose source _is_ the created synthetic source `s`, and forgot about chained sources like `s.foo`; this showed up as `SYNTHETIC_LOCAL['tmp_0'].__torch_function__.__func__`.
Differential Revision: [D71906141](https://our.internmc.facebook.com/intern/diff/D71906141)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149792
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #149482, #149483, #149484