As title. This is a follow-up to the previous patch, with the goal of
supporting a new pattern that showed up in ComfyUI:
644b23ac0b/comfy/ops.py (L44)
Effectively, calling a function decorated with a context manager has
these semantics:
```python
@ctx_manager(args)
def f(x):
...
f(x)
# ----->
with ctx_manager(args):
f.__wrapped__(x)
```
Yes, a fresh context manager instance is created per invocation; see the CPython source:
https://github.com/python/cpython/blob/3.12/Lib/contextlib.py#L119-L122
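For illustration, here is a minimal standalone example (not the ComfyUI code) showing that the decorated function re-enters a fresh context on every call, which is the behavior Dynamo has to trace through:
```python
import contextlib

@contextlib.contextmanager
def ctx_manager(tag):
    # Each call to the decorated function re-creates this context manager.
    print(f"enter {tag}")
    try:
        yield
    finally:
        print(f"exit {tag}")

@ctx_manager("demo")
def f(x):
    return x + 1

f(1)  # prints "enter demo" / "exit demo"
f(2)  # prints again: a fresh context manager instance per invocation
```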
So Dynamo already
1. knows how to handle the `with ctx_manager(args)` syntax, and has
special handling for a few torch-native context managers, such as
`sdpa_kernel` in this patch.
2. can trace through a good chunk of contextlib (at least the parts that
matter in this case).
This patch just lets Dynamo trace a bit further into contextlib, and
keeps the torch-native special cases by moving their handling a bit
further down the stack, so that no additional logic is introduced -- it's
only refactored.
This also allows us to get rid of some `_sdpa_kernel_variadic` special
handling, since we now trace through its code, and it boils down to
`sdpa_kernel` anyway.
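As a hedged sketch (not the PR's actual test), the pattern this enables looks roughly like:
```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Using the torch-native context manager directly as a decorator;
# Dynamo now traces through the contextlib machinery this relies on.
@sdpa_kernel(backends=[SDPBackend.MATH])
def attn(q, k, v):
    return F.scaled_dot_product_attention(q, k, v)

compiled_attn = torch.compile(attn, fullgraph=True)
q = k = v = torch.randn(1, 4, 8, 16)
out = compiled_attn(q, k, v)
```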
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160703
Approved by: https://github.com/guilhermeleobas, https://github.com/mlazos
ghstack dependencies: #160684
Before:
```
.Observed exception
Explanation: Dynamo found no exception handler at the top-level compiled function when encountering an exception. Exception will propagate outside the compiled region.
Hint: Dynamo has detected that tracing the code will result in an error when running in eager. Please double check that your code doesn't contain a similar error when actually running eager/uncompiled.
Hint: It may be possible to write Dynamo tracing rules for this code. Please report an issue to PyTorch if you encounter this graph break often and it is causing performance issues.
Developer debug context:
```
After:
```
Observed exception
Explanation: Dynamo found no exception handler at the top-level compiled function when encountering an exception. Exception will propagate outside the compiled region.
Hint: Dynamo has detected that tracing the code will result in an error when running in eager. Please double check that your code doesn't contain a similar error when actually running eager/uncompiled.
Hint: It may be possible to write Dynamo tracing rules for this code. Please report an issue to PyTorch if you encounter this graph break often and it is causing performance issues.
Developer debug context: raised exception TypeError([ConstantVariable(str: "unhashable type: <class 'torch._dynamo.variables.dicts.SetVariable'>")])
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158924
Approved by: https://github.com/williamwen42, https://github.com/zou3519
Summary:
This PR adds support for torch.cuda.FloatTensor and friends in Dynamo.
These are indeed legacy APIs, but that doesn't stop us from adding
support for them in torch.compile.
I add support for these in the same way that we support torch.Tensor:
these APIs can be safely put into the Dynamo graph.
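For example (a hedged sketch, not the PR's actual test), the following can now be captured in the graph:
```python
import torch

@torch.compile(fullgraph=True)
def f(x):
    # Legacy constructor; now allowed inside the Dynamo graph.
    return x + torch.cuda.FloatTensor([1.0, 2.0, 3.0])

if torch.cuda.is_available():
    out = f(torch.ones(3, device="cuda"))
```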
Fixes #130722
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156107
Approved by: https://github.com/williamwen42
Basically adds native `_IntWrapper` support to Dynamo. Here's my process for making symint input support work in Dynamo, and how I ended up with this approach [(doc)](https://docs.google.com/document/d/1GvNRQd8BnxlMay_hrEVgEta6VUeUW_hcFeRuB7q1nDY/edit?tab=t.0).
What I did was: before passing inputs to dynamo.export, I first wrap integer inputs with a class, `_IntWrapper`. When processing dynamic shapes, I then add the corresponding dynamic shape specification to the `dynamism` field stored on the `_IntWrapper`. If no dynamism is specified, the wrapper gets unwrapped back to an integer. During Dynamo tracing, when we encounter an `_IntWrapper`, we convert it to a symint if the dynamism was specified as `Dim.DYNAMIC/AUTO`. Dynamo will then trace a graph that contains symint inputs, which gets passed to AOTAutograd and so on.
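A hedged, simplified sketch of the flow described above (the real `_IntWrapper` lives in torch's export internals and may differ in detail):
```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class _IntWrapper:
    # Plain int input, wrapped before it reaches dynamo.export.
    val: int
    # Filled in from the user's dynamic_shapes spec (e.g. Dim.DYNAMIC/AUTO);
    # None means "static", and the wrapper is unwrapped back to an int.
    dynamism: Optional[Any] = None

def wrap_int_args(args: tuple) -> tuple:
    # Wrap integer inputs so a dynamic-shape spec can later be attached to
    # them by position; during tracing, wrappers with dynamism set are
    # converted into symints.
    return tuple(_IntWrapper(a) if isinstance(a, int) else a for a in args)
```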
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152677
Approved by: https://github.com/pianpwk
This relaxes the guard introduced in #100444 (which aggressively guards
on the object id, even though Dynamo only traces its `__call__` method).
This allows users to bypass the high compilation time issue in #150706
by compiling transformer blocks only. Without this patch, we'd get lots
of unnecessary recompilations, as the blocks have different attention
processor instances.
Compiling blocks only _significantly_ speeds up the compilation process
(from ~310s to ~32s), and even speeds up e2e performance for some reason
(7.83s to 7.67s).
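For reference, the workaround amounts to something like the following (a hedged sketch; `transformer_blocks` is an assumed attribute name, e.g. for a diffusers-style transformer model):
```python
import torch

def compile_blocks_only(model: torch.nn.Module) -> torch.nn.Module:
    # Compile each transformer block instead of the whole model, so the
    # per-block attention-processor instances don't force whole-model
    # recompilation.
    for block in model.transformer_blocks:  # assumed attribute name
        block.compile()  # in-place torch.compile of the submodule
    return model
```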
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152395
Approved by: https://github.com/anijain2305
ghstack dependencies: #152369
This fixes most of the "torch.compile X tensor-subclass" issues
encountered in https://github.com/city96/ComfyUI-GGUF/issues/118. The
relevant tensor subclass definition is here:
298192ed60/ops.py (L18-L65).
A few things to note about the tensor subclass:
1. it overrides a lot of the `torch.Tensor` methods (e.g., `to`,
`clone`), so this patch updates `TensorWithTFOverrideVariable.var_getattr`
to support that.
2. it overrides the `shape` property, so this patch updates
`TensorWithTFOverrideVariable.var_getattr` to support properties as well.
3. it has calls to `torch.Tensor.size`, which returns `torch.Size`,
which gets reconstructed in `torch.Tensor.__torch_function__`, so
this patch adds support for calling `torch.Size(...)` on non-constant
inputs.
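Putting the three points together, here is a minimal hedged sketch of such a subclass (illustrative only, not the actual ComfyUI-GGUF code):
```python
import torch

class QuantizedTensor(torch.Tensor):
    # (1) overrides torch.Tensor methods such as `to`
    def to(self, *args, **kwargs):
        return super().to(*args, **kwargs)

    # (2) overrides the `shape` property
    @property
    def shape(self):
        # (3) calls torch.Tensor.size(), whose torch.Size result gets
        # reconstructed under __torch_function__
        return torch.Size(super().size())

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        return super().__torch_function__(func, types, args, kwargs)

x = torch.randn(2, 3).as_subclass(QuantizedTensor)
print(x.shape)  # torch.Size([2, 3])
```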
Differential Revision: [D71906137](https://our.internmc.facebook.com/intern/diff/D71906137)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149484
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #149482, #149483