Lazos correctly pointed out this doesn't make sense for compile since
we graph break in compile. This results in tons of unwanted user log
spew. We do want this in export though since it's drastiaclly reduced
the support load for DDEs. This PR does the refactor to keep it in
export but remove it from compile
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149831
Approved by: https://github.com/mlazos
This gives us a decent proxy for how big of a graph we functionally had to parse.
Note that this is a cummulative counter. If people feel strongly, I can either write into the dynamo_timed datasets with metrics contexts, or clear the counters / write a counter per frame id as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147149
Approved by: https://github.com/jansel
Also show the line of code relevant to a dynamo-compiled frame, instead of just the first line (this was broken for data-dependent jump graph breaks and for 3.11+).
Also collapses resume frames together (use config.verbose to see full stack trace - for developers).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148401
Approved by: https://github.com/zou3519, https://github.com/jansel
Recently I've been experimenting with introducing new APIs to delay compile as a way to reduce compile times while improving the ergonomics of using dynamic shapes. The high level idea is to run the first invocation of compile in eager, save the example inputs, and on the second invocation we can derive the dynamism in the inputs so that we don't need to waste our time doing a compile with static shapes (which is the status quo today with automatic dynamic).
Another benefit of this is most users no longer need to annotate their inputs with mark_dynamic and mark_unbaked calls since we can derive the dynamism on the very first call. Additionally we get dynamic ints out of the box in this new regime.
This PR implements this idea through the set_stance APIs. In particular it introduces a new `eager_then_compile` stance.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147983
Approved by: https://github.com/williamwen42
This is for "for some large number Z, make sure the error messages are readable English." - beginning to audit all `unimplemented` sites and making sure that all messages are at least English-readable. Hints may not necessarily be provided.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147385
Approved by: https://github.com/jansel
Earlier if there were no ops in the graph, fullgraph=True will also fallback to eager. This hides issues in testing, where we silently fallback to eager, and do not test optimized bytecode. As can be seen in the PR, I had to fix several tests when I forced to use the optimized bytecode in the absence of graph. A few failing tests will be fixed in follow up PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146527
Approved by: https://github.com/zou3519, https://github.com/StrongerXi
User defined NN module might have their own `__len__` or `__bool__`
methods which Dynamo needs to trace through, so that side effects and/or
reads to buffered writes are properly handled.
This patch removes the special `UnspecializedNNModuleVariable` branch in
Dynamo's branch handling, and lets these cases fall into the
`UserDefinedObjectVariable` branch, which handles the aforementioned
cases correctly.
Fixes#145284.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145786
Approved by: https://github.com/williamwen42
This patch models input cell object as "newly created" rather than
"pre-existing" python object (see added documentation for why this
actually captures the semantics more accurately).
This enables the `SideEffects.prune_dead_object_new` algorithm to prune
away writes to input cell objects which are no longer relevant; this
didn't happen prior to this patch because we modelled them as
pre-existing objects, which forces us to codegen their attribute
mutations.
Fixes#145564.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145781
Approved by: https://github.com/williamwen42, https://github.com/jansel
This is primarily sent for discussion and to see what tests fail due to
this. The idea is that rather than capturing this as a regex on the
fail_reason, just give it a unique failure type
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144785
Approved by: https://github.com/ezyang
Fixes#130559
* Intro
This PR adds support for `@contextmanager` in Dynamo. We chose to limit the
scope of this work to only `@contextmanager` and plan to handle generators fully
in #141055 (still in draft).
* Motivation
Dynamo lacks support for generator functions. When it encounters one, it traces
it as if it were a regular function. This is problematic because it can lead to
incorrect behavior. To illustrate, consider the test case below:
```python
import torch
import contextlib
@contextlib.contextmanager
def set_default_dtype(dtype):
old_dtype = torch.get_default_dtype()
try:
torch.set_default_dtype(dtype)
yield
finally:
torch.set_default_dtype(old_dtype)
@torch.compile(backend="eager", fullgraph=True)
def fn():
with set_default_dtype(torch.float64):
x = torch.tensor([3.0, 3.0 + 5.0j])
return x
```
Before this work, Dynamo would not stop at the `yield`, and the graph produced
would contain both calls to `set_default_dtype` executed one after the other.
This is incorrect because the context manager should execute code before and
after the `yield`.
* List of changes
`YIELD_VALUE` now raises an exception (`YieldValueOp`) to signal that control
flow must be suspended and returned to the caller. Additionally, `RETURN_VALUE`
behaves differently in a generator function. Unlike regular functions, where
`RETURN_VALUE` indicates the final result, in generators it signifies that the
generator is exhausted and implicitly raises `StopIteration`.
A new `VariableTracker` named `FunctionDecoratedByContextlibContextManagerVariable`
was introduced to handle `@contextmanager`. This variable tracker acts not just
as a wrapper for the original function but also maintains an internal `tx`
(InstructionTranslator) object to suspend and return control flow to the parent
tracer when a `yield` is encountered.
* Corner cases
Returning a context manager from a compiled function is not supported. This
would require PyTorch to synchronize the generator state between Dynamo and the
interpreter. Any attempt to return it will result in an `IncorrectUsage`
exception.
Graph breaks require special handling as well. In the event of a graph break,
the frame associated with the context manager is skipped, and the context
manager runs in eager mode.
* This PR is breaking my code
There is a configuration flag (`enable_trace_contextlib`) that can be set to
`False` to disable tracing context managers. If this still causes crashes,
please revert this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136033
Approved by: https://github.com/zou3519