Commit Graph

421 Commits

Author SHA1 Message Date
6c8ef6a4c2 Add tracing context, Integrate dynamo guards into torch._guards (#90647)
As defined here: https://docs.google.com/document/d/1oniZEgAaHE1IMByPRWRKbUHeaW06E2HMfCTCQyMRLek/edit#

This PR creates a new structure, a TracingContext, whose lifecycle matches that of the traced frame. It carries on it a GuardsContext, and eventually, a FakeTensorMode. It is the source of truth of all accumulated guards.

In this PR, we create the structure, and integrate it into dynamo. We do so by mapping OutputGraph's guards structure to its guard structure.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90647
Approved by: https://github.com/ezyang
2022-12-14 07:35:32 +00:00
9447005ae3 Improve dynamo debug logging (#90664)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90664
Approved by: https://github.com/voznesenskym
2022-12-12 02:35:23 +00:00
d5c6a74699 Rewrite dynamo cond() handling to not recursively call export (#90286)
The original implementation of cond() operator support in dynamo operated by recursively calling export() on the inner subgraph.  This is problematic for a number of reasons:

* My original motivating reason: the original implementation had to play tricks to feed real tensors to the recursive export call, which means that it doesn't work well with tracing with dynamic shapes (where we MUST stay in fake tensors to accurately track dynamic shapes across the cond invocation)
* If there are pending side effects, the recursive export() call won't see those side effects (as they are only tracked by Dynamo, not actually applied to the Python environment.) You can see an example where dynamo cond tracing does the wrong thing at https://github.com/pytorch/pytorch/pull/90208
* If there were side effects inside the true/false branch, these side effects were silently lost (as the export only returns the graph of tensor operations, and not any of the residual Python bytecodes necessary to reapply any side effects.) This could have substantive effects on the export of subsequent parts of the model, as those parts of the models could rely on the side effects.
* It was not possible to track NN module accesses inside the true/false branches, necessitating a hack where the NN module was explicitly passed in as an input to cond https://github.com/pytorch/pytorch/pull/87020#issuecomment-1338842844 which doesn't really make any sense from a backend compilation perspective
* Guards induced from the inside of the true/false branch were not properly propagated to the top level guards; they were just silently dropped (in fact, the original implementation checked that the true/false branch produce the same guards which... is not useful? Like, I don't think that actually is even necessary for correctness)

This PR replaces the old implementation with a new implementation based on graphstate checkpointing. The basic idea is to process a cond(), we checkpoint the state of our interpreter, run the true branch, rollback to our checkpoint, run the false branch, rollback to our checkpoint and then merge the changes from both of the checkpoints. I require the true/false branches to have exactly the same side effects, but union their guards.

Some of the details:

* Dynamo is too aggressive with tracking side effects when processing closures, c.f. https://github.com/pytorch/torchdynamo/pull/233/files#r1040480078 The basic problem is whenever I define a closure, this immediately counts as a side effect, even if I didn't actually mutate anything. This triggered on the nested cond export example. To prevent this from happening, I optimistically avoid tracking side effects, but if a STORE_DEREF happens, I restart analysis with the relevant Source.name() added to `mutated_closure_cell_contents` so we start tracking on closure allocation. This is enough to fix the relevant test.
* For the most part, I assert that the graph states must be equivalent after applying the true/false branches. During debugging, I found it useful to be able to compare two graph states and give a better description about what the divergence was. You can test this using the `diff()` method I've added to a few structures.
* The implementation now supports NestedUserFunctionVariable, which is nice as it allows the true/false branches to be defined closer to the cond implementation.
* I fixed the naming of the true/false subgraphs; previously they were named `name_0`, `name_1`, now they are named `cond_true_0` and `cond_false_0`
* I added `name_to_input` to the saved graph state. I don't actually know if this is necessary, but it seemed like a good idea.
* I have to play some tricks to get the speculating execution of the true/false branch to record into a subgraph. After a careful read of OutputGraph, I found that what would work is overriding graph with a fresh Graph that we want to write things into, and manually setting up the inputs/outputs. It's a little delicate as you have to make sure you reset the Graph to its original before you restore a checkpoint, as checkpoints don't actually save graph for efficiency, and just undo changes on the graph. This capability may usefully get refactored to OutputGraph but I didn't do it in this PR for simplicity.

There are some further problems with the cond() implementation that I leave for future work. Most of these were preexisting with the original implementation.

* Not a problem per se, but if an NN module is used by both the true/false branch, it will show up in the final graph twice (since it has to be a submodule of the GraphModule that makes use of it.) I hope the export pipeline can deal with this.
* List of tensor output for cond is not supported.
* The true/false return values may not have consistent sizes/dims/etc, and we don't check them for consistency.
* If we modify fake tensors in the true/false branches, we aren't rolling them back, c.f. https://github.com/pytorch/torchdynamo/issues/1840

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90286
Approved by: https://github.com/voznesenskym
2022-12-08 01:05:12 +00:00
7abd035b2f Add missing mypy-nofollow.ini (#90179)
I'm not sure how lintrunner worked without this lol.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90179
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2022-12-08 01:05:12 +00:00
4cdc96fb4f Add hooks structure for passing around user provided hooks, add a new guard_failure_fn (#90371)
This PR introduces a new function we can pass to torch._dynamo.optimize - guard_failure_fn. Usage is in the PR, and the one stacked on top of it, but the gist of it is that it emits failed guard reason strings alongside code. This is useful for tests and debugging, as it gives far finer grained assertions and control than the compile counter alone.

This is a resubmit of https://github.com/pytorch/pytorch/pull/90129

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90371
Approved by: https://github.com/ezyang
2022-12-07 17:51:53 +00:00
d224ac7f77 Remove logging.CODE (#90234)
Fixes https://github.com/pytorch/torchdynamo/issues/1932

Discussed with @mlazos: if we still want to separate streams for code logging and the rest of info, we can use a separate logger object with a unique name.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90234
Approved by: https://github.com/ezyang
2022-12-06 22:24:43 +00:00
99dac4dd48 Type torch._dynamo.guards (#89919)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89919
Approved by: https://github.com/albanD
2022-12-01 13:43:10 +00:00
f45fe7de33 Add mypy checking for a few files in torch/_dynamo (#89731)
It's kind of intractable to enable mypy everywhere at the moment,
because there are a lot of errors, and also mypy is really slow
for some reason.  I just want enough types to explain the public
types for user compiler calls, going through typing the _C.dynamo
bindings along the way.  This is a first step for this.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89731
Approved by: https://github.com/suo
2022-11-28 13:14:06 +00:00
b04dda4291 Delay verify correctness wrapping to call site. (#89662)
There is only one call site for compiler_fn, so we can safely delay
wrapping verify correctness to here.  This will help later when we
change the backend compiler calling convention to pass fake tensors
(but I need to pass real tensors here.)

This is adapted from voz's changes at https://github.com/pytorch/pytorch/pull/89392
but with less changes to the substantive logic.  I only moved the relevant
inner implementation; there are no changes otherwise.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89662
Approved by: https://github.com/voznesenskym
2022-11-25 20:43:11 +00:00
897d029a73 [reland][dynamo] fixes dict changed during runtime error (#88877)
Reland https://github.com/pytorch/pytorch/pull/87526

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88877
Approved by: https://github.com/ezyang
2022-11-13 16:20:45 +00:00
0de8f047c1 Revert "[dynamo] fixes dict changed during runtime error (#87526)"
This reverts commit cf04b36ce8f531730210b03eaa347977a1c2d75c.

Reverted https://github.com/pytorch/pytorch/pull/87526 on behalf of https://github.com/anijain2305 due to error reported
2022-11-11 04:19:08 +00:00
cf04b36ce8 [dynamo] fixes dict changed during runtime error (#87526)
Fixes https://github.com/pytorch/torchdynamo/issues/1744

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87526
Approved by: https://github.com/ezyang
2022-11-10 01:57:17 +00:00
5220d07d2c Fix minifier accuracy msg (#88515)
Fixes https://github.com/pytorch/torchdynamo/issues/1809

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88515
Approved by: https://github.com/yanboliang, https://github.com/williamwen42
2022-11-04 23:26:44 +00:00
2bda2baad7 [Dynamo][Easy] Fix config.suppress_errors error log (#88402)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88402
Approved by: https://github.com/williamwen42
2022-11-03 18:03:36 +00:00
4d62ee1b36 Verbose exc printing fix (#88387)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88387
Approved by: https://github.com/tugsbayasgalan
2022-11-03 17:59:05 +00:00
12dd877395 Fix all references to torchdynamo from the merge (#87731)
cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @jansel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87731
Approved by: https://github.com/yanboliang, https://github.com/ezyang, https://github.com/anijain2305, https://github.com/jansel
2022-10-31 06:51:07 +00:00
9691ba2dbd Remove excess exception logging for minifier, cleanup backend failure exception format (#87537)
Fixes https://github.com/pytorch/torchdynamo/issues/1376

Ensures exceptions are printed only in one place, once.

implements some of the ideas from https://github.com/pytorch/torchdynamo/issues/1754
- Attaches a field to the exception which indicates that it's minified, a usage message is printed if this field is present

cc @jansel @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87537
Approved by: https://github.com/anijain2305
2022-10-28 21:33:55 +00:00
a605a30732 Fix CODE level usage in dynamo config.py (#87522)
Fixes https://github.com/pytorch/torchdynamo/issues/1718.

Tested by changing `log_level = logging.WARNING` in config.py to `log_level = logging.CODE` and running a test script that doesn't touch `log_level`.

cc @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87522
Approved by: https://github.com/mlazos
2022-10-25 22:47:54 +00:00
bc19494814 [Dynamo] Symbolic shape guards (#87570)
**Introduces symbolic shape guards into dynamo.**

In this PR, we take the existing fake tensor infra and plumbing in dynamo and we start passing a shape_env around. This shape_env does not get plumbed down to middle layers / backend yet - it only collects expressions from frontend invocations at the moment. We then translate these expressions into guards at the point where we take other guards installed throughout dynamo - and add them to check_fn.

Part 1 of https://docs.google.com/document/d/1QJ-M4zfMkD-fjHIqW089RptjLl9EgozZGCceUbvmgfY/edit#

cc @jansel @lezcano @fdrocha @mlazos @soumith @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87570
Approved by: https://github.com/ezyang
2022-10-25 21:15:40 +00:00
96691865b9 [dynamo] Unify raise_on_* config to suppress_errors and raise by default (#87440)
I noticed that a lot of bugs are being suppressed by torchdynamo's default
error suppression, and worse yet, there's no way to unsuppress them.  After
discussion with voz and soumith, we decided that we will unify error suppression
into a single option (suppress_errors) and default suppression to False.

If your model used to work and no longer works, try TORCHDYNAMO_SUPPRESS_ERRORS=1
to bring back the old suppression behavior.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

cc @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87440
Approved by: https://github.com/voznesenskym, https://github.com/albanD
2022-10-21 17:03:29 +00:00
c7c09722ad Move TorchDynamo into PyTorch core (#86461)
Context:
https://github.com/pytorch/torchdynamo/issues/1588

This PR moves [TorchDynamo](https://github.com/pytorch/torchdynamo) and TorchInductor into PyTorch core.
- `torchdynamo` becomes `torch._dynamo`
- `torchinductor` becomes `torch._inductor`

This PR was generated by running `copy_to_core.sh` in https://github.com/pytorch/torchdynamo/pull/1538

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86461
Approved by: https://github.com/voznesenskym
2022-10-13 23:18:06 +00:00