This replaces fake_mode_from_tensors but it preferentially looks for
fake_mode in TracingContext and also if there is an active fake mode
on the dispatch stack, before groveling in tensors to find it.
This advances PegasusForCausalLM, which was previously failing because
we generated a graph that had a parameter (non-fake) and a SymInt,
and thus previously we failed to detect the correct fake mode.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98321
Approved by: https://github.com/voznesenskym
The purpose of this PR is to remove reliance on argument positions in dedup guards, AND extend the functionality to params.
A version of this PR was stamped prior https://github.com/pytorch/pytorch/pull/95831 - but was kinda gross, because it was based on an underlying PR that did way too much with source names.
This PR leaves most of that alone, in favor of just reusing the same name standardization logic that dynamo module registration does.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96774
Approved by: https://github.com/ezyang
Previously, when starting to trace a function, we would record a frame summary recording the definition loc. This would lead to an unconventional-looking stack trace when used for debugging, e.g., shape guards.
```
File ".../scripts/avik/pt2/example.py", line 407, in forward
def forward(self, x):
...
File ".../transformers/models/bert/modeling_bert.py", line 912, in forward
@add_start_docstrings_to_model_forward(BERT_INPUTS_DOCSTRING.format("batch_size, sequence_length"))
...
File ".../transformers/models/bert/modeling_bert.py", line 562, in forward
def forward(
...
File ".../transformers/models/bert/modeling_bert.py", line 484, in forward
def forward(
...
File ".../transformers/models/bert/modeling_bert.py", line 416, in forward
def forward(
...
File ".../transformers/models/bert/modeling_bert.py", line 275, in forward
def forward(
...
File ".../transformers/models/bert/modeling_bert.py", line 351, in forward
attention_scores = attention_scores + attention_mask
```
As noted in https://github.com/pytorch/pytorch/pull/95848#discussion_r1134397096, we would like to change this to record function calls instead, like conventional stack traces do. This diff makes this change. The above stack now looks like the following, which is way more helpful at a glance to understand what's going on.
```
File ".../scripts/avik/pt2/example.py", line 408, in forward
bert_out = self.bert(**x)
...
File ".../transformers/models/bert/modeling_bert.py", line 1021, in forward
encoder_outputs = self.encoder(
...
File ".../transformers/models/bert/modeling_bert.py", line 610, in forward
layer_outputs = layer_module(
...
File ".../transformers/models/bert/modeling_bert.py", line 496, in forward
self_attention_outputs = self.attention(
...
File ".../transformers/models/bert/modeling_bert.py", line 426, in forward
self_outputs = self.self(
...
File ".../transformers/models/bert/modeling_bert.py", line 351, in forward
attention_scores = attention_scores + attention_mask
```
Differential Revision: [D44101882](https://our.internmc.facebook.com/intern/diff/D44101882/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96882
Approved by: https://github.com/ezyang
I am still reading Dynamo source code...
This is an easy PR to simplify `Source.is_nn_module()` to reuse `GuardSource.is_nn_module()` instead of having the `in (...)` check implemented twice. While simplifying that, I thought I might as well add some type annotations for `Source` methods.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95292
Approved by: https://github.com/ezyang
Whenever you guard on something, you're supposed to tell GuardBuilder about it, so GuardBuilder knows that it has to actually bind it in scope when it creates the guard function. But shape env guards bypass that mechanism completely. Well, now they don't.
For the most part, this didn't matter in practice, because we usually had a `TENSOR_MATCH` guard floating around that made sure that the guard stayed live. But if we ever eliminate those guards (e.g., because we build it into the shape guard directly; something we'll probably want to do when https://github.com/pytorch/pytorch/pull/89707 goes online) then this will indeed matter.
One complication: some of the shape env guards are on globals. You have to make sure to shunt the usage to the correct guard builder in that case. Maybe it would be better if we refactored things so there is only one GuardBuilder. Not sure.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91058
Approved by: https://github.com/voznesenskym
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full on Source. Lots of replumbing reoccurs. In particular:
- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports
I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym, https://github.com/zou3519
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full on Source. Lots of replumbing reoccurs. In particular:
- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports
I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym
The idea is to make ShapeEnv guards less of a one-off special snowflake, and integrate it more closely with the regular builder infrastructure. But it is not so easy: the shape env code has to live after tensor match code, because we need to know that the values in question are tensors before we start matching on them. So we introduce a new `shape_env_code` field to put the special shape env code, so we can add it to the final constructed code after tensor.
Everything else works the obvious way. There's a new ShapeEnvSource for constructing the singleton SHAPE_ENV guard that drives the shape env guard construction. I added some more docs and also made the printed code for guards include the enclosing lambda for more clarity.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91055
Approved by: https://github.com/albanD, https://github.com/voznesenskym