91 Commits

a43c4c3972 [5/N] Apply ruff UP035 rule (#164423)
Continued the code migration to enable ruff `UP035`. Most changes move the `Callable` import from `typing` to `collections.abc`.
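An illustrative before/after of the kind of change this rule drives (example code, not a specific hunk from the PR):

```
# Before: deprecated typing import flagged by ruff UP035
from typing import Callable

# After: import Callable from collections.abc instead
from collections.abc import Callable

def apply(fn: Callable[[int], int], x: int) -> int:
    return fn(x)
```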

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164423
Approved by: https://github.com/ezyang
2025-10-02 07:31:11 +00:00
991e3d0d16 [dynamo][guards] Revert introduction of different types of lambda_guards (#163385)
Given the issue reported at
https://fb.workplace.com/groups/260102303573409/permalink/787294574187510/,
it might be a better idea to just speed up `_realize_dict` and keep
the changes very local. So this PR is reverted as well, to return to a clean
slate.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163385
Approved by: https://github.com/jansel
2025-09-27 18:20:48 +00:00
1302637a23 Revert "[dynamo][guards] Do not construct entire framelocals dict for LAMBDA_GUARD (#162525)"
This reverts commit 5f630d28d7ff9fdd8bd6cdbe2438e5c821007845.

Reverted https://github.com/pytorch/pytorch/pull/162525 on behalf of https://github.com/anijain2305 due to internal tests fail ([comment](https://github.com/pytorch/pytorch/pull/162525#issuecomment-3310748980))
2025-09-19 06:15:28 +00:00
79d2418b5a [inductor] Add FLOAT_IS_NAN and COMPLEX_IS_NAN guards (#162537)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162537
Approved by: https://github.com/anijain2305, https://github.com/mlazos
ghstack dependencies: #162528
2025-09-12 04:32:46 +00:00
5dd84559a5 [dynamo] Add DUAL_LEVEL_MATCH C++ guard (#162528)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162528
Approved by: https://github.com/anijain2305
2025-09-12 04:32:46 +00:00
5f630d28d7 [dynamo][guards] Do not construct entire framelocals dict for LAMBDA_GUARD (#162525)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162525
Approved by: https://github.com/williamwen42
ghstack dependencies: #162509
2025-09-10 18:52:15 +00:00
a67e798cb7 [dynamo][guards] Prevent framelocals to dict conversion for not required LAMBDA_GUARD (#162509)
This is a smaller PR to reduce framelocals-to-dict conversions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162509
Approved by: https://github.com/williamwen42
2025-09-10 18:52:15 +00:00
f0e0a6897e type misc init and tools for dynamo (#161293)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161293
Approved by: https://github.com/anijain2305
2025-08-26 17:38:49 +00:00
8d3d1c8443 [dynamo] fixes to propagate tag safeness (#159807)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159807
Approved by: https://github.com/jansel
2025-08-12 04:50:13 +00:00
40c4d61f9a [Dynamo][Better Engineering] Typing torch/_dynamo/guards.py (#159315)
As part of the better engineering effort, we would like to improve our type support to improve the dev experience in dynamo.

This PR adds strict typing support to `torch/_dynamo/guards.py`

Running
```
mypy torch/_dynamo/guards.py --linecount-report /tmp/coverage_log
```

| -------- | Lines Annotated | Lines Total | % lines covered | Funcs Annotated | Funcs Total | % funcs covered |
| -------- | ------- | -------- | ------- | ------- | ------- | ------- |
| Main  |  2030 | 3945 | 51.46% | 70 | 138 | 50.72% |
| This PR | 4055 | 4055 | 100.00% | 138 | 138 | 100.00% |
| Delta    | +2025 | +90 | +48.54% | +68 | 0 | +49.28% |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159315
Approved by: https://github.com/williamwen42, https://github.com/Skylion007
2025-08-06 21:52:14 +00:00
53e47af0f7 [dynamo][guards] Read the attr name from GetAttrGuardAccessor (#159754)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159754
Approved by: https://github.com/jansel
ghstack dependencies: #159752
2025-08-04 16:51:27 +00:00
64cbaa876c [dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159534
Approved by: https://github.com/jansel
2025-08-04 05:12:44 +00:00
805a102beb Revert "[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534)"
This reverts commit 1616777cd2a3170ff76afa3e7860b0969420c445.

Reverted https://github.com/pytorch/pytorch/pull/159534 on behalf of https://github.com/malfet due to Broke some inductor test and lint among other things, see 9c18901bfd/1 ([comment](https://github.com/pytorch/pytorch/pull/159534#issuecomment-3146983186))
2025-08-03 04:58:32 +00:00
1616777cd2 [dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159534
Approved by: https://github.com/jansel
ghstack dependencies: #159186
2025-08-02 18:04:35 +00:00
7eb5fdb358 [dynamo][guards] Recursive dict tag optimization (#159183)
Design doc here - https://docs.google.com/document/d/1W29DrWID5miGWlZXspsQVN5U0zydE3kjZpziOXrhuaY/edit?tab=t.0#bookmark=id.sba04iw9sp68

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159183
Approved by: https://github.com/jansel
2025-07-30 06:01:32 +00:00
a4d753295e [Dynamo][Better Engineering] Add enhanced typing support to _dynamo/eval_frame.py (#158276)
As part of better engineering week, we would like to improve our type support to improve the dev experience in dynamo.

This PR adds strict typing support to the main entrypoint for dynamo, `eval_frame.py`

Running
```
mypy torch/_dynamo/eval_frame.py --linecount-report /tmp/coverage_log
```

| -------- | Lines Annotated | Lines Total | % lines covered | Funcs Annotated | Funcs Total | % funcs covered |
| -------- | ------- | -------- | ------- | ------- | ------- | ------- |
| Main  |  623 | 2232 | 27.91% | 19 | 68 | 27.94% |
| This PR | 2285 | 2285 | 100.00% | 68 | 68 | 100.00% |
| Delta    | +1662 | +63 | +72.09% | +49 | 0 | +72.06% |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158276
Approved by: https://github.com/williamwen42

Co-authored-by: William Wen <williamwen@meta.com>
2025-07-16 23:31:10 +00:00
0f9c1b374f [dynamo] Ensure global state guard is preserved across serialization. (#157285)
Currently, every time we construct a GLOBAL_STATE guard, we create a fresh guard based on the current global state. For precompile, we want to create a GLOBAL_STATE guard that is always based on some external source, e.g. serialized global states. This can also be applied to the normal case where we just pass in the global state guard from Python.

Differential Revision: [D77400988](https://our.internmc.facebook.com/intern/diff/D77400988/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157285
Approved by: https://github.com/jansel
2025-07-01 15:46:34 +00:00
c78fce9e79 [dynamo] show frame information when recompilation is triggered on fail_on_recompile (#156433)
adding more information to the error message for debugging.

example error message:
```
Detected recompile when torch.compile stance is 'fail_on_recompile'. filename: 'caffe2/test/dynamo/test_misc.py', function name: 'fn', line number: 0
Failed on the following precompiled guards:

TREE_GUARD_MANAGER:
+- RootGuardManager
| +- LAMBDA_GUARD: isinstance(L['x'], bool)
GuardDebugInfo(
result=0,
verbose_code_parts=["isinstance(L['x'], bool)"],
num_guards_executed=1)
```

Differential Revision: [D76987126](https://our.internmc.facebook.com/intern/diff/D76987126/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156433
Approved by: https://github.com/jamesjwu
2025-07-01 15:15:58 +00:00
17eb649d55 Implement guard collectives (optimized version) (#156562)
This is a remix of https://github.com/pytorch/pytorch/pull/155558

Instead of mediating the guard collective via a config option, in this one it's done via a `set_stance`-like API. The motivation is that checking for the config value on entry to torch.compile is apparently quite expensive, according to functorch_maml_omniglot. So this makes it a bit cheaper.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156562
Approved by: https://github.com/Microve
2025-06-24 04:59:49 +00:00
190f76fa31 Revert "Implement guard collectives (#155558)"
This reverts commit 5a5a05a6a3be376130848e235df73b752eef0230.

Reverted https://github.com/pytorch/pytorch/pull/155558 on behalf of https://github.com/malfet due to Hmm, may be I'm looking at the wrong metric, but c92f1075aa/1 shows that test started to pass after PR were reverted ([comment](https://github.com/pytorch/pytorch/pull/155558#issuecomment-2978337152))
2025-06-16 22:26:52 +00:00
5a5a05a6a3 Implement guard collectives (#155558)
When running a distributed job with compiler collectives enabled, if one rank recompiles while others do not, this leads to a deadlock (as not everyone will rendezvous with the compiler collective from the recompile). Although there aren't any convenient ways to cheaply solve this problem, if you are willing to force everyone to sync when evaluating guards, you can just force everyone to recompile if anyone requires a recompile. So the way guard collectives work is:

1. Perform compiled code lookup (evaluating guards)
2. Run a collective, communicating if you found a compiled code or not
3. If anyone requires recompile, force everyone to recompile

One current deficiency in the implementation is we can't conveniently track the time it takes to run this collective.

I need to test whether we are actually running the collective on a separate stream, or whether we have to wait for all user collectives to finish.
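A minimal sketch of the three-step flow above, assuming an initialized process group; `lookup_compiled_code` is a hypothetical stand-in for the guard-evaluation lookup, not the actual implementation:

```
import torch
import torch.distributed as dist

def lookup_with_guard_collective(lookup_compiled_code, frame, group=None):
    # Step 1: evaluate guards locally to find a matching compiled code entry.
    compiled_code = lookup_compiled_code(frame)

    # Step 2: communicate whether this rank found a match.
    needs_recompile = torch.tensor([0 if compiled_code is not None else 1])
    dist.all_reduce(needs_recompile, op=dist.ReduceOp.MAX, group=group)

    # Step 3: if any rank missed, force everyone to recompile so the compiler
    # collectives during recompilation rendezvous on all ranks.
    if needs_recompile.item():
        return None  # treat as a cache miss on this rank as well
    return compiled_code
```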

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155558
Approved by: https://github.com/Microve
2025-06-16 19:46:16 +00:00
61b271e0f3 Revert "Implement guard collectives (#155558)"
This reverts commit 38e5e81e55fc5d85d6cf8a83c96c88578995e3fe.

Reverted https://github.com/pytorch/pytorch/pull/155558 on behalf of https://github.com/atalman due to Breaks CI, sorry: [GH job link](https://github.com/pytorch/pytorch/actions/runs/15683161593/job/44181274826) [HUD commit link](38e5e81e55) ([comment](https://github.com/pytorch/pytorch/pull/155558#issuecomment-2977871178))
2025-06-16 19:40:46 +00:00
38e5e81e55 Implement guard collectives (#155558)
When running a distributed job with compiler collectives enabled, if one rank recompiles while others do not, this leads to a deadlock (as not everyone will rendezvous with the compiler collective from the recompile). Although there aren't any convenient ways to cheaply solve this problem, if you are willing to force everyone to sync when evaluating guards, you can just force everyone to recompile if anyone requires a recompile. So the way guard collectives work is:

1. Perform compiled code lookup (evaluating guards)
2. Run a collective, communicating if you found a compiled code or not
3. If anyone requires recompile, force everyone to recompile

One current deficiency in the implementation is we can't conveniently track the time it takes to run this collective.

I need to test whether we are actually running the collective on a separate stream, or whether we have to wait for all user collectives to finish.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155558
Approved by: https://github.com/Microve
2025-06-16 14:09:14 +00:00
b2fc9cfea1 [precompile] Add CompilePackage to serialize dynamo states. (#155118)
Adding a per-torch.compile() CompilePackage object which tracks dynamo artifacts. CompilePackage is considered a low-level component and should not be directly exposed to end users. It has the following interface:

1. `CompilePackage.__init__()` which optionally takes previously serialized dynamo states.
     a. when the `dynamo` argument is None, it will construct a brand new CompilePackage object.
     b. when the `dynamo` argument is not None, it will load a pre-compiled dynamo state.
2. `package.save()` which dumps the dynamo states into _DynamoCacheEntry.
3. `package.install(backends)` which will handle all the side-effectful global scope updates with compiled functions and resume functions.

This diff focuses on building the low-level mechanism for precompile. It is left to a higher-level interface to use these APIs to build a more user-facing frontend.
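A rough usage sketch of the interface above (hedged: the import path, the `fn` argument, and the `backends` value are assumptions for illustration, not the exact API):

```
# Hypothetical import path, for illustration only.
from torch._dynamo.package import CompilePackage

# (a) Fresh package: construct with no prior state, compile, then serialize.
package = CompilePackage(fn, dynamo=None)
cache_entry = package.save()          # dumps dynamo state into a _DynamoCacheEntry

# (b) Later process: rebuild from the serialized state and re-install the
# compiled functions / resume functions into the global scope.
package = CompilePackage(fn, dynamo=cache_entry)
package.install(backends)
```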

Differential Revision: [D75956538](https://our.internmc.facebook.com/intern/diff/D75956538/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155118
Approved by: https://github.com/jamesjwu

Co-authored-by: James Wu <jjwu@meta.com>
2025-06-13 13:54:10 +00:00
10c3e6ec43 [inductor][dynamo] Include operator name in size/stride/alignment assertion (#152353)
Fixes #151930

This PR updates the `assert_size_stride` and `assert_alignment` functions in [guards.cpp](https://github.com/pytorch/pytorch/blob/main/torch/csrc/dynamo/guards.cpp) to accept an optional `op_name` argument and includes it in the error messages.

The corresponding type stubs in [guards.pyi](https://github.com/pytorch/pytorch/blob/main/torch/_C/_dynamo/guards.pyi) are updated to match the new function arg.

[inductor/ir.py](https://github.com/pytorch/pytorch/blob/main/torch/_inductor/ir.py) extracts the operator name from the FX graph and passes it into the `codegen_size_asserts` and `codegen_alignment_asserts` functions, so that generated assertions in Triton code include the op name for better debugging.

Added unit tests inside [test_torchinductor.py](https://github.com/pytorch/pytorch/blob/main/test/inductor/test_torchinductor.py).
- Verified both successful and failing assertion cases include the operator name.
- Verified that generated Triton code contains the op name inside the asserts.
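For illustration, what a generated size/stride assert might look like before and after threading the op name through (the exact argument format produced by codegen is an assumption):

```
import torch
from torch._C._dynamo.guards import assert_size_stride

x = torch.randn(8, 16)

# Before: a failure only points at the buffer, with no hint of the producing op.
assert_size_stride(x, (8, 16), (16, 1))

# After: an optional op name is included, so the assertion error names the op.
assert_size_stride(x, (8, 16), (16, 1), "aten.mm.default")
```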

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152353
Approved by: https://github.com/jansel, https://github.com/shunting314
2025-06-03 19:21:15 +00:00
4d073af58c Revert "[inductor][dynamo] Include operator name in size/stride/alignment assertion (#152353)"
This reverts commit 725bbb6b5fffa2f2d219a0692ed27e376c9dd48a.

Reverted https://github.com/pytorch/pytorch/pull/152353 on behalf of https://github.com/jeanschmidt due to seems to have broken a few internal tests, @jansel may you help the author get his PR merged? ([comment](https://github.com/pytorch/pytorch/pull/152353#issuecomment-2885997862))
2025-05-16 08:20:39 +00:00
22b124335e [BE] Update .pyi stub template to use Generic TypeAlias (PEP 585) and Union Type (PEP 604) (#150728)
https://github.com/pytorch/pytorch/pull/129001#discussion_r1645126801 is the motivation for the whole stack of PRs. In `torch/__init__.py`, `torch._C.Type` shadows `from typing import Type`, and there is no type stub for `torch._C.Type` in `torch/_C/__init__.pyi`. So we need to use `from typing import Type as _Type`. After enabling [Generic TypeAlias (PEP 585)](https://peps.python.org/pep-0585) in the `.pyi` type stub files, we can use `type` instead of `typing.Type` or `from typing import Type as _Type`.

------

- [Generic TypeAlias (PEP 585)](https://peps.python.org/pep-0585): e.g. `typing.List[T] -> list[T]`, `typing.Dict[KT, VT] -> dict[KT, VT]`, `typing.Type[T] -> type[T]`.
- [Union Type (PEP 604)](https://peps.python.org/pep-0604): e.g. `Union[X, Y] -> X | Y`, `Optional[X] -> X | None`, `Optional[Union[X, Y]] -> X | Y | None`.
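For illustration, a stub signature rewritten in the style described above (example code, not drawn from the template itself):

```
# Before: typing-module generics and Optional/Union
from typing import Dict, List, Optional
def lookup(keys: List[str], default: Optional[int]) -> Dict[str, int]: ...

# After: PEP 585 builtin generics and PEP 604 union syntax
def lookup(keys: list[str], default: int | None) -> dict[str, int]: ...
```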

Note that in `.pyi` stub files, we do not need `from __future__ import annotations`. So this PR does not violate issue #117449:

- #117449

------

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150728
Approved by: https://github.com/cyyever, https://github.com/aorenste
ghstack dependencies: #150726, #150727
2025-05-15 09:36:42 +00:00
725bbb6b5f [inductor][dynamo] Include operator name in size/stride/alignment assertion (#152353)
Fixes #151930

This PR updates the `assert_size_stride` and `assert_alignment` functions in [guards.cpp](https://github.com/pytorch/pytorch/blob/main/torch/csrc/dynamo/guards.cpp) to accept an optional `op_name` argument and includes it in the error messages.

The corresponding type stubs in [guards.pyi](https://github.com/pytorch/pytorch/blob/main/torch/_C/_dynamo/guards.pyi) are updated to match the new function arg.

[inductor/ir.py](https://github.com/pytorch/pytorch/blob/main/torch/_inductor/ir.py) extracts the operator name from the FX graph and passes it into the `codegen_size_asserts` and `codegen_alignment_asserts` functions, so that generated assertions in Triton code include the op name for better debugging.

Added unit tests inside [test_torchinductor.py](https://github.com/pytorch/pytorch/blob/main/test/inductor/test_torchinductor.py).
- Verified both successful and failing assertion cases include the operator name.
- Verified that generated Triton code contains the op name inside the asserts.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152353
Approved by: https://github.com/jansel
2025-05-15 02:33:57 +00:00
7b806a8cb1 Revert "[inductor][dynamo] Include operator name in size/stride/alignment assertion (#152353)"
This reverts commit 93576351270383ca37deaec6b2417a33dc045a93.

Reverted https://github.com/pytorch/pytorch/pull/152353 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to fail an inductor test in trunk ([comment](https://github.com/pytorch/pytorch/pull/152353#issuecomment-2863657185))
2025-05-08 16:39:28 +00:00
9357635127 [inductor][dynamo] Include operator name in size/stride/alignment assertion (#152353)
Fixes #151930

This PR updates the `assert_size_stride` and `assert_alignment` functions in [guards.cpp](https://github.com/pytorch/pytorch/blob/main/torch/csrc/dynamo/guards.cpp) to accept an optional `op_name` argument and includes it in the error messages.

The corresponding type stubs in [guards.pyi](https://github.com/pytorch/pytorch/blob/main/torch/_C/_dynamo/guards.pyi) are updated to match the new function arg.

[inductor/ir.py](https://github.com/pytorch/pytorch/blob/main/torch/_inductor/ir.py) extracts the operator name from the FX graph and passes it into the `codegen_size_asserts` and `codegen_alignment_asserts` functions, so that generated assertions in Triton code include the op name for better debugging.

Added unit tests inside [test_torchinductor.py](https://github.com/pytorch/pytorch/blob/main/test/inductor/test_torchinductor.py).
- Verified both successful and failing assertion cases include the operator name.
- Verified that generated Triton code contains the op name inside the asserts.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152353
Approved by: https://github.com/jansel, https://github.com/shunting314
2025-05-08 08:28:05 +00:00
748252378d [ca] introduce RuntimeState to support c++ hooks via graph breaks (#149987)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149987
Approved by: https://github.com/jansel
ghstack dependencies: #149647, #149709, #149651, #149897
2025-03-27 05:05:34 +00:00
f30776c37a [BE] Upgrade to mypy 1.14 (#145966)
Upgrade mypy version

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145966
Approved by: https://github.com/Skylion007
2025-03-04 20:58:26 +00:00
40b3e4a358 [dynamo] expose code execution strategy to python (#148020)
@anijain2305 this can be used to mark a code object to be skipped/run-only (recursively) while tracing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148020
Approved by: https://github.com/jansel
2025-02-28 21:59:12 +00:00
63e8ad49b8 [dynamo] replace hardcoded eval frame control flags skip_code_recursive_flag/cache_limit_hit_flag (#146355)
This PR and the previous one:
- Move parts of `eval_frame.c` to C++.
- Reduce code duplication in `dynamo__custom_eval_frame` and make the control flow clearer.
- Enable `convert_frame` to signal to `eval_frame.cpp` in a general manner how to evaluate this frame, recursive frames, and future frames with the same code object (default/compile, skip, run-only). E.g., this will allow us to change skipping/cache-limit-hit eval_frame behavior directly from convert_frame without requiring changes to C/C++.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146355
Approved by: https://github.com/jansel
ghstack dependencies: #145603
2025-02-18 21:37:12 +00:00
9dc702875d [dynamo][mappingproxy][inspect] Support existing types.MappingProxyType (#147217)
Fixes https://github.com/pytorch/pytorch/issues/147162

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147217
Approved by: https://github.com/williamwen42
2025-02-15 07:59:33 +00:00
015c6d6fdb [dynamo][guards] Turn on profiling of guard manager (#145420)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145420
Approved by: https://github.com/ezyang
ghstack dependencies: #145351
2025-01-23 18:17:43 +00:00
0efa843392 Dynamic shape guards in C++ (#139899)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139899
Approved by: https://github.com/anijain2305, https://github.com/albanD, https://github.com/jansel
ghstack dependencies: #143385, #143164
2025-01-22 14:58:35 +00:00
5b5766665d PEP585 update - torch/_C torch/_decomp torch/_lazy torch/_library torch/_numpy torch/_prims torch/_refs torch/_strobelight (#145102)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145102
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #145105
2025-01-18 20:47:12 +00:00
80c286cbec remove allow-untyped-defs from torch/_C/_dynamo/eval_frame.pyi (#144655)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144655
Approved by: https://github.com/StrongerXi
2025-01-13 20:03:25 +00:00
06b4b96b34 dynamo tracing perf: no re in arg_ref: 33.9 -> 33.7 (#143069)
See #143056 for overall docs.

This PR: Avoid use of Python `re` and move the valid-varname check in `GuardBuilder.arg_ref()` into C++.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143069
Approved by: https://github.com/jansel
2024-12-23 05:32:09 +00:00
9bf4b1c2e9 dynamo tracing perf: c++ strip_function_call: 49.12 -> 47.77 (#143063)
See #143056 for overall docs.

This PR: Convert `strip_function_call()` into C++

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143063
Approved by: https://github.com/jansel
ghstack dependencies: #143057, #143062
2024-12-22 06:38:46 +00:00
18261e9f39 [dynamo] implement framelocals mapping as c++ object (#140063)
Implements https://github.com/pytorch/pytorch/issues/93753 - move frame local guard accessors to C++.

Before, we used dict accessors on a manually built Python dict representing the frame's fastlocals. We move this accessor to C++ and additionally use the fastlocal index whenever possible.

Some implementation notes:
- `FrameLocalsMapping` is now initialized as a C++ vector of `PyObject`s. We do not just use the frame's localsplus/fastlocals buffer because we also unbox cells.
- `FrameLocalsMapping` can still be converted into a Python dict representing the frame's fastlocals, but it is done lazily.
- We update `LeafGuard`, `GuardAccessor`, and `GuardManager`'s `check_nopybind` methods to accept `FrameLocalsMapping`. By default, we convert the `FrameLocalsMapping` to a Python dict and run the original `check_nopybind` on it, but in some cases, conversion is not needed.
- We add a new guard accessor `FrameLocalsGuardAccessor`, which is similar to `DictGetItemGuardAccessor` but has special handling for `FrameLocalsMapping`. We create a separate class to emphasize different use cases, but we could probably combine these two (can do in a follow up)
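A conceptual Python sketch of the lazy conversion described in these notes (the real `FrameLocalsMapping` is a C++ type; names and structure here are illustrative only):

```
class FrameLocalsMapping:
    """Illustrative stand-in for the C++ FrameLocalsMapping."""

    def __init__(self, names, values):
        self._names = names    # ordered fastlocal names
        self._values = values  # unboxed values (cells already dereferenced)
        self._dict = None      # only materialized when a guard needs a real dict

    def get(self, index):
        # Fast path: guards that know the fastlocal index never touch a dict.
        return self._values[index]

    def to_dict(self):
        # Slow path: build the framelocals dict lazily, at most once.
        if self._dict is None:
            self._dict = dict(zip(self._names, self._values))
        return self._dict
```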

dynamo_guard_eval.py microbenchmark update:
- 713.2us -> 630.0us (3.10)
- 598.8us -> 530.7us (3.12)

Other followups:
- Add `FrameLocalsMapping` version for `check_verbose_nopybind` in order to match behavior between `check_nopybind` and `check_verbose_nopybind`. This can prevent difficult debugging situations where guards fail (`check_nopybind` returns false) but no guard error message is generated (`check_verbose_nopybind` succeeds).
- Rewrite the `SHAPE_ENV` guard into C++ - it is a fairly common guard that results in `FrameLocalsMapping` needing to convert to a dict

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140063
Approved by: https://github.com/jansel
ghstack dependencies: #142117, #142430
2024-12-17 18:54:27 +00:00
12d28a5929 Move overlapping guards to C++. (#140013)
This PR moves the logic for computing the overlapping relations between input tensors that
share a storage instance to C++.

In summary, this PR:

- Moves both `tensors_definitely_do_not_overlap` and part of `compute_overlapping_tensors`
to C++
- Introduces a `check_overlapping` function that re-runs `compute_overlapping_tensors`,
checking that the result is consistent with what is expected
- Introduces the `StorageOverlapChecker` class
    - Keeps track of overlapping and non-overlapping tensors
    - Actually checks the overlapping relation (call `check_overlapping`) when all tensors
    are collected
- Introduces the `STORAGE_OVERLAPPING` relational guard
    - Has a reference to a `StorageOverlapChecker`
    - Stores the to-be-checked tensors in the checker, and triggers its check
- Introduces `install_storage_overlapping_guard` python function
    - Creates an instance of `StorageOverlapChecker`
    - Creates 2 instances of the `STORAGE_OVERLAPPING` guard (for overlapping and
    non-overlapping tensors), referencing the same `StorageOverlapChecker` instance

**Why is `StorageOverlapChecker` needed?**

The way `GuardManager` is implemented, we have no control over the order in which the
check methods are called, i.e. no control over the order in which the tensors are collected. So, we
can't easily split them into "overlapping" and "non-overlapping" kinds.

Instead, we create 2 instances of the `STORAGE_OVERLAPPING` guard, each of which helps
collect the tensors of one of the kinds mentioned above. They are then used in a
single `StorageOverlapChecker` instance.
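A conceptual Python sketch of the checker's role (illustrative only; the actual class is implemented in C++ and calls the real overlap computation):

```
class StorageOverlapChecker:
    """Collects tensors from both STORAGE_OVERLAPPING guard instances and
    checks the overlap relation only once every tensor has been seen."""

    def __init__(self, num_overlapping, num_non_overlapping):
        self._expected = num_overlapping + num_non_overlapping
        self.overlapping = []
        self.non_overlapping = []

    def add(self, tensor, overlapping):
        (self.overlapping if overlapping else self.non_overlapping).append(tensor)
        if len(self.overlapping) + len(self.non_overlapping) < self._expected:
            return True  # not every guarded tensor collected yet; defer the check
        return self._check_overlapping()

    def _check_overlapping(self):
        # Placeholder: re-run the overlap computation and verify the result
        # matches the overlapping / non-overlapping split recorded at compile time.
        return True
```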

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140013
Approved by: https://github.com/bdhirsh
ghstack dependencies: #139554, #139555
2024-12-05 14:43:58 +00:00
db4e8a1d8a [ca] expose option to collect sizes as dynamic (#141153)
This is to address recompiles from eager nodes that saved dynamic activations

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141153
Approved by: https://github.com/jansel
ghstack dependencies: #141152
2024-11-22 19:26:27 +00:00
fb529c2c84 [dynamo] skip_guard_eval_unsafe stance for power users (#140251)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140251
Approved by: https://github.com/jansel
ghstack dependencies: #140223, #140250
2024-11-21 06:28:58 +00:00
9d229f08f4 [dynamo][guards] Introduce a diff_guard_manager (#140250)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140250
Approved by: https://github.com/jansel
ghstack dependencies: #140223
2024-11-20 17:59:30 +00:00
f4ce9ac29d [dynamo] Dont erase the cache line on invalidation (#140821)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140821
Approved by: https://github.com/jansel
2024-11-19 19:11:10 +00:00
ac6684ebbc [dynamo] Identify pre-existing captured cells by cell id rather than content id (#140436)
In `match_nested_cell`, Dynamo tried to identify pre-existing captured
cells by `(cell_name, id(cell_contents))`. This works in most cases, but
as the test added in this patch shows, it's not a complete solution.

This patch
1. changes `match_nested_cell` to `lookup_variable_for_captured_cell`,
   and does the lookup based on id of cell objects, not their contents.
   This requires plumbing a tuple of captured cell objects from
   different CPython versions all the way to
   `InstructionTranslator.__init__`, where we store a mapping from the
   ids of these cell objects, and use it later in
   `UserFunctionVariable.bind_args` to look for these unboxed cells.
2. builds off (1) -- rather than using a `VariableTracker` that
   represents the content of the unboxed cells, use `ClosureVariable`,
   which enables codegen in case these cells escape as closure of a
   `NestedUserFunctionVariable`.

The patch adds a regression test for each of the scenarios above:
1. `test_write_to_cells_with_name_shadowing`, where Dynamo mistakenly
   thought the program was writing to a cell captured by the root frame (which
   it doesn't support at the moment), which resulted in
```
  File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/symbolic_convert.py", line 3340, in STORE_DEREF
    unimplemented("write to __closure__ while inlining")
  File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/exc.py", line 313, in unimplemented
    raise Unsupported(msg, case_name=case_name)
torch._dynamo.exc.Unsupported: write to __closure__ while inlining
```
2. `test_existing_func_that_creates_capturing_nested_func`, where Dynamo
   ended up trying to codegen a `NestedUserFunctionVariable` that
   captures a cell which was also captured by the root frame, so it was
   unboxed and ended up emitting `LOAD_DEREF` rather than
   `LOAD_FAST`/`LOAD_CLOSURE` during codegen, resulting in
```
  File "/Users/ryanguo99/Documents/work/pytorch/torch/_dynamo/variables/functions.py", line 105, in _create_nested_fn
    func = FunctionType(code, f_globals, name, defaults, closure)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: arg 5 (closure) expected cell, found int
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140436
Approved by: https://github.com/jansel, https://github.com/williamwen42
ghstack dependencies: #140330, #140152
2024-11-15 17:17:30 +00:00
85dd7b84cf [dynamo] Add a DynamoFrameType type above Python frame object (#140330)
This patch introduces a `DynamoFrameType` to serve as a layer between
Dynamo and different versions of the Python frame object. In
`DynamoFrameType`, we only register the attributes Dynamo cares about (e.g.,
`f_code`, `f_locals`, etc.).

This will be helpful when it comes to adding new attributes to this
`DynamoFrameType`, or dealing with Python version changes.
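A minimal Python sketch of the idea (the attributes shown are only those mentioned above; the real type is not implemented like this):

```
class DynamoFrameType:
    """Thin adapter exposing only the frame attributes Dynamo relies on."""

    def __init__(self, frame):
        self._frame = frame  # the version-specific CPython frame object

    @property
    def f_code(self):
        return self._frame.f_code

    @property
    def f_locals(self):
        return self._frame.f_locals
```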

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140330
Approved by: https://github.com/jansel, https://github.com/williamwen42
2024-11-15 17:17:30 +00:00
e6c5a77485 [dynamo][guards] Profile guard manager in C++ (#140110)
This should remove the pybind noise from the profiling.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140110
Approved by: https://github.com/jansel
ghstack dependencies: #139953
2024-11-08 18:44:08 +00:00