123 Commits

Author SHA1 Message Date
f15c25d5c3 [user-streams] Move stream code to streams module (#163027)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163027
Approved by: https://github.com/StrongerXi, https://github.com/anijain2305
2025-10-14 05:43:19 +00:00
1191e51c44 [dynamo][annotate] Remove the need of external ctx mgr of preserve_node_meta (#165188)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165188
Approved by: https://github.com/yushangdi
2025-10-13 22:22:20 +00:00
a19123b37e Revert "[dynamo][annotate] Remove the need of external ctx mgr of preserve_node_meta (#165188)"
This reverts commit f0325d07876b5a52d29a44ee02dcf7a7c21b258a.

Reverted https://github.com/pytorch/pytorch/pull/165188 on behalf of https://github.com/malfet due to Looks like it broke bunch of tests, see 2d4654d208/1 ([comment](https://github.com/pytorch/pytorch/pull/165188#issuecomment-3393674273))
2025-10-11 21:38:45 +00:00
f0325d0787 [dynamo][annotate] Remove the need of external ctx mgr of preserve_node_meta (#165188)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165188
Approved by: https://github.com/yushangdi
ghstack dependencies: #164776
2025-10-11 15:49:42 +00:00
4308b8a28f [dynamo] Support torch.fx.traceback.annotate (#164678)
Builds on top of https://github.com/pytorch/pytorch/pull/163673 and https://github.com/pytorch/pytorch/pull/164174. This will be used in the followup PRs to apply regional inductor compilation.

The existing implementation let Dynamo trace into the `torch.fx.traceback.annotate`, but thats not what we want. We want Dynamo to essentially run the torch.fx.traceback.annotate function in eager, so that every Fx node created in Dynamo Fx graph has the custom meta node.

What does not work?
* We still have to set the context manager `torch.fx.traceback.preserve_node_meta()` in the user code because CI was unhappy. This can be fixed but with some perseverance.
* This does not work with graph breaks yet. But we can solve that problem, if needed, in a separate PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164678
Approved by: https://github.com/SherlockNoMad, https://github.com/jansel, https://github.com/xmfan
2025-10-08 22:41:00 +00:00
3040a5d294 Revert "[dynamo] Support torch.fx.traceback.annotate (#164678)"
This reverts commit 801e282f39e9ef4424dfd3ecfd2b550a44595229.

Reverted https://github.com/pytorch/pytorch/pull/164678 on behalf of https://github.com/izaitsevfb due to breaks executorch internally, see [D84068062](https://www.internalfb.com/diff/D84068062?entry_point=16) ([comment](https://github.com/pytorch/pytorch/pull/164678#issuecomment-3379281844))
2025-10-08 01:49:34 +00:00
801e282f39 [dynamo] Support torch.fx.traceback.annotate (#164678)
Builds on top of https://github.com/pytorch/pytorch/pull/163673 and https://github.com/pytorch/pytorch/pull/164174. This will be used in the followup PRs to apply regional inductor compilation.

The existing implementation let Dynamo trace into the `torch.fx.traceback.annotate`, but thats not what we want. We want Dynamo to essentially run the torch.fx.traceback.annotate function in eager, so that every Fx node created in Dynamo Fx graph has the custom meta node.

This does not work with graph breaks yet. But we can solve that problem, if needed, in a separate PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164678
Approved by: https://github.com/SherlockNoMad, https://github.com/jansel, https://github.com/xmfan
2025-10-07 14:54:26 +00:00
cfc5cc17dc Revert "[dynamo] Support torch.fx.traceback.annotate (#164678)"
This reverts commit 2883b5ab773daf5861d43ff0b65be49a441ab3f9.

Reverted https://github.com/pytorch/pytorch/pull/164678 on behalf of https://github.com/izaitsevfb due to fails inductor:max_autotune tests internally, see D83948169 ([comment](https://github.com/pytorch/pytorch/pull/164678#issuecomment-3374407009))
2025-10-06 22:03:42 +00:00
2883b5ab77 [dynamo] Support torch.fx.traceback.annotate (#164678)
Builds on top of https://github.com/pytorch/pytorch/pull/163673 and https://github.com/pytorch/pytorch/pull/164174. This will be used in the followup PRs to apply regional inductor compilation.

The existing implementation let Dynamo trace into the `torch.fx.traceback.annotate`, but thats not what we want. We want Dynamo to essentially run the torch.fx.traceback.annotate function in eager, so that every Fx node created in Dynamo Fx graph has the custom meta node.

This does not work with graph breaks yet. But we can solve that problem, if needed, in a separate PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164678
Approved by: https://github.com/SherlockNoMad, https://github.com/jansel, https://github.com/xmfan
2025-10-06 02:59:24 +00:00
1c9987fdf4 [dynamo, 3.14] fix context managers (#163109)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163109
Approved by: https://github.com/anijain2305, https://github.com/mlazos
ghstack dependencies: #161838, #161555, #161839, #163009
2025-09-30 17:42:03 +00:00
8678d831c4 [dynamo] rename set_fullgraph to error_on_graph_break (#161739)
Renaming `set_fullgraph` to `error_on_graph_break` for now. There are no semantic differences yet. In a followup PR, we will introduce a new `torch.compile` option `error_on_graph_break` that has lower priority than `fullgraph` so that `fullgraph` really returns 1 graph.

I could keep `set_fullgraph` as a deprecated alias for `error_on_graph_break` for now, but I'm hoping that won't be necessary since it's still private API (there are no internal callsites yet, and there are no significant OSS callsites yet).

 cc @albanD @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela @mlazos @guilhermeleobas @xmfan as primary users for `set_fullgraph`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161739
Approved by: https://github.com/xmfan, https://github.com/Lucaskabela, https://github.com/anijain2305, https://github.com/mlazos
2025-09-04 01:15:06 +00:00
dbef606631 Add support for tracing vmap in pre-dispatch export (#154650)
Summary: ONNX team and recent transformer upgrade ran into this error and we also ran into during our export benchmarking. This diff makes it possible to trace through vmap implementation in pre-dispatch IR. Note that we don't support serializing functorch ops in pre-dispatch IR and in the future, we should desugar them to post-grad ops.

The implementation strategy is:
1. We add python wrappers around vmap APIs so that we attach custom torch function handler that is only on during non-strict export. The reason is we don't want to add this to default torch_function handler because it will break BC.
2. Some dynamo changes to make sure it picks up new python wrapper APIs. The reason is when we do strict export, we need to re-materialize these APIs in pre-dispatch IR from torch IR. We can avoid this by special casing in dynamo for export to proxy different API calls but i feel that is too much chaos because you need to be able to proxy 2 different variants of same vmap API.

Test Plan: CI

Differential Revision: D75623875

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154650
Approved by: https://github.com/ezyang, https://github.com/zou3519
2025-08-20 19:31:07 +00:00
3faee0a631 Update nullcontext to return input args (#158776)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158776
Approved by: https://github.com/zou3519
2025-08-14 03:02:44 +00:00
f5e2de928b [BE] fix remaining flake8 v7 warnings (#159044)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159044
Approved by: https://github.com/Skylion007
ghstack dependencies: #159043
2025-07-25 02:56:34 +00:00
836bb1941b [hop] support torch.func.functional_call in hop subgraph (#155886)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155886
Approved by: https://github.com/zou3519
2025-06-28 23:47:46 +00:00
dcb8982969 [dynamo] move error_on_graph_break out of config (#156762)
error_on_graph_break doesn't need to be in config, so we move it out. It should make the functorch_maml_omniglot regression less severe.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156762
Approved by: https://github.com/jansel
ghstack dependencies: #154283, #154289, #154782
2025-06-26 21:40:38 +00:00
1b2146fc6d [BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156314
Approved by: https://github.com/jingsh
ghstack dependencies: #156313
2025-06-23 02:57:19 +00:00
5b427c92a8 Revert "[BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314)"
This reverts commit ead741c5fb0036e0fc95b79d4fe1af3a426e1306.

Reverted https://github.com/pytorch/pytorch/pull/156314 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213))
2025-06-22 12:31:57 +00:00
ead741c5fb [BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156314
Approved by: https://github.com/jingsh
ghstack dependencies: #156313
2025-06-22 08:43:18 +00:00
54f1f29fed [dynamo] dynamic gb_type -> static gb_type (#154435)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154435
Approved by: https://github.com/williamwen42
2025-05-28 03:14:26 +00:00
15a3f58f91 Return ConstantVariable(None) from WithExitFunctionVariable.exit to prevent NoneType crash inside autocast exception path (#152503)
Copy of #152013 with PR time benchmarks updated (regressions seem unrelated)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152503
Approved by: https://github.com/anijain2305, https://github.com/Skylion007

Co-authored-by: Witold Dziurdz <wdziurdz@habana.ai>
2025-05-01 04:01:24 +00:00
5b9df57b50 [dynamo] context manager/decorator for dynamo config patching during tracing (#150586)
Implement traceable config patching for Dynamo: enables restricted patching of Dynamo config where user can use a context manager/decorator to change tracing behavior for parts of the code.

The new `dont_skip_tracing` decorator/context manager for ignoring most trace rules is easily implemented with this more generic traceable config patching feature.

Implementation:
- Create a new specialized context manager class representing a wrapper around torch._dynamo.config.patch
- Dynamo doesn't trace into the context manager but updates config at compile time
- Correctness is based on our correctness for handling supported context managers
- Implementation is inspired by how `GradModeVariable` is implemented.

Previous attempts: https://github.com/pytorch/pytorch/pull/148736 (decorator-only global approach) and https://github.com/pytorch/pytorch/pull/149439 (decorator-only traceback approach)

See https://docs.google.com/document/d/1vWNwKL_jpg-PLopifcaSa338wks3GqSVF4GHRguybGg/edit?tab=t.0 for more details on implementation - including previous approaches.

NOTE: this PR fixes a bug where skipped code objects were not tracked by convert_frame.py, leading to cases where code objects would be automatically skipped even after `torch._dynamo.reset()`. This exposed some latent dynamo-wrapped test failures in CI that previously passed in CI but not locally.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150586
Approved by: https://github.com/jansel, https://github.com/zou3519, https://github.com/anijain2305
2025-04-23 09:12:13 +00:00
c312d8c501 [Dynamo] Clean up old torch function flag (#149711)
This is tracked via `SymbolicTorchFunctionState` now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149711
Approved by: https://github.com/StrongerXi, https://github.com/anijain2305
2025-04-21 21:33:58 +00:00
6a3a6d22dc Revert "[dynamo] context manager/decorator for dynamo config patching during tracing (#150586)"
This reverts commit 40ce4fb24a536d175348df876f61956d4945778e.

Reverted https://github.com/pytorch/pytorch/pull/150586 on behalf of https://github.com/clee2000 due to broke some inductor tests? inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_dynamo_bisect [GH job link](https://github.com/pytorch/pytorch/actions/runs/14486513628/job/40635178179) [HUD commit link](40ce4fb24a), bad TD ([comment](https://github.com/pytorch/pytorch/pull/150586#issuecomment-2810064322))
2025-04-16 16:13:47 +00:00
40ce4fb24a [dynamo] context manager/decorator for dynamo config patching during tracing (#150586)
Implement traceable config patching for Dynamo: enables restricted patching of Dynamo config where user can use a context manager/decorator to change tracing behavior for parts of the code.

The new `dont_skip_tracing` decorator/context manager for ignoring most trace rules is easily implemented with this more generic traceable config patching feature.

Implementation:
- Create a new specialized context manager class representing a wrapper around torch._dynamo.config.patch
- Dynamo doesn't trace into the context manager but updates config at compile time
- Correctness is based on our correctness for handling supported context managers
- Implementation is inspired by how `GradModeVariable` is implemented.

Previous attempts: https://github.com/pytorch/pytorch/pull/148736 (decorator-only global approach) and https://github.com/pytorch/pytorch/pull/149439 (decorator-only traceback approach)

See https://docs.google.com/document/d/1vWNwKL_jpg-PLopifcaSa338wks3GqSVF4GHRguybGg/edit?tab=t.0 for more details on implementation - including previous approaches.

NOTE: this PR fixes a bug where skipped code objects were not tracked by convert_frame.py, leading to cases where code objects would be automatically skipped even after `torch._dynamo.reset()`. This exposed some latent dynamo-wrapped test failures in CI that previously passed in CI but not locally.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150586
Approved by: https://github.com/jansel, https://github.com/zou3519, https://github.com/anijain2305
2025-04-16 06:49:58 +00:00
a2f7764507 [Dynamo] Fix the unimplemented_v2 of EventVariable.call_method in ctx_manager.py (#151208)
Changes:
- Field of `explanations` shoule be `str` instead of `tuple`
- Not only `torch.cuda.Event`, but alse `torch.xpu.Event` can trigger this message.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151208
Approved by: https://github.com/Skylion007
2025-04-15 05:26:39 +00:00
97759614c2 [dynamo] reconstruct functions decorated in the compiled region properly (#150645)
We were previously unable to reconstruct functions that were decorated in the compiled region.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150645
Approved by: https://github.com/jansel
2025-04-08 17:32:46 +00:00
1b0a023dde [Dynamo][Misc] Apply typing hints for codegen (#150289)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150289
Approved by: https://github.com/Skylion007, https://github.com/cyyever
2025-04-04 14:26:22 +00:00
34743678b9 [Dynamo] Cleanup state management for ctx managers (#149689)
Removes state indirection for ctx managers. This isn't needed anymore since VTs are mutable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149689
Approved by: https://github.com/StrongerXi
2025-03-21 07:18:33 +00:00
1d3c50fcc5 [Dynamo] Support the torch._C.DisableTorchFunction ctx manager (#149491)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149491
Approved by: https://github.com/StrongerXi
ghstack dependencies: #149489, #149490
2025-03-20 22:19:55 +00:00
f64c361860 [Dynamo] Refactor DisableTorchFunction ctx manager (#149489)
Refactors the DisableTorchFunction ctx manager to properly model the eager code (no args to the context manager).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149489
Approved by: https://github.com/StrongerXi
2025-03-20 22:19:55 +00:00
a3c286677b [compile] Switch off inference mode during compilation (#149321)
PR does following
* Turns `inference_mode` to False and `no_grad` for `convert_frame`, if the inference_mode is on globally.
* Turns off inference_mode for fake tensor prop. This ensures that converting from real inference tensor to a fake tensor removes the inference-ness.
* Graph breaks on is_inference and is_inference_mode_enabled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149321
Approved by: https://github.com/jansel, https://github.com/zou3519
2025-03-19 02:45:27 +00:00
0dcd482e54 [SDPA] Respect sdpa_kernel's priority_order setting in torch.compile (#147768)
[https://github.com/pytorch/pytorch/pull/140467](https://github.com/pytorch/pytorch/pull/140467) added the option to specify a priority order for SDPA but the `torch.compile` path silently ignored this setting as I wasn't aware of the separate context manager handling on `torch.compile`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147768
Approved by: https://github.com/drisspg
2025-03-13 18:52:34 +00:00
fb53e9e514 Add __context/cause/suppress_context/traceback__ to Exception (#146499)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146499
Approved by: https://github.com/zou3519, https://github.com/anijain2305
ghstack dependencies: #146504
2025-03-11 18:55:45 +00:00
79aa17489c [dynamo] ctx_manager.py: replace unimplemented with unimplemented_v2 (#148570)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148570
Approved by: https://github.com/williamwen42
ghstack dependencies: #148454
2025-03-06 07:46:31 +00:00
baba7beed2 [dynamo] add context manager debug information to graph breaks (#147872)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147872
Approved by: https://github.com/zou3519
ghstack dependencies: #147494
2025-02-28 06:23:28 +00:00
3ce352e389 [BE][PYFMT] migrate PYFMT for torch._dynamo to ruff format (#144549)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144549
Approved by: https://github.com/jansel
2025-02-28 03:03:53 +00:00
21c2565f35 Document dynamo (#146736)
Many files in dynamo are currently lacking file/module-level documentation, which makes it hard to know what they do at a glance and without digging into the code. This fixes that.

Note: documentation was AI-generated and could be incorrect, please review carefully.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146736
Approved by: https://github.com/jansel, https://github.com/StrongerXi, https://github.com/anijain2305, https://github.com/zou3519
2025-02-13 00:02:21 +00:00
e2e265e27b [dynamo] Use polyfill to implement comparison operators (#144485)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144485
Approved by: https://github.com/jansel
2025-02-06 17:27:07 +00:00
e68f5087d8 update _unsafe_set_version_counter to accept lists of tensors (#137921)
See the comment [here](https://github.com/pytorch/pytorch/issues/132014#issuecomment-2379547400) (cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @XilunWu @rec) - this PR updates `_unsafe_set_version_counter` to accept a list of tensors, for overhead-sensitive users (e.g. distributed) who need to hide VC bumps from autograd on a large list of tensors without wanting to suffer the overhead of going from python->C++ separately for every tensor in the list.

I left the binding in pybind, and used a `std::vector`. if we **really** need to optimize overhead even further, we could write a manual cpython binding.

I use this updated API in the next PR to fix FSDP2, so that it properly hides the VC of all `all_gather_buffer` tensors in its call to `split_with_sizes_copy.out(all_gather_buffers)`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137921
Approved by: https://github.com/awgu, https://github.com/albanD
2025-02-04 04:51:11 +00:00
15e12d5ec3 [Trace PyDispatcher] Support temporarily_pop_interpreter_stack ctx manager (#146271)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146271
Approved by: https://github.com/zou3519
ghstack dependencies: #146270
2025-02-03 21:47:54 +00:00
1185b81c51 Revert "[dynamo] Use polyfill to implement comparison operators (#144485)"
This reverts commit d1f82de2bf4ce4d4461791a9c9b2e759202db0bb.

Reverted https://github.com/pytorch/pytorch/pull/144485 on behalf of https://github.com/huydhn due to This seems to break dynamo tests in trunk after landing ([comment](https://github.com/pytorch/pytorch/pull/144485#issuecomment-2622893294))
2025-01-29 21:30:42 +00:00
d1f82de2bf [dynamo] Use polyfill to implement comparison operators (#144485)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144485
Approved by: https://github.com/jansel
2025-01-29 17:37:40 +00:00
5a4d959cdb [dynamo] Properly model torch profiler context objects (#145537)
Prior to this patch, Dynamo conveniently modelled torch profiler context
objects (e.g., `torch.profiler.profile`) as `NullContextVariable`
because `torch.compile` ignore the effect of these profiler contexts.

However, the semantics of these profiler contexts diverges from
`contextlib.nullcontext` in the `__enter__` function, where the former
returns `self` and the latter returns `None`. This causes subtle error
as observed in #125021.

This patch adds back a `ProfilerContextVariable`, which addresses the
aforementioned semantic discrepency.

Fixes #125021.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145537
Approved by: https://github.com/zou3519, https://github.com/williamwen42
2025-01-28 00:03:36 +00:00
a79100ab11 PEP585 update - torch/_dynamo (#145105)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145105
Approved by: https://github.com/bobrenjc93
2025-01-18 20:47:11 +00:00
78425bff30 [FSDP2] Move to public torch.distributed.fsdp (#141868)
**Overview**
This PR moves `torch/distributed/_composable/fsdp` to `torch/distributed/fsdp/_fully_shard` and makes public APIs available from `torch.distributed.fsdp`, e.g.:
```
from torch.distributed.fsdp import fully_shard
```
This is targeting 2.6 release. I rewrote some of the documentation with (hopefully) improved phrasing.

**Changes for Reland**
- Preserved the public objects from `torch/distributed/_composable/fsdp/fully_shard.py` so that the import path still works internally
- Added a unit test that we can do `from torch.distributed._composable.fsdp.fully_shard import FSDPModule`

Differential Revision: [D66890387](https://our.internmc.facebook.com/intern/diff/D66890387)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141868
Approved by: https://github.com/kwen2501, https://github.com/wconstab, https://github.com/weifengpy, https://github.com/fegin, https://github.com/XilunWu

Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
2024-12-07 01:24:28 +00:00
bab15df40a Revert "[FSDP2] Move to public torch.distributed.fsdp (#141868)"
This reverts commit 45583a5df907a7948693c047e5fe2c8349622069.

Reverted https://github.com/pytorch/pytorch/pull/141868 on behalf of https://github.com/atalman due to failing internally ([comment](https://github.com/pytorch/pytorch/pull/141868#issuecomment-2523925180))
2024-12-06 18:38:12 +00:00
45583a5df9 [FSDP2] Move to public torch.distributed.fsdp (#141868)
**Overview**
This PR moves `torch/distributed/_composable/fsdp` to `torch/distributed/fsdp/_fully_shard` and makes public APIs available from `torch.distributed.fsdp`, e.g.:
```
from torch.distributed.fsdp import fully_shard
```
This is targeting 2.6 release. I rewrote some of the documentation with (hopefully) improved phrasing.

**Follow-Ups**
- [x] Add some explanation in the docs about FSDP1 vs. FSDP2
- [ ] Move unit tests from `test/distributed/_composable/fsdp` to `test/distributed/fsdp/fully_shard/`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141868
Approved by: https://github.com/kwen2501, https://github.com/wconstab, https://github.com/weifengpy

Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
2024-12-05 03:04:01 +00:00
12e95aa4ee [BE]: Apply PERF401 autofixes from ruff (#140980)
* Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables.
* list comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information etc and are better for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-11-20 17:52:07 +00:00
e41dffbedd [Dynamo] Trace enter/exit of TorchFunctionModes (#135422) (#137114)
This PR implements tracing of with contexts with TorchFunction modes which have the default enter/exit behavior (ie pushing/popping the mode)

Typically the bytecode for a context manager looks like this during a graph break:
1. graph call
2. enter context
3. unsupported code
4. exit context
5. resume call

resume fn structure:
1. enter context
2. jump
...
3. exit context

The issue with torch function modes is that side effects will replay any mutations to the torch function stack performed during tracing. So, we do not need to enter and exit around the unsupported code in the original function (doing so would result in a duplicate torch function mode entry during execution of the unsupported code), and we don't need to enter again in the resume function (the mode that was pushed from the side effects bytecode would still be on the stack).

So for torch function modes the structure of our output code is this:

1. graph call
2. mutate tf mode stack to replay mutations
4. unsupported code
5. on exception restore stack
6. resume function

Then our resume fn looks like this:

1. no-op enter torch function mode
2. jump
3.  exit tf mode

To implement the no-op enter of the torch function mode I added torch function mode in polyfill which no-op enters, but normally exits. This is needed because we still want to trace the with context in the resume function, and exit properly (the exit instructions will still be in the function, so we need to generate instructions to set up the context).

Separately from the bytecode, dynamo also tracks contexts on the block stack, which is how the SETUP_* instructions are implemented. Naturally at a graph break, we exit these block stacks to properly reset the contexts entirely, so that we can re-enter around the unsupported code soundly. However once again, in the torch function mode case, in the event of a graph we do not want to perform any exit side effects because we want to preserve the state of the mode stack as is so that we will properly update the stack with bytecode mentioned in the first section. If we exited here, dynamo would pop the mode off of the symbolic stack, and not update the true python torch function mode stack with the suffix bytecode. All in all, for torch function modes we enter exactly once, update the global torch function mode stack with side effects bytecode, re-read this stack when compiling the resume function, and exit exactly once in the resume function. This matches the semantics of eager exactly.
Approved by: https://github.com/williamwen42
ghstack dependencies: #134732, #133137, #135443, #135444

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137114
Approved by: https://github.com/yanboliang
2024-10-09 02:29:40 +00:00