Summary:
The changes in this diff:
- allow Minimizer subclasses to override the default shape propagation logic with custom logic (sketched below)
- copy over the `meta` attribute on `get_attr` graph nodes during the graph splitting step
- for both changes, the behavior of existing classes does not change
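A minimal sketch of the first change, assuming the override point is exposed as a method on the subclass; the hook name `run_shape_prop` here is illustrative, not necessarily the exact name added by this diff:
```
import torch
from torch.fx.passes.net_min_base import _MinimizerBase
from torch.fx.passes.shape_prop import ShapeProp


class MyMinimizer(_MinimizerBase):
    def run_shape_prop(self, gm: torch.fx.GraphModule, *sample_input):
        # Custom shape propagation goes here; the base class keeps the
        # ShapeProp default, so existing subclasses are unaffected.
        ShapeProp(gm).propagate(*sample_input)
```
The second change is conceptually `new_node.meta = old_node.meta.copy()` applied to `get_attr` nodes when the split submodules are built.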
Test Plan: CI
Differential Revision: D70799942
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148784
Approved by: https://github.com/blaine-rister
Summary:
During oncall, I debugged an issue where the error message was ambiguous due to multiple colons and the full line being cut off:
```
AssertionError: Expected order: 1 for the component: remote_request_only to be >= 2, the max order for all its
```
Update the error message to something like:
```
AssertionError: Component remote_request_only order must be >= max order of its upstream components, got component order=1 and max=2
```
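A minimal sketch of the reworded assertion; the function and variable names are illustrative:
```
def check_component_order(name: str, order: int, max_upstream_order: int) -> None:
    # A single colon and the key values at the end, so truncation loses less.
    assert order >= max_upstream_order, (
        f"Component {name} order must be >= max order of its upstream "
        f"components, got component order={order} and max={max_upstream_order}"
    )
```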
Test Plan: CI
Differential Revision: D69482789
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146934
Approved by: https://github.com/ColinPeppler
Summary:
1. Added more detail to some of the assert statements.
2. Moved assert statements to use `tgif_assert` (a sketch of such a helper follows).
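`tgif_assert` is an internal helper; a minimal sketch of an assert wrapper of this shape, with illustrative names and values:
```
def tgif_assert(condition: bool, message: str) -> None:
    # Centralizing asserts gives every failure a consistent, detailed message.
    if not condition:
        raise AssertionError(f"[tgif] {message}")


outputs, expected = [1, 2], [1, 2]
tgif_assert(
    len(outputs) == len(expected),
    f"output count mismatch: got {len(outputs)}, expected {len(expected)}",
)
```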
Test Plan: all unit tests should pass
Reviewed By: jingsh
Differential Revision: D67608251
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143771
Approved by: https://github.com/jingsh
For graphs with a single output, a graph_module produced by torch.export / torch.compile is expected to return a single torch.Tensor rather than a tuple.
However, after running the `_SplitterBase` partitioner on such a graph_module, the resulting graph module returns a tuple of tensors, in this case `(output,)`.
This PR adds codegen to the graphs produced by the `_SplitterBase` partitioner. Setting this ensures pytree unflatten nodes are added automatically to unflatten the output and return single outputs directly.
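A minimal sketch of the expected single-output behavior (splitter construction is elided, since `_SplitterBase` also needs an operator-support object):
```
import torch


class M(torch.nn.Module):
    def forward(self, x):
        return x + 1  # single output


gm = torch.export.export(M(), (torch.randn(2),)).module()
out = gm(torch.randn(2))
assert isinstance(out, torch.Tensor)  # a tensor, not (tensor,)
# Before this PR, a module produced by the _SplitterBase partitioner returned
# `(out,)` here; with the codegen carried over, it returns `out` directly.
```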
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120361
Approved by: https://github.com/angelayi
Summary:
Add Runtime Constant-folding for AOTInductor.
This also includes invoking constant folding at load time.
The constant folding lowering is a 2-step process.
First, we split the graph into 2 modules. One of them is the constant module, which doesn't depend on any input, so the whole module can be inferred (constant-folded) once and reused. The constant module is lowered and codegen-ed as usual and cached (let's call this the constant code). The constant code reuses the whole lowering/profiling/etc. process; the only difference is that we do not generate any headers or initialization for the constant code.
Second, after handling the constant module, we take care of the main module (the part that depends on the user input). For the main module, compared with a normal lowering, we take in one additional component: the constant code. The additional step here is that we inject the constant code into the codegen-ed main module and create a caller for the main module to consume the result of the constant module.
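A conceptual sketch of the first step (not AOTInductor's actual implementation): the constant module is exactly the set of nodes with no transitive dependency on a placeholder:
```
from typing import Set

import torch
from torch import fx


def constant_foldable_nodes(gm: fx.GraphModule) -> Set[fx.Node]:
    # Graph nodes iterate in topological order, so one forward pass suffices.
    depends_on_input: Set[fx.Node] = set()
    for node in gm.graph.nodes:
        if node.op == "placeholder" or any(
            arg in depends_on_input for arg in node.all_input_nodes
        ):
            depends_on_input.add(node)
    return {
        n
        for n in gm.graph.nodes
        if n not in depends_on_input and n.op not in ("placeholder", "output")
    }
```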
Test Plan: Unit tests included in commit.
Differential Revision: D53274382
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118765
Approved by: https://github.com/chenyang78
Summary: Currently `split_by_tags` determines submodule output order by iterating over `used_in_main`. Since this is a `Set`, insertion order is not retained, so we run into problems with the submodule output order being "randomized" and inconsistent between splits. By using `Dict[Node, None]` we can implement `used_in_main` as an ordered set, so that output order is consistent when splitting the same model.
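A minimal sketch of the ordered-set trick, with strings standing in for `fx.Node`:
```
from typing import Dict

# dict preserves insertion order (guaranteed since Python 3.7), so
# Dict[T, None] behaves like an ordered set.
used_in_main: Dict[str, None] = {}
for name in ["conv", "relu", "conv"]:  # duplicates collapse, as in a set
    used_in_main.setdefault(name)

assert list(used_in_main) == ["conv", "relu"]  # deterministic output order
```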
Test Plan: CI
Differential Revision: D39039268
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84136
Approved by: https://github.com/houseroad
Summary: If the model contains a ModuleList, it's possible that we get some of the weight attributes as `module.sub.0.weight`. `getattr` doesn't work in this case, and we have a dedicated function `getattr_recursive` for that. Just use that.
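A minimal sketch of the recursive lookup: plain `getattr` cannot resolve a dotted path like `sub.0.weight`, so the path is walked one attribute at a time (`nn.Module.__getattr__` resolves the `"0"` segment of a ModuleList):
```
import torch


def getattr_recursive(obj, qualified_name: str):
    for attr in qualified_name.split("."):
        obj = getattr(obj, attr)
    return obj


m = torch.nn.Module()
m.sub = torch.nn.ModuleList([torch.nn.Linear(4, 4)])
w = getattr_recursive(m, "sub.0.weight")  # getattr(m, "sub.0.weight") raises
```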
Reviewed By: houseroad
Differential Revision: D37326955
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80011
Approved by: https://github.com/houseroad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73519
Move `getattr_recursive()` and `setattr_recursive()` to fx main.
Test Plan: CI
Reviewed By: khabinov
Differential Revision: D34524723
fbshipit-source-id: a656e821d9dc1d446aa80cdc03a923bf0c05aeb5
(cherry picked from commit 4835965ac72d299487be14687823ea62394f4079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57280
We've found an issue where a fusion group can result in a circular dependency. For example:
```
a -> b -> c -> d
|              ^
+--------------+
```
Only `a` has a non-tensor output, and currently we would create a fusion group (a, b, d). This results in a circular dependency, because now the fusion group depends on c, while c depends on the fusion group as well.
This diff implements the solution discussed before: when we add a node to a fusion group, we also add all the nodes that lie between the fusion group and the newly added node (sketched below). The minimizer uses the same logic to build fusion groups.
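A conceptual sketch (not the exact code in this diff) of finding the "middle" nodes: an ancestor of the new node must join the group iff it transitively depends on a node already in the group:
```
from typing import Dict, Set

from torch import fx


def nodes_between(group: Set[fx.Node], new_node: fx.Node) -> Set[fx.Node]:
    memo: Dict[fx.Node, bool] = {}

    def depends_on_group(n: fx.Node) -> bool:
        if n not in memo:
            # Evaluate all inputs (no short-circuit) so every ancestor of
            # new_node gets classified.
            results = [depends_on_group(i) for i in n.all_input_nodes]
            memo[n] = n in group or any(results)
        return memo[n]

    for inp in new_node.all_input_nodes:
        depends_on_group(inp)
    return {n for n, dep in memo.items() if dep and n not in group}
```
In the example above, adding `d` to the group `{a, b}` pulls in `c`, which removes the cycle.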
Test Plan: split_tests and net_min_tests
Reviewed By: khabinov
Differential Revision: D27917432
fbshipit-source-id: a3d99fe5929dbc9f8eb0f45bccd83fd7b173795a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56201
Refactor Splitter and Minimizer into the superclasses `_SplitterBase` and `_MinimizerBase` and move them to OSS. This is needed to create an OSS example of GPU lowering with these tools (a subclassing sketch follows).
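A minimal sketch of subclassing the OSS base classes; the operator-support rule (accepting only `torch.relu`) is purely illustrative, and constructor details may differ between versions:
```
import torch
from torch.fx.passes.operator_support import OperatorSupportBase
from torch.fx.passes.splitter_base import _SplitterBase, _SplitterSettingBase


class ReluOnlySupport(OperatorSupportBase):
    def is_node_supported(self, submodules, node: torch.fx.Node) -> bool:
        return node.op == "call_function" and node.target is torch.relu


class MySplitter(_SplitterBase):
    pass  # a real backend would customize the tagging / lowering hooks here


def split(gm: torch.fx.GraphModule, sample_input):
    splitter = MySplitter(gm, sample_input, ReluOnlySupport(), _SplitterSettingBase())
    return splitter()  # a module split into supported / unsupported submodules
```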
Test Plan: CI
Reviewed By: jackm321
Differential Revision: D27629598
fbshipit-source-id: 0d4da02105ca509b31f1a6c4a39b1122c2bc7bf0