Summary: When performing constant folding, we must skip over operators that have symbolic `fill_value`.
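A minimal sketch of how such operators could be excluded, assuming the existing `skip_folding_node_fn` hook of `split_const_subgraphs` is used and that the symbolic value arrives as a `fill_value` kwarg (both assumptions, not the exact predicate in this diff):
```
import torch
from torch.fx.experimental.const_fold import split_const_subgraphs

def skip_symbolic_fill_value(node: torch.fx.Node) -> bool:
    # Skip folding any node whose fill_value is a symbolic SymInt/SymFloat/SymBool.
    fill_value = node.kwargs.get("fill_value", None)
    return isinstance(fill_value, (torch.SymInt, torch.SymFloat, torch.SymBool))

# folded = split_const_subgraphs(traced_module, skip_folding_node_fn=skip_symbolic_fill_value)
```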
Test Plan:
CI
Rollback Plan:
Reviewed By: kalpit-meta-1
Differential Revision: D80965936
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161437
Approved by: https://github.com/StellarrZ
When replacing placeholders with `get_attr` nodes during constant folding, the argument name and the parameter name can mismatch; in fact, there is no guarantee that the parameter name is the same as the argument name used in the module call.
Differential Revision: D72415970
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150692
Approved by: https://github.com/jfix71
Summary: Previously the `_inline_module` helper function only worked with submodules whose inputs were specified as positional args. This diff updates the utility to first look up input arguments in the submodule call's kwargs using placeholder node names, and fall back to the list of positional args if a node name is not found.
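A rough sketch of the new lookup order, with illustrative names (the actual helper lives in `const_fold.py`):
```
def _resolve_placeholder_input(placeholder_name, call_args, call_kwargs, arg_index):
    # Prefer a kwarg keyed by the placeholder node's name...
    if placeholder_name in call_kwargs:
        return call_kwargs[placeholder_name]
    # ...otherwise fall back to the positional arg at the same index.
    return call_args[arg_index]
```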
Test Plan:
```
buck2 run @//mode/{opt,mtia,inplace} //glow/fb/fx/fba/tests:test_fba_inductor -- -r test_connected_fusions
```
Differential Revision: D63347675
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136631
Approved by: https://github.com/jfix71
Fixes #124550
Also moves the `graph.eliminate_dead_code()` call to a few lines after
`_inline_module(...)` in `const_fold.py`
* Test plan:
Add a new test in `test_eager_transforms.py` to ensure the reported
issue was indeed fixed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133364
Approved by: https://github.com/zou3519
Summary: Improve the fp32-to-fp16 conversion FX pass to insert an explicit to-dtype node and rely on const folding, instead of converting the weights in place.
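A minimal sketch of the node-insertion idea under these assumptions: the pass only rewrites `get_attr` nodes, casts everything to fp16, and does not check the original dtype (a real pass would):
```
import torch
import torch.fx

def insert_fp16_casts(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    for node in list(gm.graph.nodes):
        if node.op != "get_attr":
            continue
        with gm.graph.inserting_after(node):
            cast = gm.graph.call_method("to", args=(node, torch.float16))
        # Redirect every consumer of the original attr (except the cast itself) to the cast;
        # const folding can later bake the cast into the folded constant.
        node.replace_all_uses_with(cast, delete_user_cb=lambda user: user is not cast)
    gm.recompile()
    return gm
```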
Test Plan:
```
buck2 test @//mode/{opt,inplace} //glow/fb/fx/fba/tests:test_fba_pass_manager_builder
```
Differential Revision: D57803843
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127829
Approved by: https://github.com/Skylion007
Summary:
The chunk/split ops on weights/constants are folded in an FX pass, and each output tensor keeps the same storage size as the original tensor (3x its actual size in the case of chunk(3)). However, the backend calculates the on-device memory size from the tensor's shape/stride/dtype. This causes a mismatch when copying weights/constants to the device: the memory allocated on the device is always smaller than the weights/constants, resulting in a runtime error while loading them (T172125529).
This diff fixes the issue by cloning the tensors after const folding so that they have the correct storage size.
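A minimal sketch of the idea; the check and function name here are assumptions, not the exact code in the diff:
```
import torch

def compact_storage(t: torch.Tensor) -> torch.Tensor:
    # A chunk/split output is a view whose storage still covers the whole original
    # tensor; cloning materializes a tensor whose storage matches its own shape.
    if t.untyped_storage().nbytes() > t.numel() * t.element_size():
        return t.clone()
    return t
```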
Test Plan:
Before this change: (18432 = 48 * 64 * 2 * 3)
```
RuntimeError: Failed to load constant getitem_idx0 split (remaining=18432) at fbcode/caffe2/torch/fb/acc_runtime/afg/afg_bindings.cpp:3422: Request failed because an invalid parameter
```
```
buck2 run mode/opt //caffe2/torch/fb/acc_runtime/afg/tests:test_operators-artemis -- -r test_mem_size_mismatch
```
```
Ran 1 test in 7.048s
OK
```
Reviewed By: jfix71
Differential Revision: D56663931
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125199
Approved by: https://github.com/jfix71
Fixes https://github.com/pytorch/pytorch/issues/118129
Suppressions automatically added with
```
import re

# Parse mypy output lines of the form "path:line:col: error: ... [error-code]".
with open("error_file.txt", "r") as f:
    errors = f.readlines()

error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append suppressions bottom-up so earlier line numbers stay valid.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f" # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Co-authored-by: Catherine Lee <csl@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
which were reverted due to a conflict with the internal source repo.
Mostly fixes for PEP-484 violations (i.e. cases where a default arg is set to None but the type is not annotated as Optional).
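A minimal illustration of the violation pattern being fixed (hypothetical function):
```
from typing import Optional

# Before: `def lookup(key: str, default: str = None)` violates PEP 484.
# After: the None default is reflected in the annotation.
def lookup(key: str, default: Optional[str] = None) -> Optional[str]:
    ...
```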
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in follow-up PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` to squash the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
Summary:
The current behavior of the owning_module setter is difficult to understand: it sets owning_module to None whenever the owners count is nonzero, yet still increments the owners count. If owning_module is None, the owners count should be 0, since no owner is reachable; conversely, if the owners count can grow, owning_module should be a collection (e.g. a list).
This diff changes owning_module to be a normal attribute. The semantics are that a graph has **at most one** owning module and can be reassigned to a new module.
The alternative is to use a list to represent the owning modules of a graph, but that breaks backward compatibility, and the exact use cases for multiple owning modules are unclear.
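A small sketch of the new semantics; the module construction is illustrative and relies on `GraphModule` setting `owning_module` when a graph is assigned to it:
```
import torch
import torch.fx

g = torch.fx.Graph()
x = g.placeholder("x")
g.output(x)

gm_a = torch.fx.GraphModule(torch.nn.Module(), g)  # g.owning_module is now gm_a
gm_b = torch.fx.GraphModule(torch.nn.Module(), g)  # plain attribute: ownership simply
assert g.owning_module is gm_b                      # transfers to the latest owner
```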
Test Plan: Test with CI.
Differential Revision: D40200624
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86822
Approved by: https://github.com/tugsbayasgalan
Summary:
This is an un-backout, but with a small change: the default device is now `device_for_folded_attrs="cuda"` instead of `"cpu"`, which should avoid BC issues for TRT lowering.
Original commit changeset: 4ae1863e28ff
Original Phabricator Diff: D37192230 (24c2aff1b2)
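For reference, the device can still be set explicitly at the call site (the traced module name below is illustrative):
```
from torch.fx.experimental.const_fold import split_const_subgraphs

# Folded constants are materialized on this device; the default is now "cuda".
folded = split_const_subgraphs(traced, device_for_folded_attrs="cpu")
```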
Test Plan: Added unit test
Differential Revision: D37205432
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79696
Approved by: https://github.com/dborkovic
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68614
We need to copy modules over to the `split` graph during const folding. We were previously only doing so from the non-constant submodule, but we need to do it for the constant one as well, in case a `call_module` node is const folded.
Test Plan: Added unit test
Reviewed By: wushirong, 842974287
Differential Revision: D32543289
fbshipit-source-id: 80d1d0ce2c18a665b00e1343d6c55d939390ab10
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67784
The FX model generates quant/dequant layers for INT8 explicit-mode support. However, if the inputs of a quant/dequant layer are constant, the layer is put into the constant subgraph and optimized out, and TensorRT then fails to parse the leftover graph. It is better to provide an optional function (`skip_folding_node_fn`) that lets `split_const_subgraphs` skip folding such nodes.
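A usage sketch of the new hook; the predicate and target set below are illustrative and depend on how quant/dequant appear in the traced model:
```
import torch
from torch.fx.experimental.const_fold import split_const_subgraphs

QDQ_TARGETS = {torch.quantize_per_tensor, torch.dequantize}

def skip_qdq(node: torch.fx.Node) -> bool:
    # Keep quant/dequant nodes out of the folded subgraph so TensorRT still sees them.
    return node.op == "call_function" and node.target in QDQ_TARGETS

folded = split_const_subgraphs(traced, skip_folding_node_fn=skip_qdq)
```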
Reviewed By: jfix71
Differential Revision: D32076970
fbshipit-source-id: 7dcbb4f02386f8c831d09a2f0e40bcdba904471c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65933
We use `split_module` to split the input model we want to const fold into const and non-const subgraphs. Previously we took the non-const graph and tried to hack it back into the same signature as the input model, but this was complex and buggy.
Instead, refactor to keep using the base split module that contains both the const and non-const graphs. This means we:
- Inline the non-const graph into the split module
- Remove the const graph from the module and replace it with a getattr that will be populated when we `run_folding` (see the usage sketch after this list)
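A minimal usage sketch of the resulting API (the model and input names are illustrative):
```
import torch
from torch.fx.experimental.const_fold import split_const_subgraphs

traced = torch.fx.symbolic_trace(model)   # `model` is any nn.Module
folded = split_const_subgraphs(traced)    # returns a FoldedGraphModule
folded.run_folding()                      # run the const subgraph once and stash results as attrs
out = folded(example_input)               # the non-const graph now reads the folded attrs
```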
Test Plan: Added test coverage to cover newly supported folding, and updated other tests for new strategy.
Reviewed By: yinghai
Differential Revision: D31293307
fbshipit-source-id: 6e283a8c7222cf07b14e30e74dffc8ae5ee8b55f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64970
Add a helper function to create a unique name for an attr.
This can be used when we want to add a weight to a module.
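A rough sketch of the approach, with an illustrative name (not the exact helper added here):
```
import re

def make_unique_attr_name(module, candidate: str) -> str:
    # Sanitize to a valid Python identifier, then bump a numeric suffix until unused.
    name = re.sub(r"[^0-9a-zA-Z_]+", "_", candidate)
    if name[0].isdigit():
        name = f"_{name}"
    base, idx = name, 0
    while hasattr(module, name):
        idx += 1
        name = f"{base}_{idx}"
    return name
```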
Test Plan: run CI.
Reviewed By: jfix71
Differential Revision: D30921497
fbshipit-source-id: 598569d107df8b516ff12920a4bef3a42577e987
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65223
If there are unused inputs, they won't appear in `submod_1`. We need to add all the unused inputs so that the model after const folding has the same inputs as the original model.
Reviewed By: jfix71
Differential Revision: D31021217
fbshipit-source-id: b7452c90d133b747e0699936a81d3fee14af9cc9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65159
This was a legitimate fix originally introduced in D30905949 (446d95a7f6), but we hesitated and removed it for some reason. Putting it back.
Reviewed By: 842974287
Differential Revision: D30996277
fbshipit-source-id: 3f5eede11dba2072e7cd5ae6ca7ac81d55fb75fa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64945
In the const folding pass, we try to create `get_attr` nodes in submod_1 for `get_attr` nodes that are in the main graph, but submod_1 does not actually hold those attributes. To fix this, we assign the main module as the owning module of submod_1's graph.
That fix alone would cause problems for `call_module` nodes in submod_1, because during the split, submodules get inlined into submod_1 (targets change from "mod.a.b" to "mod_a_b"); changing the owning module would leave those `call_module` nodes unable to find the module they refer to. To fix this, we set the target module on the main module.
Reviewed By: jfix71
Differential Revision: D30905949
fbshipit-source-id: cd67bc8fe4b8ad4344ae97b8e36753fdce3ece6d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64782
Previously, `get_attr` nodes that were added to the graph did not retain `node.meta` after folding. Add such support, and improve coverage in general here.
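The usual FX idiom for carrying metadata onto a replacement node looks roughly like this (illustrative helper, not the exact code in the diff):
```
import torch.fx

def replace_with_get_attr(graph: torch.fx.Graph, node: torch.fx.Node, attr_name: str) -> torch.fx.Node:
    with graph.inserting_before(node):
        new_node = graph.get_attr(attr_name)
    new_node.meta = node.meta.copy()  # preserve shape/dtype/stack-trace info across the rewrite
    node.replace_all_uses_with(new_node)
    graph.erase_node(node)
    return new_node
```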
Test Plan: Added test coverage.
Reviewed By: protonu
Differential Revision: D30852704
fbshipit-source-id: ece87a61c69b2e68982964c6adc4dde14dae12c7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64342
Previously we weren't handling the case where an attribute was in a module that wasn't the root.
Test Plan: Added unit test coverage.
Reviewed By: yinghai
Differential Revision: D30691730
fbshipit-source-id: b39b5cf748c4c882f315a4f32b51ad88cc7a43ed
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48443
Add a constant folding pass in FX:
- Iterate over the input graph and tag which nodes are fully constant, i.e. either `get_attr` nodes, or nodes whose inputs are all either `get_attr` or constant (see the sketch after this list)
- Use `model_transform.split_by_tags()` to split the graph into two
- Look for the `output` node in the constant graph to get the names of attrs that will be folded
- Iterate over the non-constant graph and replace placeholders that use the same names as those attrs with a `get_attr`, as well as a dummy attr on the module
- Return these two graphs in a new `FoldedGraphModule`, which is a normal GraphModule that also stores the constant graph on the side, along with a `run_folding()` method that runs const folding and updates the dummy parameters with the actual folded parameters
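A minimal sketch of the tagging step from the first bullet, assuming a plain `torch.fx.Graph`:
```
import torch.fx

def tag_constant_nodes(graph: torch.fx.Graph):
    """Return the set of nodes whose values depend only on get_attr inputs."""
    const_nodes = set()
    for node in graph.nodes:
        if node.op == "get_attr":
            const_nodes.add(node)
        elif node.op not in ("placeholder", "output") and all(
            inp in const_nodes for inp in node.all_input_nodes
        ):
            const_nodes.add(node)
    return const_nodes
```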
Test Plan: Added a couple tests
Reviewed By: 842974287
Differential Revision: D25033996
fbshipit-source-id: 589c036751ea91bb8155d9be98af7dbc0552ea19