* Automatically applies ruff rule 401, turning for loops into equivalent list comprehensions, which are faster and do not leak the loop variables into the enclosing scope.
* List comprehensions not only often type-check better, but also cut 50+% of the per-iteration overhead of a for loop with append. They also carry a length hint and are easier for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt.
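A hypothetical example of the kind of rewrite this rule applies (not taken from the actual diff):

```python
# Before: a loop that appends into a list. The loop variable v leaks
# into the enclosing scope after the loop ends.
def squares_loop(values):
    result = []
    for v in values:
        result.append(v * v)
    return result

# After: an equivalent list comprehension. It avoids the repeated
# append method lookups and keeps v scoped to the comprehension.
def squares_comprehension(values):
    return [v * v for v in values]

assert squares_loop([1, 2, 3]) == squares_comprehension([1, 2, 3]) == [1, 4, 9]
```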
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
Fixes https://github.com/pytorch/pytorch/issues/118129
Suppressions automatically added with
```
import re

with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Map each file to {line_number: error_type} for every mypy error.
error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append a suppression comment to each offending line.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
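For illustration, here is the script's regex run against a sample line in the format mypy emits when error codes are shown; the file path and line number are made up:

```python
import re

# A sample mypy error line (path and position are hypothetical).
sample = "torch/utils/foo.py:42:9: error: Incompatible return value type [return-value]"

match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", sample)
assert match is not None

file_path, line_number, error_type = match.groups()
assert (file_path, int(line_number), error_type) == (
    "torch/utils/foo.py", 42, "return-value"
)
```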
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Co-authored-by: Catherine Lee <csl@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
Applies PLW0108, which removes unnecessary lambda wrappers in Python. The rule is still in preview, so it is not ready to be enabled by default just yet; these are the autofixes from the rule.
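An illustrative (hypothetical) before/after for PLW0108: a lambda that only forwards its argument to another callable can be replaced by that callable directly.

```python
data = [3, -1, 2]

# Before: unnecessary lambda wrapper around abs.
wrapped = sorted(data, key=lambda x: abs(x))

# After: the autofixed form passes the callable itself.
direct = sorted(data, key=abs)

assert wrapped == direct == [-1, 2, 3]
```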
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113602
Approved by: https://github.com/albanD
Applies the remaining flake8-comprehensions fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, more performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the set call.
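Hypothetical examples of the rewrites flake8-comprehensions suggests (the actual call sites changed in the PR are not reproduced here):

```python
items = ["a", "b", "a"]

# C401: unnecessary generator inside set() -> set comprehension; in
# the trivial identity case, just the set() call itself.
assert set(x for x in items) == {x for x in items} == set(items)

# C402: unnecessary generator inside dict() -> dict comprehension.
assert dict((x, len(x)) for x in items) == {x: len(x) for x in items}

# C400: unnecessary generator inside list() -> list comprehension.
assert list(x.upper() for x in items) == [x.upper() for x in items]
```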
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60370
When creating a single partition, skip the output nodes but process any nodes that come after them.
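A minimal sketch of the traversal fix, using plain dicts as stand-ins for FX graph nodes (the real code operates on torch.fx nodes):

```python
def collect_partition_nodes(nodes):
    """Skip nodes whose op is "output", but keep scanning the nodes
    that follow them instead of stopping at the first output node."""
    partition = []
    for node in nodes:
        if node["op"] == "output":
            continue  # skip, but do not stop the scan
        partition.append(node["name"])
    return partition

nodes = [
    {"name": "x", "op": "placeholder"},
    {"name": "out", "op": "output"},
    {"name": "post", "op": "call_function"},
]
assert collect_partition_nodes(nodes) == ["x", "post"]
```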
Test Plan: Run all CI tests.
Reviewed By: jfix71
Differential Revision: D29265278
fbshipit-source-id: 2242009973a54498d8027cce5a294558a1206fdf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60064
This implements a host saturation optimization to maximize the utilization of the available devices.
It uses a greedy heuristic to replicate all partitions on the used devices to another set of idle devices with enough memory.
The added unittest shows an example as follows:
```
partition_0: 192 bytes; partition_1: 48 bytes
dev_0: 200 bytes, [partition_0]
dev_1: 200 bytes, [partition_1]
dev_2: 100 bytes,
dev_3: 100 bytes,
dev_4: 200 bytes,
dev_5: 100 bytes
```
Before host saturation, `partition_0` is assigned to dev_0 and `partition_1` is assigned to dev_1.
After host saturation, `partition_0` is replicated to dev_4 simply because it is the only device that can hold all partitions on dev_0. `partition_1` is replicated to dev_2 because it has the smallest memory that is still large enough to hold all partitions on dev_1.
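A minimal sketch of the greedy heuristic described above; the function name and data shapes are assumptions, not the actual fx.experimental API:

```python
def saturate_host(used, idle):
    """used: list of (device_name, total_partition_bytes)
    idle: list of (device_name, available_bytes)
    Returns a list of (source_device, target_device) replications."""
    replications = []
    remaining = sorted(idle, key=lambda d: d[1])  # smallest memory first
    for src_name, needed in used:
        for i, (tgt_name, avail) in enumerate(remaining):
            if avail >= needed:  # minimal but large enough device
                replications.append((src_name, tgt_name))
                del remaining[i]
                break
    return replications

# Mirrors the unittest scenario from the summary above.
used = [("dev_0", 192), ("dev_1", 48)]
idle = [("dev_2", 100), ("dev_3", 100), ("dev_4", 200), ("dev_5", 100)]
assert saturate_host(used, idle) == [("dev_0", "dev_4"), ("dev_1", "dev_2")]
```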
Test Plan:
```
buck test mode/opt //caffe2/test:test_fx_experimental -- --exact 'caffe2/test:test_fx_experimental - test_saturate_host (test_fx_experimental.TestFXExperimental)'
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8444249343103429
✓ ListingSuccess: caffe2/test:test_fx_experimental - main (1.322)
✓ Pass: caffe2/test:test_fx_experimental - test_saturate_host (test_fx_experimental.TestFXExperimental) (1.322)
Summary
Pass: 1
ListingSuccess: 1
```
An e2e test will be added to `test_fx_glow.py` in a followup diff.
Reviewed By: gcatron
Differential Revision: D29039998
fbshipit-source-id: 57518aadf668f7f05abd6ff73224c16b5d2a12ac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60056
Previously we put the whole graph as a single partition onto the device with the maximum memory if possible, but the code assumed that the first logical device always has the maximum memory.
This diff fixes the issue and updates the unittest to cover this corner case.
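A sketch of the fix: pick the device with the maximum available memory instead of assuming logical device 0 has it. The Device tuple here is a stand-in for the partitioner's real device records.

```python
from collections import namedtuple

Device = namedtuple("Device", ["name", "available_mem_bytes"])

def find_device_for_whole_graph(devices, graph_size_bytes):
    # Select the device with the most memory, wherever it appears
    # in the device list, then check the graph actually fits.
    best = max(devices, key=lambda d: d.available_mem_bytes)
    return best if best.available_mem_bytes >= graph_size_bytes else None

devices = [Device("dev_0", 100), Device("dev_1", 300), Device("dev_2", 200)]
chosen = find_device_for_whole_graph(devices, 250)
assert chosen is not None and chosen.name == "dev_1"
```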
Test Plan:
```
buck test mode/opt //caffe2/test:test_fx_experimental -- --exact 'caffe2/test:test_fx_experimental - test_find_single_partition (test_fx_experimental.TestFXExperimental)'
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/6473924507772744
✓ ListingSuccess: caffe2/test:test_fx_experimental - main (1.357)
✓ Pass: caffe2/test:test_fx_experimental - test_find_single_partition (test_fx_experimental.TestFXExperimental) (1.206)
Summary
Pass: 1
ListingSuccess: 1
```
Reviewed By: gcatron
Differential Revision: D29118715
fbshipit-source-id: cac6a1f0d2f47717446dcc80093bbcf362663859
Summary:
This PR adds support for AOT-based partitioning: given each node and its corresponding partition id, it generates the partitions, submodules, and DAG.
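A sketch of the flow described above, with assumed data shapes rather than the real torch.fx types: group nodes by partition id, then derive the inter-partition edges that form the DAG.

```python
from collections import defaultdict

def build_partitions(node_to_partition, node_inputs):
    """node_to_partition: {node: partition_id}
    node_inputs: {node: [input_node, ...]}
    Returns (partitions, dag_edges)."""
    partitions = defaultdict(list)
    for node, pid in node_to_partition.items():
        partitions[pid].append(node)

    dag_edges = set()
    for node, inputs in node_inputs.items():
        for inp in inputs:
            src, dst = node_to_partition[inp], node_to_partition[node]
            if src != dst:  # edge crosses a partition boundary
                dag_edges.add((src, dst))
    return dict(partitions), dag_edges

parts, edges = build_partitions(
    {"a": 0, "b": 0, "c": 1}, {"a": [], "b": ["a"], "c": ["b"]}
)
assert parts == {0: ["a", "b"], 1: ["c"]} and edges == {(0, 1)}
```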
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48336
Reviewed By: gcatron
Differential Revision: D25226899
Pulled By: scottxu0730
fbshipit-source-id: 8afab234afae67c6fd48e958a42b614f730a61d9
Summary:
This is a partition search based on the Kernighan-Lin algorithm. First, the graph is partitioned using size_based_partition; then nodes from different partitions are swapped until the cost reaches a minimum.
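A very small sketch of a Kernighan-Lin-style refinement loop: starting from an initial assignment, keep swapping node pairs across partitions while a swap lowers the cost. The cost function here is a placeholder for the partitioner's real cost model.

```python
def kl_refine(assignment, cost):
    """assignment: {node: partition_id}; cost: fn(assignment) -> number"""
    best = cost(assignment)
    improved = True
    while improved:
        improved = False
        nodes = list(assignment)
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                if assignment[a] == assignment[b]:
                    continue  # swapping within one partition is a no-op
                assignment[a], assignment[b] = assignment[b], assignment[a]
                new_cost = cost(assignment)
                if new_cost < best:
                    best, improved = new_cost, True
                else:  # revert a non-improving swap
                    assignment[a], assignment[b] = assignment[b], assignment[a]
    return assignment

# Toy cost model: number of edges cut by the partition boundary.
edges = [("a", "b"), ("c", "d")]
def cut_cost(asg):
    return sum(asg[u] != asg[v] for u, v in edges)

result = kl_refine({"a": 0, "c": 0, "b": 1, "d": 1}, cut_cost)
assert cut_cost(result) == 0  # both edges end up uncut
```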
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48197
Reviewed By: gcatron
Differential Revision: D25097065
Pulled By: scottxu0730
fbshipit-source-id: 3a11286bf4e5a712ab2848b92d0b98cd3d6a89be
Summary:
This PR fixes add_node and remove_node in the Partition class and adds a unit test for node manipulation in partitions.
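A sketch of the invariant add_node / remove_node need to keep consistent; the real class also tracks parent/child partitions, and node sizes here are illustrative rather than computed from the graph:

```python
class Partition:
    def __init__(self, partition_id):
        self.partition_id = partition_id
        self.nodes = set()
        self.used_mem_bytes = 0

    def add_node(self, node, size_bytes):
        # Guard against double-adds so memory accounting stays correct.
        if node not in self.nodes:
            self.nodes.add(node)
            self.used_mem_bytes += size_bytes

    def remove_node(self, node, size_bytes):
        # Removing a node must release its memory contribution.
        if node in self.nodes:
            self.nodes.remove(node)
            self.used_mem_bytes -= size_bytes

p = Partition(0)
p.add_node("conv1", 100)
p.add_node("relu1", 20)
p.remove_node("conv1", 100)
assert p.nodes == {"relu1"} and p.used_mem_bytes == 20
```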
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48016
Reviewed By: gcatron
Differential Revision: D24996368
Pulled By: scottxu0730
fbshipit-source-id: 0ddffd5ed3f95e5285fffcaee8c4b671929b4df3
Summary:
Rename Partitioner.py to partitioner.py.
Rename GraphManipulation.py to graph_manipulation.py.
Move test_replace_target_nodes_with() to test_fx_experimental.py.
Remove an unnecessary argument from size_based_partition() in the Partitioner class.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47914
Reviewed By: gcatron
Differential Revision: D24956653
Pulled By: scottxu0730
fbshipit-source-id: 25b65be7dc7d64e90ffdc59cf394446fee83c3e6