Reordering tests experiment (#106347)

Companion to https://github.com/pytorch/test-infra/pull/4424

Uses the file rating generated by the test-infra PR to reorder tests. For each test file, sum the file ratings from the files changed in the PR, and put the tests in order of that sum.

A lot of tests are probably going to end up as "prioritized", since right now anything with a rating > 0 counts as prioritized.
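
The scoring and the "> 0" cutoff described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `split_by_file_rating`, the nested-dict layout of `ratings`, and the choice to run the highest sum first are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: ratings[test_file][changed_file] -> rating.
Ratings = Dict[str, Dict[str, float]]


def split_by_file_rating(
    tests: List[str], changed_files: List[str], ratings: Ratings
) -> Tuple[List[str], List[str]]:
    """Return (prioritized, remaining) test lists.

    A test's score is the sum of its ratings over the changed files.
    Anything with a score > 0 is prioritized; higher scores come first
    (assumed direction), and ties keep their input order.
    """
    scores = {
        test: sum(ratings.get(test, {}).get(f, 0.0) for f in changed_files)
        for test in tests
    }
    prioritized = sorted(
        (t for t in tests if scores[t] > 0),
        key=lambda t: scores[t],
        reverse=True,
    )
    prioritized_set = set(prioritized)
    remaining = [t for t in tests if t not in prioritized_set]
    return prioritized, remaining
```

With such a low cutoff, any test file that has ever been associated with one of the changed files lands in the prioritized list, which is why that list is expected to be large.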

Sharding is done twice: once on the prioritized tests, and once on the general (non-prioritized) tests. Prioritized tests have an order, so they should be sharded according to that order, while general tests don't have an order and are sharded by test time, which should result in more balanced shards.
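
A minimal sketch of the two sharding passes, assuming a greedy "assign to the currently lightest shard" strategy. The helper name `shard`, the `sort_by_time` flag, and the `times` mapping are hypothetical and chosen for illustration; the real sharding code may differ.

```python
from typing import Dict, List


def shard(
    tests: List[str],
    times: Dict[str, float],
    num_shards: int,
    sort_by_time: bool,
) -> List[List[str]]:
    """Greedily place each test on the shard with the smallest total time.

    sort_by_time=False keeps the given order (prioritized tests).
    sort_by_time=True places the longest tests first (general tests),
    which tends to give more balanced shards.
    """
    if sort_by_time:
        ordered = sorted(tests, key=lambda t: times.get(t, 0.0), reverse=True)
    else:
        ordered = list(tests)

    shards: List[List[str]] = [[] for _ in range(num_shards)]
    totals = [0.0] * num_shards
    for test in ordered:
        lightest = totals.index(min(totals))  # first shard with least work
        shards[lightest].append(test)
        totals[lightest] += times.get(test, 0.0)
    return shards
```

Under these assumptions, each shard would run its slice of the prioritized tests first and its slice of the general tests afterwards, for example `shard(prioritized, times, n, False)[i] + shard(general, times, n, True)[i]` for shard `i`.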

I'll change the metric name before I merge; I want to quarantine my testing stuff from actual results.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106347
Approved by: https://github.com/ZainRizvi
This commit is contained in:
Catherine Lee
2023-08-09 20:11:09 +00:00
committed by PyTorch MergeBot
parent a44c072c89
commit 7dfab082be
6 changed files with 211 additions and 156 deletions

@@ -394,28 +394,24 @@ class TestParsePrevTests(unittest.TestCase):
         "tools.testing.test_selections._get_modified_tests",
         return_value={"test2", "test4"},
     )
+    @mock.patch(
+        "tools.testing.test_selections._get_file_rating_tests", return_value=["test1"]
+    )
     def test_get_reordered_tests(
-        self, mock_get_prev_failing_tests: Any, mock_get_modified_tests: Any
+        self,
+        mock_get_prev_failing_tests: Any,
+        mock_get_modified_tests: Any,
+        mock_get_file_rating_tests: Any,
     ) -> None:
-        tests = [
-            ShardedTest(name="test1", shard=1, num_shards=2, time=600.0),
-            ShardedTest(name="test2", shard=1, num_shards=2, time=500.0),
-            ShardedTest(name="test3", shard=1, num_shards=2, time=400.0),
-            ShardedTest(name="test4", shard=1, num_shards=2, time=300.0),
-            ShardedTest(name="test5", shard=1, num_shards=2, time=200.0),
-        ]
+        tests = ["test1", "test2", "test3", "test4", "test5"]
-        expected_prioritized_tests = {"test4", "test2"}
-        expected_remaining_tests = {"test1", "test3", "test5"}
+        expected_prioritized_tests = ["test4", "test2", "test1"]
+        expected_remaining_tests = {"test3", "test5"}
         prioritized_tests, remaining_tests = get_reordered_tests(tests)
-        # Just want to check the names of the tests
-        prioritized_tests_name = {test.name for test in prioritized_tests}
-        remaining_tests_name = {test.name for test in remaining_tests}
-        self.assertSetEqual(expected_prioritized_tests, prioritized_tests_name)
-        self.assertSetEqual(expected_remaining_tests, remaining_tests_name)
+        self.assertListEqual(expected_prioritized_tests, prioritized_tests)
+        self.assertSetEqual(expected_remaining_tests, set(remaining_tests))
def test_compute_prioritization_time_savings_with_multiple_threads(self) -> None:
tests = [