pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Catherine Lee	e0238577b6	Always import test selection tools (#107644 ) https://github.com/pytorch/pytorch/pull/107070 made emit_metrics importable without boto3, so we could just import all the files without the try catch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107644 Approved by: https://github.com/huydhn, https://github.com/malfet	2023-08-22 16:36:20 +00:00
Zain Rizvi	5ddb8ef827	Make emit_metrics importable without having boto3 installed (#107070 ) Make it so that scripts can import and run the `emit_metrics` function even if they don't have boto3 installed, in which case it will still validate the inputs but skip the actual metric emission part. It's purely a refactor without any real logic changes. Motivation: So that run_test.py and the target determination code can use this library easily without worrying about whether it was imported or whether its dependencies are installed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107070 Approved by: https://github.com/huydhn	2023-08-21 21:13:01 +00:00
Catherine Lee	3b2c5d47c0	Use default build env and test config for test times (#107325 ) Redo of #107312 Pairs with https://github.com/pytorch/test-infra/pull/4476 If build env and test config combo cannot be found in the test times, use default. Then we don't have to go manually change the test-times.json when a new job is added or we update the jobs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107325 Approved by: https://github.com/huydhn	2023-08-21 18:39:55 +00:00
FFFrog	e108f33299	Update distutils.Version to packaging.version due to the deprecation … (#107207 ) Update distutils.Version to packaging.version due to the deprecation warning. ```python /root/Git.d/pytorch/pytorch/torch/testing/_internal/common_methods_invocations.py:17136: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. active_if=TEST_SCIPY and LooseVersion(scipy.__version__) < "1.4.0"), /root/Git.d/pytorch/pytorch/torch/testing/_internal/common_methods_invocations.py:17138: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. active_if=TEST_SCIPY and LooseVersion(scipy.__version__) < "1.4.0"), /root/Git.d/pytorch/pytorch/torch/testing/_internal/common_methods_invocations.py:17140: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. active_if=TEST_SCIPY and LooseVersion(scipy.__version__) < "1.4.0"), ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/107207 Approved by: https://github.com/soulitzer	2023-08-17 11:19:44 +00:00
Catherine Lee	f16be5e0d4	Reordering tests experiment (#106347 ) Companion to https://github.com/pytorch/test-infra/pull/4424 Uses the file rating generated by the test infra PR to reorder tests. For each test file, sum the file ratings from the changed files in the PR, and put the tests in order of sum. A lot of tests are probably going to end up as "prioritized" since it takes anything with a rating > 0 right now. Sharding is done twice, once on the prioritized tests, and once on the general/non-prioritized tests. Prioritized tests have an order, so they should be sharded according to that order, while general tests don't have an order and are sharded by test time, which should result in more balanced shards. I'll change the metric name before I merge; I want to quarantine my testing stuff from actual results. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106347 Approved by: https://github.com/ZainRizvi	2023-08-16 18:23:09 +00:00
PyTorch MergeBot	9858edd99f	Revert "Reordering tests experiment (#106347 )" This reverts commit 7dfab082be9eaeeee95c7b0363e59c824c6a9009. Reverted https://github.com/pytorch/pytorch/pull/106347 on behalf of https://github.com/clee2000 due to probably broke sharding ([comment](https://github.com/pytorch/pytorch/pull/106347#issuecomment-1675542738))	2023-08-11 23:59:48 +00:00
Richard Zou	b9ad7bc533	Don't run test/autograd/test_fallback.py in parallel (#106866 ) Fixes https://github.com/pytorch/pytorch/issues/106754 This PR: - moves test/autograd/test_fallback.py to test_autograd_fallback.py and removes it from test_autograd.py (necessary for the next step) - adds test_autograd_fallback.py to parallel test blocklist. - lintrunner really wanted to make changes to the files, but other than that, it is a move. The problem is that we set a global option (the autograd fallback mode) during these tests which may cause the tests to interfere with each other. Test Plan: - python test/run_test.py -i test_autograd_fallback NOTE to diff train oncall: - You'll also need to modify the test/autograd/test_fallback.py TARGET in caffe2/test/TARGETS since we renamed the file. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106866 Approved by: https://github.com/soulitzer	2023-08-10 00:26:23 +00:00
Catherine Lee	7dfab082be	Reordering tests experiment (#106347 ) Companion to https://github.com/pytorch/test-infra/pull/4424 Uses the file rating generated by the test infra PR to reorder tests. For each test file, sum the file ratings from the changed files in the PR, and put the tests in order of sum. A lot of tests are probably going to end up as "prioritized" since it takes anything with a rating > 0 right now. Sharding is done twice, once on the prioritized tests, and once on the general/non-prioritized tests. Prioritized tests have an order, so they should be sharded according to that order, while general tests don't have an order and are sharded by test time, which should result in more balanced shards. I'll change the metric name before I merge; I want to quarantine my testing stuff from actual results. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106347 Approved by: https://github.com/ZainRizvi	2023-08-09 20:11:11 +00:00
Aaron Gokaslan	6d43c89f37	[BE]: Update Ruff to 0.0.280 (#105724 ) Removes unused loop values in Python dictionary iteration. Automated fix from Ruff master Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724 Approved by: https://github.com/ezyang, https://github.com/janeyx99	2023-07-22 23:03:34 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538 ) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
Justin Chu	73e1455327	[BE] Enable ruff's UP rules and autoformat test/ (#105434 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434 Approved by: https://github.com/albanD	2023-07-19 20:36:06 +00:00
Joel Schlosser	ece19bf018	Update run_test.py to use TEST_WITH_SLOW_GRADCHECK flag (#104819 ) Finishes the job from #104537. See https://github.com/pytorch/pytorch/pull/104537#pullrequestreview-1520065008 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104819 Approved by: https://github.com/huydhn	2023-07-11 21:58:46 +00:00
Yukio Siraichi	40b8d10d5e	Re-land: Turn translation validation on for tests and accuracy runs by default. (#104467 ) Re-landing: #103611 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104467 Approved by: https://github.com/malfet	2023-07-05 19:01:50 +00:00
Nikita Shulga	ddd7da7546	Enable more tests (#104437 ) Remove `test_segment_reductions` from list of blocklisted tests Remove `@onlyCPU` qualifier from test_segment_reductions as it has CUDA specific parts Fixes https://github.com/pytorch/pytorch/issues/104410 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104437 Approved by: https://github.com/atalman, https://github.com/huydhn	2023-06-30 16:26:11 +00:00
PyTorch MergeBot	a2a8b4d415	Revert "Turn translation validation on for tests and accuracy runs by default. (#103611 )" This reverts commit e311bed2a8e014f0ccf6fdc3fce11884982ac930. Reverted https://github.com/pytorch/pytorch/pull/103611 on behalf of https://github.com/malfet due to Broke inductor tests ([comment](https://github.com/pytorch/pytorch/pull/103611#issuecomment-1614850276))	2023-06-30 15:54:18 +00:00
Yukio Siraichi	e311bed2a8	Turn translation validation on for tests and accuracy runs by default. (#103611 ) This PR turns translation validation on by default for tests and accuracy benchmark runs. It also installs Z3 on CI. The main changes are: - Add `--no-translation-validation` as an option in _test/run_test.py_ - Set `PYTORCH_TEST_WITH_TV` environment variable - Add `TEST_WITH_TV` variable in _torch/testing/_internal/common_utils.py_ - Turn translation validation on for accuracy benchmarks in _benchmarks/dynamo/common.py_ - Add Z3 installation on CI scripts Pull Request resolved: https://github.com/pytorch/pytorch/pull/103611 Approved by: https://github.com/ezyang	2023-06-30 01:32:21 +00:00
Nikita Shulga	c40f5edf7b	Change tools search order (#104214 ) Prevents the following cryptic error if one attempts to use `run_test.py` on a system that also has torchaudio installed in dev mode (as `tools` from https://github.com/pytorch/audio might take precedence, but this is not how the script should behave): ``` Unable to import test_selections from tools/testing. Running without test selection stats.... Reason: No module named 'tools.stats' Traceback (most recent call last): File "/Users/nshulga/git/pytorch/pytorch/test/run_test.py", line 1673, in <module> main() File "/Users/nshulga/git/pytorch/pytorch/test/run_test.py", line 1604, in main selected_tests = get_selected_tests(options) File "/Users/nshulga/git/pytorch/pytorch/test/run_test.py", line 1418, in get_selected_tests path = os.path.join(str(REPO_ROOT), TEST_TIMES_FILE) NameError: name 'TEST_TIMES_FILE' is not defined ``` But make sure to remove it in the end, otherwise it will not work if torch is installed from a wheel but tests are run from a clean repo checkout. <!-- copilot:poem --> ### <samp>🤖 Generated by Copilot at dd52521</samp> > _Sing, O Muse, of the cunning code review_ > _That fixed the tests of the `tools` module_ > _By adding and removing the root path_ > _As a shepherd guides his flock to and fro._ Pull Request resolved: https://github.com/pytorch/pytorch/pull/104214 Approved by: https://github.com/kit1980	2023-06-27 15:54:34 +00:00
Nikita Shulga	925f0a01c7	Do not pass `stepcurrent` option unless in CI (#104135 ) Should allow one to run the same tests multiple times on local machine <!-- copilot:poem --> ### <samp>🤖 Generated by Copilot at 740a92d</samp> > _`pytest_args` change_ > _Only add `--sc` on CI_ > _Avoid conflicts - fall_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/104135 Approved by: https://github.com/huydhn, https://github.com/kit1980	2023-06-24 09:34:14 +00:00
Nikita Shulga	63f66d19ea	[Tests] Make `run_test.py` usable without boto3 (#104111 ) There is a `HAVE_TEST_SELECTION_TOOLS` conditional, but it turns out it does not really work, so fix it by defining all missing prototypes and making it work as a single-shard instance. Add a lint rule to test that it would succeed when running only test_cuda with a released version of PyTorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104111 Approved by: https://github.com/clee2000, https://github.com/ZainRizvi	2023-06-24 03:10:49 +00:00
Nikita Shulga	98d513cabf	[BE][Test] Remove `--pytest` option from `run_test.py` (#104125 ) Because we always run tests with pytest now. Marking it as `bc-breaking` as there could technically be some scripts depending on it somewhere... <!-- copilot:poem --> ### <samp>🤖 Generated by Copilot at 1760568</samp> > _`pytest` option gone_ > _simpler test runner script_ > _autumn leaves fall fast_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/104125 Approved by: https://github.com/seemethere	2023-06-24 00:20:20 +00:00
Catherine Lee	7ac1c64bc4	Exclude _nvfuser from test collection (#104003 ) The three files in this folder should instead be run by test_jit_cuda_fuser.py, test_nvfuser_dynamo.py, and test_nvfuser_frontend.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/104003 Approved by: https://github.com/huydhn, https://github.com/jjsjann123	2023-06-22 19:46:45 +00:00
Zain Rizvi	c3d3165f16	Enable uploading metrics and upload Test Reordering metrics to dynamodb (#102691 ) Added a feature to upload test statistics to DynamoDB and Rockset using a new function `emit_metric` in `tools/stats/upload_stats_lib.py`. Added metrics to measure test reordering effectiveness in `tools/testing/test_selections.py`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102691 Approved by: https://github.com/malfet	2023-06-12 23:01:53 +00:00
PyTorch MergeBot	b52ee80cdc	Revert "Add print statements to debug sharding error (#102713 )" This reverts commit c7873522c2ceefbc3b747224da1d26d566115c9a. Reverted https://github.com/pytorch/pytorch/pull/102713 on behalf of https://github.com/clee2000 due to issue should be resolved now ([comment](https://github.com/pytorch/pytorch/pull/102713#issuecomment-1583334560))	2023-06-08 21:02:17 +00:00
Aidyn-A	591134f2a5	[CI] Enable UCC in CI (#100395 ) UCC was temporarily disabled in #98832. This PR re-enables it with the necessary fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100395 Approved by: https://github.com/atalman	2023-06-08 19:01:22 +00:00
Catherine Lee	c7873522c2	Add print statements to debug sharding error (#102713 ) Sharding on ROCm is broken. I can't replicate it on dummy PRs even though it seems to happen pretty often on main, so adding this to increase my sample size. Hopefully this is enough print statements... Pull Request resolved: https://github.com/pytorch/pytorch/pull/102713 Approved by: https://github.com/huydhn	2023-06-01 22:38:28 +00:00
Zain Rizvi	c84f246c83	Improve time savings calculation math for test reordering (#102411 ) Use a more accurate method that accounts for tests being run in parallel Right now we still log results to the console, but later it'll get logged to Rockset for better tracking Pull Request resolved: https://github.com/pytorch/pytorch/pull/102411 Approved by: https://github.com/huydhn, https://github.com/malfet	2023-05-31 23:51:27 +00:00
Catherine Lee	a5ddb72aec	Quick fix for keep-going + reruns (#102569 ) Currently file level reruns + stepcurrent are incompatible and it's making PRs green when they are actually red, so turn off stepcurrent + file level reruns when keep-going is used until I figure out a better way to do this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102569 Approved by: https://github.com/huydhn, https://github.com/malfet	2023-05-31 04:46:25 +00:00
PyTorch MergeBot	1a6ab8a5dc	Revert "Quick fix for keep-going + reruns (#102569 )" This reverts commit 7f6edcf422d133b6fd747ec0775d1c840a91ee46. Reverted https://github.com/pytorch/pytorch/pull/102569 on behalf of https://github.com/clee2000 due to broke a ton of stuff ([comment](https://github.com/pytorch/pytorch/pull/102569#issuecomment-1569167673))	2023-05-30 22:04:27 +00:00
Catherine Lee	7f6edcf422	Quick fix for keep-going + reruns (#102569 ) Currently file level reruns + stepcurrent are incompatible and it's making PRs green when they are actually red, so turn off stepcurrent + file level reruns when keep-going is used until I figure out a better way to do this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102569 Approved by: https://github.com/huydhn	2023-05-30 21:29:56 +00:00
Huy Do	6e3e3dd477	Do not collect and skip non-disabled tests when rerunning disabled tests (#102107 ) The console log blows up too much when running in rerun disabled tests mode (x50) `e132f09e88`. Each log is around 1GB and the uncompressed logs total ~50GB. After compression, it will be around 1GB, still too big. The increase comes mainly from the multiple SKIPPED messages for non-disabled tests, which is expected due to how SkipTest and pytest-flakefinder currently work. I update `test/conftest.py` to completely ignore skipped tests when rerunning disabled tests instead of collecting then skipping 50 tests each. The benefit of doing this is much more than I originally expected: * Rerun disabled tests jobs now finish in less than half an hour as they should * Fix OOM runner crash because of too many collected tests * Fix verbosity issue as now only disabled tests are run x50 times. There are only a few hundred of them atm * Fix timeout issue when rerunning disabled distributed and ASAN tests. They are just too slow when running at x50 ### Testing When rerunning disabled tests https://github.com/pytorch/pytorch/actions/runs/5084508614, only disabled tests on the platform are run, for example `test_ops_jit` on https://ossci-raw-job-status.s3.amazonaws.com/log/13770164954 only ran 100 tests (`test_variant_consistency_jit_linalg_lu_cuda_float32` + `test_variant_consistency_jit_linalg_lu_factor_cuda_complex64`) x50. ``` Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '--sc=test_ops_jit_1', '--flake-finder', '--flake-runs=50', '--import-slow-tests', '--import-disabled-tests', '--rerun-disabled-tests'] ... 
[2023-05-25 21:32:49.763856] Expand the folded group to see the log file of test_ops_jit 2/2 ##[group]PRINTING LOG FILE of test_ops_jit 2/2 (/var/lib/jenkins/workspace/test/test-reports/test_ops_jit_h2wr_t2c.log) Test results will be stored in test-reports/python-pytest/test_ops_jit/test_ops_jit-51a83bd44549074e.xml ============================= test session starts ============================== platform linux -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0 -- /opt/conda/envs/py_3.10/bin/python cachedir: .pytest_cache hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] rootdir: /var/lib/jenkins/workspace configfile: pytest.ini plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-11.1.2, shard-0.1.2, xdist-3.3.0, xdoctest-1.1.0 collecting ... collected 1084 items Running 100 items in this shard: test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_float32 (x50), test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_complex64 (x50) stepcurrent: Cannot find last run test, not skipping test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_float32 PASSED [2.1876s] [ 1%] test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_complex64 PASSED [4.5615s] [ 2%] ``` * [pull](https://github.com/pytorch/pytorch/actions/runs/5093566864) * [trunk](https://github.com/pytorch/pytorch/actions/runs/5095364311) * [periodic](https://github.com/pytorch/pytorch/actions/runs/5095378850) * [slow](https://github.com/pytorch/pytorch/actions/runs/5095390285) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102107 Approved by: https://github.com/clee2000, https://github.com/malfet	2023-05-27 12:10:36 +00:00
Catherine Lee	2232cce69c	No cpp + step current (#102001 ) stepcurrent cannot handle xdist Pull Request resolved: https://github.com/pytorch/pytorch/pull/102001 Approved by: https://github.com/huydhn	2023-05-24 17:39:32 +00:00
Huy Do	d06802778e	No need to run C++ tests under rerun disabled tests mode (#102132 ) Per title. I extract this part out of the draft PR that I'm working on https://github.com/pytorch/pytorch/pull/102107 because the remaining issues with rerun disabled tests (log size and unexpected runner failures) require further investigation, while this one is clearly breaking trunk atm. Until we can support disabling C++ tests, there is no need to run them in rerun disabled tests mode. ### Testing Coming from https://github.com/pytorch/pytorch/pull/102107, for example https://github.com/pytorch/pytorch/actions/runs/5062224659/jobs/9087747981 ``` 2023-05-23T22:46:50.1953318Z Running cpp/basic 1/1 ... [2023-05-23 22:46:50.195077] 2023-05-23T22:46:50.1953847Z Skipping C++ tests when running under RERUN_DISABLED_TESTS mode 2023-05-23T22:46:50.2066032Z Running cpp/atest 1/1 ... [2023-05-23 22:46:50.206348] 2023-05-23T22:46:50.2066435Z Skipping C++ tests when running under RERUN_DISABLED_TESTS mode 2023-05-23T22:46:52.2666743Z No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda' 2023-05-23T22:46:52.2691817Z Ignoring disabled issues: [] ... ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/102132 Approved by: https://github.com/clee2000	2023-05-24 07:45:48 +00:00
Huy Do	d26c8f26d1	Lower xdist processes from auto to NUM_PROCS (#102124 ) This is to avoid CUDA OOM issues when running C++ tests both regularly and in memory leak check mode. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102124 Approved by: https://github.com/clee2000	2023-05-24 06:50:55 +00:00
Catherine Lee	f3fc531eee	Check for pytest extensions in run_test (#100916 ) Not very elegant. Checked on a separate conda env that doesn't have the usual CI dependencies. The two pytest extensions at fault are pytest-rerunfailures and pytest-shard; also included pytest-flakefinder just in case. No idea if this is a good way to do this. Could also check individually and add flags based on that, but was told that requiring all the CI dependencies to be downloaded was also ok. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100916 Approved by: https://github.com/huydhn	2023-05-17 20:27:55 +00:00
Catherine Lee	e3c9a1e5c4	Run dynamo tests in parallel (#101432 ) cuts off ~30 min per shard (2 shards and 2 python versions so 2 hours total) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101432 Approved by: https://github.com/huydhn, https://github.com/desertfire, https://github.com/ZainRizvi	2023-05-17 20:26:24 +00:00
Huy Do	552b712f80	Run C++ testcases in parallel with pytest-xdist (#101440 ) After an investigation, running C++ tests with https://github.com/pytest-dev/pytest-cpp is just slower than running them directly, plain and simple. I'm curious about the exact root cause, but that's a story for another day. `time build/bin/test_lazy` takes half a minute to run 610 tests on `linux-bionic-cuda11.8-py3.10-gcc7 / test (default, 2, 5, linux.4xlarge.nvidia.gpu)` while `time pytest /var/lib/jenkins/workspace/build/bin/test_lazy -v` takes 20+ minutes on the same runner. This is a very costly price to pay. The saving grace here is that https://github.com/pytest-dev/pytest-cpp supports pytest-xdist to run tests in parallel with `-n auto`, so `time pytest /var/lib/jenkins/workspace/build/bin/test_lazy -v -n auto` takes only 3 minutes. This is still not as fast as running C++ tests directly, but it's an order of magnitude faster than running them sequentially. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101440 Approved by: https://github.com/clee2000	2023-05-16 21:52:36 +00:00
Huy Do	35834a405c	Run C++ tests on CI with run_test.py (#99956 ) After https://github.com/pytorch/pytorch/pull/99559, we can now run C++ tests with `run_test.py`. Although advanced features such as `--import-slow-tests` and `--import-disabled-tests` won't work for now, there will still be a gain in reliability and performance as C++ tests can now be retried and run in parallel. This covers all C++ tests in the CI including aten, libtorch, and Vulkan C++ tests across all platforms Linux, Windows, MacOS. Notes: * To support C++ test discovery, the env variable `CPP_TESTS_DIR` can be set to where the C++ test binaries are located * Support pytest -k argument via run_test as this is used by pytest-cpp to replace `--gtest-filter` * The XML output is in pytest format, but it's ok now because we don't have slow test or flaky test support for C++ tests yet * ~~I need to figure out why conftest.py doesn't work when I invoke pytest directly for C++ test, so `--sc` is not available for C++ tests at the moment. Proper pytest plugin like stepwise works fine though. I'll investigate and fix it in a separate PR~~ Found the cause, `conftest.py` is per directory and needs to be in any arbitrary directory that holds C++ tests * Two tests `test_api` and `test_tensorexpr` timed out on ASAN, I suspect that ASAN is now used on top of the python executable, which is slower than running native C++ code. IMO, it's ok to run these tests as before on ASAN for now Pull Request resolved: https://github.com/pytorch/pytorch/pull/99956 Approved by: https://github.com/clee2000, https://github.com/ZainRizvi	2023-05-09 21:24:12 +00:00
Ramin Azarmehr	cecfcf1e17	[MPS] Handle MPS failures of test_modules.py in common_modules.py (#95334 ) - Also cleaned up `test_modules.py` from skipMPS code. - Added `skipMPS` for unsupported or failing tests on MPS backend in common_modules.py. (We'll remove `skipMPS` from those tests once a fix is available for them.) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95334 Approved by: https://github.com/kulinseth, https://github.com/albanD	2023-05-09 03:55:16 +00:00
Zain Rizvi	95f191a248	Always run prioritized tests first, even if they're expected to run serially (#100748 ) Today, we prioritize running test files that were edited in the user's PR, with the idea being to run them before we run any other test. Except, if the modified test is supposed to run serially, then we still end up running it after all the parallelized tests have finished running. This PR fixes that to _always_ run the prioritized tests before the regular tests, regardless of if the test is supposed to run serially or in parallel Pull Request resolved: https://github.com/pytorch/pytorch/pull/100748 Approved by: https://github.com/huydhn	2023-05-08 20:23:46 +00:00
Catherine Lee	a1f318daba	Fix get_reordered_tests in run_test.py (#100752 ) I think get_reordered_tests has been broken since the master -> main switch. Add typing for some functions. Checked for `prioritized` in the logs. Limited testing because I only care about one very small part of the log that's near the beginning. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100752 Approved by: https://github.com/huydhn	2023-05-05 22:46:56 +00:00
Catherine Lee	e88e92e7a2	Update to reruns + timeouts in run_test.py (#100412 ) https://github.com/pytorch/pytorch/pull/100200/files made unknown tests more likely to fail because they lack test times but still have timeouts, so fix that Pull Request resolved: https://github.com/pytorch/pytorch/pull/100412 Approved by: https://github.com/huydhn	2023-05-01 21:51:53 +00:00
pbialecki	73645a8412	Add CUDA 12.1 CI workflows (#98832 ) Adds CUDA 12.1 CI workflows, removes CUDA 11.7. CC @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/98832 Approved by: https://github.com/atalman	2023-05-01 16:25:53 +00:00
PyTorch MergeBot	9075e3c2c6	Revert "Run test_fx_to_onnx_with_onnxruntime serially (#100298 )" This reverts commit 3a3f781f6cd90abbceb63a9cb59546d892ef899e. Reverted https://github.com/pytorch/pytorch/pull/100298 on behalf of https://github.com/huydhn due to No need as https://github.com/pytorch/pytorch/pull/100297 has been landed ([comment](https://github.com/pytorch/pytorch/pull/100298#issuecomment-1528476786))	2023-04-29 02:07:39 +00:00
Huy Do	3a3f781f6c	Run test_fx_to_onnx_with_onnxruntime serially (#100298 ) This test starts to fail out of nowhere in trunk Pull Request resolved: https://github.com/pytorch/pytorch/pull/100298 Approved by: https://github.com/kit1980	2023-04-29 00:51:25 +00:00
Catherine Lee	6ab9453ea9	File level rerun changes (#100200 ) Fixes #ISSUE_NUMBER * change hook so that test still gets saved in --sc when fails in test setup (caused an off by 1 error due to setup being called before the logreport hook) * allow reruns for all tests now that --sc is used * increase number of reruns now that --sc is used Pull Request resolved: https://github.com/pytorch/pytorch/pull/100200 Approved by: https://github.com/huydhn	2023-04-28 20:57:49 +00:00
Catherine Lee	ae5e1819a5	stepcurrent (#98035 ) * add stepcurrent flag (--sc) based off the stepwise flag that saves the currently running test so that test running can resume from the last successful test after segfaults, takes in an argument for a key so that different test runs don't overwrite each other * send SIGINT to the process on timeout so that xml can be made * add currently unused stepcurrent skip flag (--scs) based off stepwise skip flag that skips the failing test, was going to use it for the keep-going label but having trouble with CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/98035 Approved by: https://github.com/huydhn	2023-04-25 20:56:04 +00:00
Huy Do	96d3f3dee3	Discover and run C++ tests with run_test.py (#99559 ) This depends on [pytest-cpp](https://github.com/pytest-dev/pytest-cpp) to discover and run C++ tests with pytest. C++ tests are built under the `${WORKSPACE}/build/bin` directory and copied to the test job under the same path. * To expose them to `run_test`, I choose to use the mock path prefix `cpp`, for example `build/bin/c10_Array_test` would be named `cpp/c10_Array_test` and `python test/run_test.py --cpp -i cpp/c10_Array_test` would run the test in the same way as other Python tests. I could copy them from `build/bin` to `test/cpp`, but they would be mixed with the source code and CMake file. So this looks easier * Some executables under `build/bin` are not C++ tests, and they are excluded, for example `build/bin/torch_shm_manager` * C++ tests need to run with pytest directly as the python command doesn't understand them * The change is gated by the new `--cpp` argument to `run_test.py`, for example `python test/run_test.py --cpp` will run all available C++ tests * The tests can be run in parallel * Failing tests can be retried with `--reruns=2` and `--sw` ``` ============================= test session starts ============================== platform darwin -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0 -- /Users/huydo/miniconda3/envs/py3.9/bin/python3 cachedir: .pytest_cache hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/Users/huydo/Storage/mine/pytorch/test/.hypothesis/examples') rootdir: /Users/huydo/Storage/mine/pytorch, configfile: pytest.ini plugins: xdoctest-1.1.0, cpp-2.3.0, rerunfailures-10.3, shard-0.1.2, flakefinder-1.1.0, hypothesis-6.56.4, xdist-3.0.2, repeat-0.9.1 collecting ... collected 3 items / 2 deselected / 1 selected Running 1 items in this shard: build/bin/scalar_tensor_test::TestScalarTensor.TestScalarTensorMPS stepwise: skipping 2 already passed items. 
../build/bin/scalar_tensor_test::TestScalarTensor::TestScalarTensorMPS RERUN [100%] ../build/bin/scalar_tensor_test::TestScalarTensor::TestScalarTensorMPS RERUN [100%] ../build/bin/scalar_tensor_test::TestScalarTensor::TestScalarTensorMPS FAILED [100%] ``` * `--import-slow-tests` and `--import-disabled-tests` won't work for now, and it's ok to leave that as a future task. I also add `pytest-cpp==2.3.0` to Linux Docker, MacOS, and Windows. ### Testing Build PyTorch and run `python test/run_test.py --cpp` on my laptop. CI change would come later in a separate PR. Also running `python test/run_test.py --help` now shows all C++ tests discovered under `build/bin` Pull Request resolved: https://github.com/pytorch/pytorch/pull/99559 Approved by: https://github.com/clee2000	2023-04-22 00:23:31 +00:00
Zain Rizvi	7546972565	[BE] Refactoring test execution and improving comments (#99467 ) Sharing code between the code that handles test results in parallel vs serial mode. Note that the original version of this code had an inconsistency between the two versions where it would execute `print_to_stderr(err_message)` on every test that ran in parallel, but for serial tests it would only invoke `print_to_stderr(err_message)` if `continue_on_error` was also specified. By sharing code, this PR changes that behavior to be consistent between the two modes. Also adding some comments. <!-- copilot:poem --> ### <samp>🤖 Generated by Copilot at 029342c</samp> > _Sing, O Muse, of the skillful coder who refined_ > _The PyTorch testing script, `run_test.py`, and shined_ > _A light on its obscure logic, with docstrings and comments_ > _And made it run more smoothly, with better error contents_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/99467 Approved by: https://github.com/huydhn, https://github.com/malfet	2023-04-19 19:29:07 +00:00
BowenBao	d41aa448b8	[ONNX] Run ONNX tests as part of standard run_test script (#99215 ) <!-- copilot:all --> ### <samp>🤖 Generated by Copilot at dcbf7e2</samp> ### Summary 📝🧹🚩 <!-- 1. 📝 for simplifying the `./scripts/onnx/test.sh` script 2. 🧹 for refactoring the `test/onnx/dynamo/test_exporter_api.py` file 3. 🚩 for adding the `--onnx` flag to `test/run_test.py` and updating the `TESTS` list --> This pull request improves the ONNX testing infrastructure in PyTorch by refactoring the test code, normalizing the scope names, adding a flag to run only the ONNX tests, and simplifying the test script. > _To export PyTorch models to ONNX_ > _We refactored some scripts and contexts_ > _We used `common_utils`_ > _And normalized the scopes_ > _And added a flag to run the tests_ ### Walkthrough * Simplify `./scripts/onnx/test.sh` to use `run_test.py` with `--onnx` flag instead of `pytest` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-0017f5b22ae1329acb0f54af8d9811c9b6180a72dac70d7a5b89d7c23c958198L44-R46)) * Remove `onnx` test from `TESTS` list in `test/run_test.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7L127-R127)). Replace with `onnx_caffe2`. 
* Add `onnx/test_pytorch_onnx_onnxruntime_cuda` and `onnx/test_models` tests to `blocklisted_tests` list in `test/run_test.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7R154-R155)) * Add `ONNX_SERIAL_LIST` list to `test/run_test.py` to specify ONNX tests that must run serially ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7R296-R301)) * Add `ONNX_TESTS` list to `test/run_test.py` to store all ONNX tests ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7R370)) * Add `--onnx` flag to `parse_args` function in `test/run_test.py` to run only ONNX tests ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7R920-R928)) * Include `ONNX_SERIAL_LIST` in `must_serial` function in `test/run_test.py` to run ONNX tests serially or parallelly based on memory usage ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7R1120)) * Filter selected tests based on `--onnx` flag in `get_selected_tests` function in `test/run_test.py` to exclude non-ONNX tests ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-e72503c9e3e8766e2d1bacf3fad7b88aa166e0e90a7e103e7df99357a35df8d7R1158-R1165)) ### Other minor changes to accommodate this change * Replace `unittest` module with `common_utils.TestCase` in `test/onnx/dynamo/test_exporter_api.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L4), 
[link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L29-R28), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L71-R70), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L147-R146)) * Import `TemporaryFileName` class from `common_utils` in `test/onnx/dynamo/test_exporter_api.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L19-R18)) * Use `common_utils.TemporaryFileName` instead of `TemporaryFileName` in `TestDynamoExportAPI` class in `test/onnx/dynamo/test_exporter_api.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L92-R91), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L110-R109), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L129-R128)) * Use `common_utils.run_tests` instead of `unittest.main` in `test/onnx/dynamo/test_exporter_api.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-4545f0c15c73ebe90a875e9bee6c5ca4b6b92fb1ed0ec5560d1568e0f6339d02L155-R154)) * Add `re` module to `test/onnx/test_utility_funs.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7R6)) * Add `_remove_test_environment_prefix_from_scope_name` function to `test/onnx/test_utility_funs.py` to normalize scope names of ONNX nodes 
([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7R32-R58)) * Use `_remove_test_environment_prefix_from_scope_name` function to compare scope names of ONNX nodes in `TestUtilityFuns` class in `test/onnx/test_utility_funs.py` ([link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7L1099-R1133), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7L1119-R1152), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7L1170-R1188), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7L1181-R1199), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7L1220-R1239), [link](https://github.com/pytorch/pytorch/pull/99215/files?diff=unified&w=0#diff-da71d2c81c9dc7ac0c47ff086fded82e4edcb67ba0cd3d8b5c983d7467343bc7L1235-R1258)) Fixes #98626 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99215 Approved by: https://github.com/huydhn, https://github.com/titaiwangms	2023-04-19 06:17:47 +00:00
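The `--onnx` flag and the filtering in `get_selected_tests` described in the commit above can be sketched roughly as follows. The test names in `TESTS`, the way `ONNX_TESTS` is derived, and the behavior without the flag (the list is left unchanged) are all assumptions made for illustration; the real `run_test.py` may differ.

```python
import argparse
from typing import List, Optional

# Illustrative test list; the real TESTS list in run_test.py is much longer.
TESTS = [
    "test_nn",
    "onnx/test_utility_funs",
    "onnx/dynamo/test_exporter_api",
    "test_torch",
]

# Assumption: ONNX tests are the entries that live under the onnx/ directory.
ONNX_TESTS = [t for t in TESTS if t.startswith("onnx/")]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Sketch of a test runner CLI")
    parser.add_argument(
        "--onnx",
        action="store_true",
        help="only run ONNX tests",
    )
    return parser.parse_args(argv)


def get_selected_tests(options: argparse.Namespace) -> List[str]:
    selected = list(TESTS)
    if options.onnx:
        # Keep only ONNX tests, preserving their original order.
        selected = [t for t in selected if t in ONNX_TESTS]
    return selected
```

For example, `get_selected_tests(parse_args(["--onnx"]))` keeps only the two `onnx/` entries from the illustrative list.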
Zachary DeVito	7ff1f3f3f6	Revert "Revert "Expandable blocks in allocator (#96995 )"" (#99275 ) This reverts commit 851e89c8e817f28270e0fc21d74ced9446bea747. Differential Revision: [D45034526](https://our.internmc.facebook.com/intern/diff/D45034526) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99275 Approved by: https://github.com/eellison	2023-04-17 23:46:08 +00:00

695 Commits