pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Yuanyuan Chen	a8c528c105	[1/N] Apply UP035 rule in tests (#163947 ) Apply UP035 `ruff` rule in tests, but some tests for `fx` and `dynamo` are excluded in case the old typing is the test target. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163947 Approved by: https://github.com/ezyang	2025-09-29 01:42:01 +00:00
xinan.lin	1853f71b4f	[Fix XPU CI][Inductor UT] Fix test cases broken by community. (#160403 ) Fixes #160243, Fixes #160244, Fixes #160245 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160403 Approved by: https://github.com/janeyx99	2025-08-19 00:54:51 +00:00
Yu, Guangye	c68af9af1b	Fix XPU CI UT test_circular_dependencies (#158189 ) # Motivation fix https://github.com/pytorch/pytorch/issues/110040 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158189 Approved by: https://github.com/Skylion007, https://github.com/cyyever	2025-07-13 09:30:57 +00:00
Pat Vignola	6e107899da	[Torch] Fix crash when comparing fp8 tensors that have more than 1 dimension (#153508 ) Summary: `torch.nonzero` returns as many items as the number of dimensions, so we shouldn't expect a single element for the indices. Test Plan: CI Differential Revision: D74539233 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153508 Approved by: https://github.com/exclamaforte	2025-05-15 08:41:46 +00:00
Aaron Gokaslan	032ef48725	[BE]: Add PEP621 project section to pyproject.toml (#153055 ) Follow up to @ezyang's PR #153020 , but better uses PEP621 to reduce redundant fields and pass through metadata better to uv, setuptools, poetry and other tooling. * Enables modern tooling like uv sync and better support for tools like poetry. * Also allows us to set project-wide settings that are respected by linters and IDEs (in this example we are able to centralize the minimum supported Python version). * Currently most of the values are dynamically fetched from setuptools; eventually we can migrate all the statically defined values to pyproject.toml and they will be autopopulated in the setuptools arguments. * This controls what additional metadata shows up on PyPI. Special URL names are listed here for rendering on PyPI: https://packaging.python.org/en/latest/specifications/well-known-project-urls/#well-known-labels These also clearly show us what fields will need to be migrated to pyproject.toml over time from setup.py per #152276. Static fields should be fairly easy to migrate; the dynamically built ones like requirements are a bit more challenging. Without this, `uv sync` complains: ``` error: No `project` table found in: `pytorch/pyproject.toml` ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/153055 Approved by: https://github.com/ezyang	2025-05-12 02:16:07 +00:00
Gabriel Ferns	a40e876b08	Support fp8 dtypes in assert_close (#150002 ) Fixes #135998 Adds support for fp8. These are compared bitwise, without atol and rtol. The implementation uses the same comparison functions, just with atol and rtol forced to zero. The error message is different from the default case; it only tells the user the first mismatch. This is to avoid triggering the error from #135998. Test Plan: New unit test covers new code paths. Pull Request resolved: https://github.com/pytorch/pytorch/pull/150002 Approved by: https://github.com/cyyever, https://github.com/zou3519	2025-04-20 01:24:21 +00:00
Joel Schlosser	ae53510b9e	Fix setUpClass() / tearDownClass() for device-specific tests (#151129 ) Finishes up the work started in #121686 + adds test Update: this was not as straightforward as I originally imagined. Context below. TL;DR: `TestFoo{CPU, CUDA}` now actually derive from `TestFoo`! Also, `{CPU, CUDA}TestBase` setup / teardown logic is now always called (it is required to set the primary device), regardless of whether `super().setUpClass()` / `super().tearDownClass()` are called or not. Background: The typical way to get device-specific tests is to write a generic `TestFoo` and call `instantiate_device_type_tests(TestFoo, locals())` to get `TestFooCPU`, `TestFooCUDA`, etc. After this, generic tests (e.g. `TestFoo.test_bar()`) become `TestFooCPU.test_bar_cpu()` / `TestFooCUDA.test_bar_cuda()`. Behind the scenes, this was historically accomplished by creating a `TestFooCUDA` that derives from both a `CUDATestBase` and an empty class called `TestFoo_base`. This `TestFoo_base` has the same bases as `TestFoo`, but none of the test functions (e.g. `test_bar()`). The documented reason for this is to avoid things like a derived `TestFooCUDA.test_bar()` being discovered in addition to the real device-specific test `TestFooCUDA.test_bar_cuda()`. (1) A reason this matters is because it should be possible to call e.g. `super().setUpClass()` from a custom setup / teardown classmethod. If the generated TestFooCUDA does not derive from TestFoo, but instead derives from the empty class described above, this syntax does not work; in fact there is no way to form a proper `super()` call that works across the device-specific test variants. Here's an example that breaks in the OpInfo tests: `070f389745/test/test_ops.py (L218-L221)` (2) Further, there is some precedent within a custom `setUpClass()` impl for storing things on the `cls` object to be accessed at test time. This must be the device-specific test class (`TestFooCUDA`) and not `TestFoo` for this to work. 
As an example, the open device registration tests load a module during setup and use it in the test logic: `070f389745/test/test_cpp_extensions_open_device_registration.py (L63-L77)` `070f389745/test/test_cpp_extensions_open_device_registration.py (L79-L80)` To accomplish both (1) and (2) at the same time, I decided to revisit the idea of utilizing a proper inheritance hierarchy for `TestFoo` -> `{TestFooCPU, TestFooCUDA}`. That is: have TestFooCPU / TestFooCUDA actually derive from `TestFoo`. This achieves both (1) and (2). The only thing left is to make sure the generic tests (e.g. `TestFoo.test_bar()`) are not discoverable, as was the stated reason for diverging from this in the first place. It turns out we can simply `delattr()` these generic tests from `TestFoo` once `TestFooCPU` / `TestFooCUDA` have been setup with the device-specific variants, and all works well. The `instantiate_device_type_tests(...)` logic already deletes `TestFoo` from scope, so I don't see a problem with deleting generic tests from this base class as well (CI will prove me right or wrong ofc). Side note: I was encountering a weird race condition where sometimes the custom `setUpClass()` / `tearDownClass()` defined & swapped in [here](`4a47dd9b3f/torch/testing/_internal/common_device_type.py (L940-L955)`) would be used, and sometimes it wouldn't. This non-deterministic behavior was called out previously by @ngimel here: `4a47dd9b3f/test/inductor/test_torchinductor_dynamic_shapes.py (L128-L130)` To address this, I moved this block of logic to before the first call to `instantiate_test()`, as that method queries for the primary device, and the primary device identification logic may manually invoke `setUpClass()` (see [here](`4a47dd9b3f/torch/testing/_internal/common_device_type.py (L381-L384)`)). Goal: define the `setUpClass()` / `tearDownClass()` we want for correctness before they're ever called. This seems to work and the behavior is deterministic now AFAICT. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151129 Approved by: https://github.com/janeyx99, https://github.com/masnesral, https://github.com/malfet	2025-04-16 02:18:42 +00:00
PyTorch MergeBot	98b1e82ba8	Revert "Fix setUpClass() / tearDownClass() for device-specific tests (#151129 )" This reverts commit bd4cf30e31a2a0b0a57f54c7eedd3a39d5778cbe. Reverted https://github.com/pytorch/pytorch/pull/151129 on behalf of https://github.com/jbschlosser due to flex attention tests failing ([comment](https://github.com/pytorch/pytorch/pull/151129#issuecomment-2807632119))	2025-04-15 22:07:25 +00:00
Joel Schlosser	bd4cf30e31	Fix setUpClass() / tearDownClass() for device-specific tests (#151129 ) Finishes up the work started in #121686 + adds test Update: this was not as straightforward as I originally imagined. Context below. TL;DR: `TestFoo{CPU, CUDA}` now actually derive from `TestFoo`! Also, `{CPU, CUDA}TestBase` setup / teardown logic is now always called (it is required to set the primary device), regardless of whether `super().setUpClass()` / `super().tearDownClass()` are called or not. Background: The typical way to get device-specific tests is to write a generic `TestFoo` and call `instantiate_device_type_tests(TestFoo, locals())` to get `TestFooCPU`, `TestFooCUDA`, etc. After this, generic tests (e.g. `TestFoo.test_bar()`) become `TestFooCPU.test_bar_cpu()` / `TestFooCUDA.test_bar_cuda()`. Behind the scenes, this was historically accomplished by creating a `TestFooCUDA` that derives from both a `CUDATestBase` and an empty class called `TestFoo_base`. This `TestFoo_base` has the same bases as `TestFoo`, but none of the test functions (e.g. `test_bar()`). The documented reason for this is to avoid things like a derived `TestFooCUDA.test_bar()` being discovered in addition to the real device-specific test `TestFooCUDA.test_bar_cuda()`. (1) A reason this matters is because it should be possible to call e.g. `super().setUpClass()` from a custom setup / teardown classmethod. If the generated TestFooCUDA does not derive from TestFoo, but instead derives from the empty class described above, this syntax does not work; in fact there is no way to form a proper `super()` call that works across the device-specific test variants. Here's an example that breaks in the OpInfo tests: `070f389745/test/test_ops.py (L218-L221)` (2) Further, there is some precedent within a custom `setUpClass()` impl for storing things on the `cls` object to be accessed at test time. This must be the device-specific test class (`TestFooCUDA`) and not `TestFoo` for this to work. 
As an example, the open device registration tests load a module during setup and use it in the test logic: `070f389745/test/test_cpp_extensions_open_device_registration.py (L63-L77)` `070f389745/test/test_cpp_extensions_open_device_registration.py (L79-L80)` To accomplish both (1) and (2) at the same time, I decided to revisit the idea of utilizing a proper inheritance hierarchy for `TestFoo` -> `{TestFooCPU, TestFooCUDA}`. That is: have TestFooCPU / TestFooCUDA actually derive from `TestFoo`. This achieves both (1) and (2). The only thing left is to make sure the generic tests (e.g. `TestFoo.test_bar()`) are not discoverable, as was the stated reason for diverging from this in the first place. It turns out we can simply `delattr()` these generic tests from `TestFoo` once `TestFooCPU` / `TestFooCUDA` have been setup with the device-specific variants, and all works well. The `instantiate_device_type_tests(...)` logic already deletes `TestFoo` from scope, so I don't see a problem with deleting generic tests from this base class as well (CI will prove me right or wrong ofc). Side note: I was encountering a weird race condition where sometimes the custom `setUpClass()` / `tearDownClass()` defined & swapped in [here](`4a47dd9b3f/torch/testing/_internal/common_device_type.py (L940-L955)`) would be used, and sometimes it wouldn't. This non-deterministic behavior was called out previously by @ngimel here: `4a47dd9b3f/test/inductor/test_torchinductor_dynamic_shapes.py (L128-L130)` To address this, I moved this block of logic to before the first call to `instantiate_test()`, as that method queries for the primary device, and the primary device identification logic may manually invoke `setUpClass()` (see [here](`4a47dd9b3f/torch/testing/_internal/common_device_type.py (L381-L384)`)). Goal: define the `setUpClass()` / `tearDownClass()` we want for correctness before they're ever called. This seems to work and the behavior is deterministic now AFAICT. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151129 Approved by: https://github.com/janeyx99, https://github.com/masnesral, https://github.com/malfet	2025-04-15 20:13:26 +00:00
cyy	c6ea4425e5	Enable some tests on Windows (#146243 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/146243 Approved by: https://github.com/albanD	2025-02-05 03:54:28 +00:00
cyy	18380836eb	Remove outdated test skipif conditions for Python3.9 (#146144 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/146144 Approved by: https://github.com/albanD	2025-01-31 19:01:04 +00:00
Xuehai Pan	dcc3cf7066	[BE] fix ruff rule E226: add missing whitespace around operator in f-strings (#144415 ) The fixes are generated by:
```bash
ruff check --fix --preview --unsafe-fixes --select=E226 .
lintrunner -a --take "RUFF,PYFMT" --all-files
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144415 Approved by: https://github.com/huydhn, https://github.com/Skylion007	2025-01-08 21:55:00 +00:00
cyy	df458be4e5	[4/N] Apply py39 ruff and pyupgrade fixes (#143257 ) ```torch/fx/passes/annotate_getitem_nodes.py``` was changed to support the new type hinting annotations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257 Approved by: https://github.com/justinchuby, https://github.com/albanD	2025-01-04 10:47:51 +00:00
Tom Ritchford	d8c8ba2440	Fix unused Python variables in test/[e-z]* (#136964 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964 Approved by: https://github.com/justinchuby, https://github.com/albanD	2024-12-18 23:02:30 +00:00
William Wen	1a44f01beb	[ci, 3.13] update test_testing.py usage of locals() for 3.13 (#141577 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141577 Approved by: https://github.com/StrongerXi, https://github.com/atalman ghstack dependencies: #141409, #142003, #141572	2024-12-05 00:24:14 +00:00
Joel Schlosser	23d590e518	More flexible test parametrization with @reparametrize (#138369 ) Background: The `@parametrize` decorator enjoys widespread usage as a convenient tool for ensuring extensive test coverage. One particular feature that makes this easy is the ability to stack such decorators, testing over the cross-product of inputs. Example:
```python
class MyTestClass(TestCase):
    @parametrize("x", range(3))
    @parametrize("y", [False, True])
    def test_foo(self, x, y):
        # Invoked with:
        # x=0, y=False
        # x=1, y=False
        # x=2, y=False
        # x=0, y=True
        # x=1, y=True
        # x=2, y=True
        ...
```
Note that the `@ops` and `@modules` decorators employ the same underlying machinery for parametrizing over `OpInfo` / `ModuleInfo` entries. These decorators also parametrize over op-specific `device` / `dtype` info according to what is supported for each op.
```python
class MyTestClass(TestCase):
    @ops(op_db)
    def test_foo(self, op, device, dtype):
        # Invoked with each OpInfo in the db along with each device / dtype that corresponds
        # with this op according to the OpInfo entry.
        ...
```
Note that this is in contrast to the naive cross product between ops and devices / dtypes, which would generate too many tests. Certain use cases benefit from a similar type of flexible parametrization that is more intelligent than simple cross-product composition. It is expensive to generate / run too many tests, even if the unneeded ones are skipped appropriately. This PR attempts to generalize such flexible parametrization and satisfy these use cases through the introduction of a `@reparametrize` decorator, which operates on an existing parametrizer and allows for customized on-the-fly parametrization through the use of an `adapter_fn`.
Examples:
```python
# adapter_fn that adds a new arg
def include_is_even_arg(test_name, param_kwargs):
    x = param_kwargs["x"]
    is_even = x % 2 == 0
    new_param_kwargs = dict(param_kwargs)
    new_param_kwargs["is_even"] = is_even
    is_even_suffix = "_even" if is_even else "_odd"
    new_test_name = f"{test_name}{is_even_suffix}"
    yield (new_test_name, new_param_kwargs)

# adapter_fn that excludes certain values
def exclude_odds(test_name, param_kwargs):
    x = param_kwargs["x"]
    is_even = x % 2 == 0
    yield None if not is_even else (test_name, param_kwargs)

class MyTestClass(TestCase):
    @reparametrize(parametrize("x", range(5)), include_is_even_arg)
    def test_foo(self, x, is_even):
        # Invoked with both the x value and the new is_even arg
        ...

    @reparametrize(parametrize("x", range(5)), exclude_odds)
    def test_bar(self, x):
        # Only invoked with even x values
        ...
```
For a more real-world use case, imagine you want to write a set of OpInfo tests that parametrize over additional op-specific things beyond `device` / `dtype` (in NJT's case, this includes contiguity type, whether to operate over the batch / ragged / other dims, etc.). The `@reparametrize` decorator allows you to customize the `@ops` parametrization to add in these additional args as they make sense on a per-op basis. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138369 Approved by: https://github.com/janeyx99	2024-10-29 22:14:38 +00:00
Aaron Gokaslan	195d0a666b	[BE][Ez]: Use interned hardcoded string FURB156 (#138330 ) Uses string constants from string module. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138330 Approved by: https://github.com/albanD	2024-10-18 18:26:16 +00:00
Huy Do	8d869c9ec7	Skip test_circular_dependencies on ROCm (#138312 ) The test is flaky on ROCm and has been disabled for quite a while (https://github.com/pytorch/pytorch/issues/110040). The disabled issue was opened and then closed several times, so it's better to close that issue and skip the test here. (This does not really fix the issue; I just want the test to be skipped on PRs instead of being disabled, and then to close the issue.) Fixes https://github.com/pytorch/pytorch/issues/110040 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138312 Approved by: https://github.com/jithunnair-amd, https://github.com/clee2000	2024-10-18 18:17:48 +00:00
Xuehai Pan	758a0a88a2	[BE][Easy] enable `ruff` rule `PIE790`: unnecessary `pass` statement (#133200 ) This PR removes unnecessary `pass` statements. This is semantically safe because the bytecode for the Python code does not change. Note that if there is a docstring in the function, an empty function does not need a `pass` statement as a placeholder. Pull Request resolved: https://github.com/pytorch/pytorch/pull/133200 Approved by: https://github.com/malfet, https://github.com/eqy, https://github.com/kit1980	2024-08-15 15:50:19 +00:00
Justin Chu	57d1ffc512	Ignore `torch.onnx._internal` in `test_circular_dependencies` (#133110 ) Ignore the whole `_internal` module as code will depend on onnxscript and onnx. Pull Request resolved: https://github.com/pytorch/pytorch/pull/133110 Approved by: https://github.com/titaiwangms, https://github.com/malfet	2024-08-15 15:37:24 +00:00
Oguz Ulgen	221350e3a4	Add None return type to init -- tests (#132352 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352 Approved by: https://github.com/ezyang ghstack dependencies: #132335, #132351	2024-08-01 15:44:51 +00:00
Xuehai Pan	d2bd9acabd	[BE] bump `optree` version to 0.12.1 (#130139 ) 0.12.0 Major Updates: - Add context manager to temporarily set the dictionary sorting mode - Add accessor APIs - Use `stable` tag for `pybind11` for Python 3.13 support - Fix potential segmentation fault for pickling support 0.12.1 Updates: - Fix warning regression during import when launched with strict warning filters Closes #130155 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130139 Approved by: https://github.com/zou3519 ghstack dependencies: #130895	2024-07-20 02:41:10 +00:00
PyTorch MergeBot	074a5c0c9b	Revert "[BE] bump `optree` version to 0.12.1 (#130139 )" This reverts commit 8fcb156e8b5697a8f292db6db2a1803c5f4ce2d7. Reverted https://github.com/pytorch/pytorch/pull/130139 on behalf of https://github.com/clee2000 due to broke inductor/test_torchinductor_codegen_dynamic_shapes.py and test_sympy_utils.py `8fcb156e8b` ([comment](https://github.com/pytorch/pytorch/pull/130139#issuecomment-2229248447))	2024-07-15 19:42:11 +00:00
Xuehai Pan	8fcb156e8b	[BE] bump `optree` version to 0.12.1 (#130139 ) 0.12.0 Major Updates: - Add context manager to temporarily set the dictionary sorting mode - Add accessor APIs - Use `stable` tag for `pybind11` for Python 3.13 support - Fix potential segmentation fault for pickling support 0.12.1 Updates: - Fix warning regression during import when launched with strict warning filters Closes #130155 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130139 Approved by: https://github.com/zou3519	2024-07-15 17:27:07 +00:00
Xuehai Pan	973037be6a	[BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): `list()` / `tuple()` / `dict()` (#130199 ) This PR changes the empty collection factory call to Python literals:
- `list()` -> `[]`
- `tuple()` -> `()`
- `dict()` -> `{}`
The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary:
```bash
$ python3 -m dis - <<EOS
import collections

d1 = {}
d2 = dict()

dict = collections.OrderedDict
d3 = dict()
EOS
```
```text
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 (0)
              4 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (collections)
              8 STORE_NAME               0 (collections)

  3          10 BUILD_MAP                0
             12 STORE_NAME               1 (d1)

  4          14 PUSH_NULL
             16 LOAD_NAME                2 (dict)
             18 CALL                     0
             26 STORE_NAME               3 (d2)

  6          28 LOAD_NAME                0 (collections)
             30 LOAD_ATTR                8 (OrderedDict)
             50 STORE_NAME               2 (dict)

  7          52 PUSH_NULL
             54 LOAD_NAME                2 (dict)
             56 CALL                     0
             64 STORE_NAME               5 (d3)
             66 RETURN_CONST             1 (None)
```
The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above). Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199 Approved by: https://github.com/malfet	2024-07-11 17:30:28 +00:00
Peter Bell	90d5a6f001	[inductor] Add lowering and codegen for aten.sort (#128458 ) Closes #125633 Benchmarks:
| Shape       | dim | stable | compiled | eager   | speedup |
|-------------|-----|--------|----------|---------|---------|
| (256, 4096) | 0   | False  | 0.73 ms  | 1.26 ms | 1.7     |
| (256, 4096) | 0   | True   | 0.75 ms  | 1.27 ms | 1.7     |
| (4096, 256) | 1   | False  | 0.20 ms  | 0.73 ms | 3.7     |
| (4096, 256) | 1   | True   | 0.21 ms  | 0.73 ms | 3.5     |
| (255, 4096) | 0   | False  | 1.05 ms  | 1.48 ms | 1.4     |
| (255, 4096) | 0   | True   | 1.03 ms  | 1.47 ms | 1.4     |
| (4096, 255) | 1   | False  | 0.52 ms  | 0.98 ms | 1.9     |
| (4096, 255) | 1   | True   | 0.54 ms  | 1.00 ms | 1.9     |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128458 Approved by: https://github.com/lezcano, https://github.com/eellison	2024-06-26 01:36:39 +00:00
Ke Wen	01601ebd41	Retire torch.distributed.pipeline (#127354 ) Actually retiring module after deprecation warning for a while. The new supported module is: torch.distributed.pipelining. Please migrate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127354 Approved by: https://github.com/wconstab	2024-06-07 08:11:58 +00:00
PyTorch MergeBot	0ff60236ab	Revert "Retire torch.distributed.pipeline (#127354 )" This reverts commit b9c058c203ee38032594f898f27cd8404f113a63. Reverted https://github.com/pytorch/pytorch/pull/127354 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the doc build failure looks legit `b9c058c203` ([comment](https://github.com/pytorch/pytorch/pull/127354#issuecomment-2148133982))	2024-06-04 18:19:31 +00:00
Ke Wen	b9c058c203	Retire torch.distributed.pipeline (#127354 ) Actually retiring module after deprecation warning for a while. The new supported module is: torch.distributed.pipelining. Please migrate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127354 Approved by: https://github.com/wconstab	2024-06-04 07:03:26 +00:00
Aaron Gokaslan	5a1216bb2e	[BE]: Update ruff to 0.4.1 (#124549 ) Update ruff to 0.4.1. This version fixes a lot of false negatives/false positives, is 20-40% faster, and has various other bug fixes. Below is a before and after table showing the execution time of ruff lint and ruff format in milliseconds, courtesy of https://astral.sh/blog/ruff-v0.4.0
| Repository | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7 | 251.8 | 351.1 | 274.9 |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549 Approved by: https://github.com/ezyang	2024-04-21 14:06:23 +00:00
Edward Z. Yang	8aad72b0d3	Support all unsigned int sizes on unique (#123643 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/123643 Approved by: https://github.com/albanD, https://github.com/kit1980	2024-04-11 06:50:12 +00:00
Aaron Gokaslan	1562dae62c	[BE]: Apply RUF025 dict.fromkeys preview rule (#118637 ) Simplifies and optimizes dict construction using the `fromkeys` classmethod ctor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637 Approved by: https://github.com/albanD	2024-01-30 20:46:54 +00:00
Aaron Shi	a357a0f315	Back out "[Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623 )" (#116201 ) Summary: This diff needs to be backed out because TorchBench llama_v2_7b_16h has a cublas init error. https://github.com/pytorch/benchmark/actions/runs/7266269668/job/19797677485?pr=2095 Test Plan: CI Differential Revision: D52339142 Pull Request resolved: https://github.com/pytorch/pytorch/pull/116201 Approved by: https://github.com/xuzhao9	2023-12-21 16:32:19 +00:00
Aaron Gokaslan	6de28e92d2	[BE]: Apply FURB118 (prev): replaces unnecessary lambdas with operator. (#116027 ) This replaces a bunch of unnecessary lambdas with the operator package. This is semantically equivalent, but the operator package is faster, and arguably more readable. When the FURB rules are taken out of preview, I will enable it as a ruff check. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116027 Approved by: https://github.com/malfet	2023-12-20 19:35:08 +00:00
Joel Schlosser	afdc528520	Print the index and summary of the SampleInput that failed an OpInfo test (#99444 ) Related to the Reproducible Testing BE project. Goal is to print out the sample input that failed an OpInfo test. Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test. This solves the problem that the test framework currently has no concept of which sample input is being operated on. This PR contains the following changes: * New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure * The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput` * To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()` * Above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, and `OpInfo.conjugate_sample_inputs()`. 
This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well. Example output when a sample input causes a failure:
```
======================================================================
ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper
    return test(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn
    return fn(slf, *args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo
    self.fail('Example failure')
AssertionError: Example failure

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper
    method(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper
    raise Exception(
Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='')

To execute this test, run the following from the base repo dir:
    python test/test_ops.py -k test_foo_add_cpu_uint8

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
```
This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible
random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444 Approved by: https://github.com/janeyx99	2023-11-21 23:08:35 +00:00
PyTorch MergeBot	5f0d72124e	Revert "Print the index and summary of the SampleInput that failed an OpInfo test (#99444 )" This reverts commit e7f12b1eb0cedfd20dcb41ea35e21e9a71e3390a. Reverted https://github.com/pytorch/pytorch/pull/99444 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause memory leak on CUDA job `e7f12b1eb0` ([comment](https://github.com/pytorch/pytorch/pull/99444#issuecomment-1820491298))	2023-11-21 08:58:54 +00:00
Joel Schlosser	e7f12b1eb0	Print the index and summary of the SampleInput that failed an OpInfo test (#99444 ) Related to the Reproducible Testing BE project. Goal is to print out the sample input that failed an OpInfo test. Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test. This solves the problem that the test framework currently has no concept of which sample input is being operated on. This PR contains the following changes: * New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure * The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput` * To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()` * Above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, and `OpInfo.conjugate_sample_inputs()`. 
This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well. Example output when a sample input causes a failure: ``` ====================================================================== ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper return test(*args, **kwargs) File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn return fn(slf, *args, **kwargs) File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo self.fail('Example failure') AssertionError: Example failure The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper method(*args, **kwargs) File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test result = test(self, **param_kwargs) File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper raise Exception( Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='') To execute this test, run the following from the base repo dir: python test/test_ops.py -k test_foo_add_cpu_uint8 This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 ---------------------------------------------------------------------- ``` This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible 
random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444 Approved by: https://github.com/janeyx99	2023-11-21 00:11:20 +00:00
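The tracking idea described in this commit (wrap the sample iterator, remember the most recent item, and attach it to the failure) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PR's code: `TrackingIter`, `TrackedItem`, and `run_with_sample_context` are hypothetical names, and the sketch skips the stack crawling and per-test dictionary the PR describes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, Optional


@dataclass
class TrackedItem:
    """Hypothetical record of the most recent item handed out."""
    index: int
    value: Any


class TrackingIter:
    """Wraps an iterable and remembers the last item it yielded."""

    def __init__(self, iterable: Iterable[Any]) -> None:
        self._it: Iterator[Any] = iter(iterable)
        self._index = -1
        self.last: Optional[TrackedItem] = None

    def __iter__(self) -> "TrackingIter":
        return self

    def __next__(self) -> Any:
        value = next(self._it)  # StopIteration propagates unchanged
        self._index += 1
        self.last = TrackedItem(self._index, value)
        return value


def run_with_sample_context(samples: Iterable[Any], check: Callable[[Any], None]) -> None:
    """Run `check` on every sample; on failure, report which sample was active."""
    tracked = TrackingIter(samples)
    try:
        for sample in tracked:
            check(sample)
    except Exception as exc:
        last = tracked.last
        if last is None:
            raise  # failed before any sample was produced
        raise RuntimeError(
            f"Caused by sample input at index {last.index}: {last.value!r}"
        ) from exc
```

Because the wrapper is an ordinary iterator, existing `for sample in ...` loops would not need to change, which is the property the commit message highlights.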
Philip Meier	769f924bc6	robustify parametrize default name (#113856) #113340 was reverted initially due to a bad default parametrization name. The test looked like ```python @common_utils.parametrize( "type_fn", [ type, lambda obj: obj.__class__, ], ) def test_access_class_method_from_user_class(self, type_fn): ``` This is a valid parametrization, but results in these default test names: ```bash ❯ pytest test/dynamo/test_export.py -k test_access_class_method_from_user_class --co -q test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn_<class 'type'> test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn_<function ExportTests_<lambda> at 0x7f3be5de0c10> ``` Ignoring the whitespace in the test names, which can lead to other issues down the line, the problem in #113340 was that the lambda parameter included a memory address. IIUC, internally, the tests are not collected and run in the same process. This means the address of the lambda, and in turn the test name, is no longer valid on the runner. This is fixed earlier in the stack by giving the parametrization an explicit name with `subtest`, but this PR is about preventing issues in the default case. `pytest` solves this by simply using the name of the parameter plus its index as id in the test name: ```python import pytest class Foo: def __repr__(self): return str(id(self)) @pytest.mark.parametrize( "bar", [ pytest.param(type), pytest.param(lambda obj: obj.__class__), pytest.param(Foo()), ], ) def test_foo(bar): pass ``` ``` ❯ pytest main.py --co -q main.py::test_foo[type] main.py::test_foo[<lambda>] main.py::test_foo[bar2] ``` `pytest` has better defaults for `type` and `lambda` than we do, but it has a safe default for custom objects. This PR aligns our default test name with `pytest`. 
Using the parametrization from above again, we now collect ```bash ❯ pytest test/dynamo/test_export.py -k test_access_class_method_from_user_class --co -q test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn0 test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn1 ``` which might not be as expressive at first glance, but at least prevents bugs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113856 Approved by: https://github.com/malfet, https://github.com/huydhn ghstack dependencies: #113855	2023-11-16 23:25:04 +00:00
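The naming rule this commit converges on (parameter name plus index when the value has no stable text form) can be sketched as follows. This is an illustration only: `default_param_id` and `default_test_names` are hypothetical helpers, and the choice of which value types count as "simple" is an assumption, not the rule used in `common_utils`.

```python
def default_param_id(arg_name, value, index):
    """Return a process-independent id for one parametrized value.

    Assumption: only plain scalars are rendered by value; everything else
    (types, lambdas, custom objects) falls back to name + index, so no
    memory address or repr() output can leak into the test name.
    """
    if value is None or isinstance(value, (bool, int, float, str)):
        return f"{arg_name}_{value}"
    return f"{arg_name}{index}"


def default_test_names(test_name, arg_name, values):
    """Build the full test names for a single-argument parametrization."""
    return [
        f"{test_name}_{default_param_id(arg_name, value, index)}"
        for index, value in enumerate(values)
    ]


names = default_test_names(
    "test_access_class_method_from_user_class",
    "type_fn",
    [type, lambda obj: obj.__class__],
)
```

With the parametrization quoted in the commit message, `names` ends in `type_fn0` and `type_fn1`, matching the collected names shown above; the ids are the same in every process because they depend only on position.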
Aaron Enye Shi	12b2dd16b0	[Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623 ) Summary: We are planning to lazily initialize CUPTI when profiling is actually performed. Therefore, we need to remove profiler init dependency on CUPTI Callbacks' RESOURCE_CONTEXT_CREATED. Instead, we can initialize the profilers during torch profiler pybind, ie. THPAutograd_initExtension() and lazily in profilerStep(). Test Plan: CI and ran internally, see internal diff logs. Pulled By: aaronenyeshi Pull Request resolved: https://github.com/pytorch/pytorch/pull/112623 Approved by: https://github.com/albanD	2023-11-15 20:26:13 +00:00
PyTorch MergeBot	5465f2bb6c	Revert "Improves comparison of state dicts for Checkpoint E2E Tests (#113181 )" This reverts commit 8f5fead86ea9a9eac85d20c6aee780e06ce04eb7. Reverted https://github.com/pytorch/pytorch/pull/113181 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing distribute test in trunk `8f5fead86e` with a not defined DTensor error ([comment](https://github.com/pytorch/pytorch/pull/113181#issuecomment-1810925052))	2023-11-14 18:42:40 +00:00
Lucas Pasqualin	8f5fead86e	Improves comparison of state dicts for Checkpoint E2E Tests (#113181) Addresses the following comment: https://github.com/pytorch/pytorch/pull/112541#discussion_r1380197424 Changes the comparison of models in the checkpointing E2E test to compare a non-parallelized model against a distributed model after training, saving, and loading. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113181 Approved by: https://github.com/fegin	2023-11-14 14:54:40 +00:00
PyTorch MergeBot	4916a7e94f	Revert "[Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623 )" This reverts commit a62a88bb84f633581242bd0107e01d2a075884a3. Reverted https://github.com/pytorch/pytorch/pull/112623 on behalf of https://github.com/huydhn due to This break TestCuda::test_lazy_init on ROCm ([comment](https://github.com/pytorch/pytorch/pull/112623#issuecomment-1806597750))	2023-11-11 00:35:56 +00:00
Aaron Enye Shi	a62a88bb84	[Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623 ) Summary: We are planning to lazily initialize CUPTI when profiling is actually performed. Therefore, we need to remove profiler init dependency on CUPTI Callbacks' RESOURCE_CONTEXT_CREATED. Instead, we can initialize the profilers during torch profiler pybind, ie. THPAutograd_initExtension() and lazily in profilerStep(). Test Plan: CI and ran internally, see internal diff logs. Differential Revision: D50894961 Pulled By: aaronenyeshi Pull Request resolved: https://github.com/pytorch/pytorch/pull/112623 Approved by: https://github.com/albanD	2023-11-10 20:50:54 +00:00
Nikita Shulga	328a4c5475	[BE] Enhance `OpInfo.supported_dtype` (#111995) The current implementation is error-prone: it accepts any object, but does not report an error if device_type is not recognized. Remediate it by accepting both device types and device identifiers (either a `torch.device` instance or a "{device_type}:{ordinal}" string). Fixes https://github.com/pytorch/pytorch/issues/111179 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111995 Approved by: https://github.com/albanD	2023-10-27 19:42:01 +00:00
lezcano	acd02a60d5	Add a test making sure we are not importing SymPy when importing torch (#112038 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/112038 Approved by: https://github.com/malfet, https://github.com/peterbell10 ghstack dependencies: #112035, #112036, #112037	2023-10-26 23:32:27 +00:00
Joel Schlosser	42e4c648a2	New @decorateIf decorator for param-specific conditional decoration (#112033 ) Adds a new decorator `@decorateIf(decorator, predicate_fn)`. Examples: ```python from torch.testing._internal.common_utils import decorateIf ... @decorateIf(unittest.skip, lambda params: params["x"] == 2) @parametrize("x", range(5)) def test_foo(self, x): ... @parametrize("x,y", [(1, 'foo'), (2, 'bar'), (3, 'baz')]) @decorateIf( unittest.expectedFailure, lambda params: params["x"] == 3 and params["y"] == "baz" ) def test_bar(self, x, y): ... @decorateIf( unittest.expectedFailure, lambda params: params["op"].name == "add" and params["dtype"] == torch.float16 ) @ops(op_db) def test_op_foo(self, device, dtype, op): ... @decorateIf( unittest.skip, lambda params: params["module_info"].module_cls is torch.nn.Linear and \ params["device"] == "cpu" ) @modules(module_db) def test_module_foo(self, device, dtype, module_info): ... ``` Follow-up for per-param decoration based on https://github.com/pytorch/pytorch/issues/79161#issuecomment-1152487359 Pull Request resolved: https://github.com/pytorch/pytorch/pull/112033 Approved by: https://github.com/clee2000, https://github.com/pmeier	2023-10-26 14:39:59 +00:00
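The core idea behind `@decorateIf(decorator, predicate_fn)`, applying a decorator only for parameter combinations that satisfy a predicate, can be illustrated without the PyTorch test machinery. The sketch below is a simplified stand-in: `decorate_if`, `negate`, and `square` are hypothetical names, and the decision is made at call time from keyword arguments, whereas the real decorator presumably acts when the parametrized tests are generated.

```python
import functools


def decorate_if(decorator, predicate_fn):
    """Apply `decorator` to the wrapped function only when
    predicate_fn(params) is true for the keyword arguments of a call."""
    def outer(fn):
        @functools.wraps(fn)
        def wrapper(**params):
            target = decorator(fn) if predicate_fn(params) else fn
            return target(**params)
        return wrapper
    return outer


def negate(fn):
    """Toy decorator used for illustration: flips the sign of the result."""
    @functools.wraps(fn)
    def inner(**params):
        return -fn(**params)
    return inner


@decorate_if(negate, lambda params: params["x"] == 2)
def square(x):
    return x * x
```

Here `square(x=3)` runs undecorated, while `square(x=2)` goes through `negate`, mirroring how the examples in the commit message skip or expect failure only for selected parameter values.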
Ying Zhang	a14761b68a	[Inductor CUTLASS backend] Step 1: Inductor config for cuda / cutlass, util functions. (#107802 ) This is the step 1 to add cutlass as an alternative inductor backend. Feature request: https://github.com/pytorch/pytorch/issues/106991. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107802 Approved by: https://github.com/jansel, https://github.com/aakhundov, https://github.com/kadeng	2023-09-12 17:44:32 +00:00
Justin Chu	73e1455327	[BE] Enable ruff's UP rules and autoformat test/ (#105434 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434 Approved by: https://github.com/albanD	2023-07-19 20:36:06 +00:00
Nikita Shulga	fea683491e	Make `torch._dynamo` lazy-importable (#104368 ) Use [PEP-562](https://peps.python.org/pep-0562) to import `_dynamo` and `_inductor` only when needed. - Remove redundant imports from tests - Add `test_lazy_imports_are_lazy` to make sure they will not get imported by accident Pull Request resolved: https://github.com/pytorch/pytorch/pull/104368 Approved by: https://github.com/msaroufim, https://github.com/albanD	2023-06-29 00:51:59 +00:00
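PEP 562 lets a module define a module-level `__getattr__` that is called only when normal attribute lookup fails, which is what makes on-demand submodule imports possible. The following is a minimal, self-contained sketch of that mechanism, not the code from this commit: the package `lazy_pkg`, the attribute `json_tools`, and the `_LAZY` table are hypothetical, and a stdlib module stands in for the lazily imported submodule.

```python
import importlib
import sys
import types

# Hypothetical package object; in a real package this code would live in
# its __init__.py and `__getattr__` would be a plain module-level function.
pkg = types.ModuleType("lazy_pkg")

# Attribute name -> module to import on first access (illustrative mapping).
_LAZY = {"json_tools": "json"}


def _lazy_getattr(name):
    """PEP 562 hook: invoked only when `name` is not found on the module."""
    if name in _LAZY:
        module = importlib.import_module(_LAZY[name])
        setattr(pkg, name, module)  # cache so the hook is not hit again
        return module
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


# Module attribute lookup falls back to `__getattr__` in the module's __dict__.
pkg.__getattr__ = _lazy_getattr
```

Before the first access, `"json_tools"` is absent from `vars(pkg)`; reading `pkg.json_tools` triggers the import and caches the result. A test in the spirit of `test_lazy_imports_are_lazy` can then check that the attribute only appears after it has been explicitly requested.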
Peter Bell	ed2eb13d76	[inductor] Create triton_helpers module for helper functions (#99880 ) This changes codegen of `torch.prod` from: ```python tl.reduce(tmp2, 1, _prod_accumulate)[:, None] ``` where `_prod_accumulate` is defined elsewhere, to ```python triton_helpers.prod(tmp2, 1)[:, None] ``` A quirk I uncovered though is that `TritonCodeCache` breaks if you define any new symbol beginning with `triton_`, since it assumes that must be the kernel name. Instead, I've made the kernel name an explicit argument to `async_compile.triton` so it doesn't have to guess. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99880 Approved by: https://github.com/ngimel	2023-04-27 15:10:50 +00:00

183 Commits