Update ruff to 0.4.1.
This version fixes a lot of false negatives/false positives, is 20-40% faster, and includes various other bug fixes.
Below is a before-and-after table showing the execution time of `ruff lint` and `ruff format` in milliseconds, courtesy of https://astral.sh/blog/ruff-v0.4.0
| Repository | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7 | 251.8 | 351.1 | 274.9 |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
Simplifies and optimizes dict construction using the `fromkeys` classmethod constructor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable.
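For illustration, a minimal sketch of the pattern the rule rewrites (the variable names are made up):
```python
keys = ["a", "b", "c"]

# Before: a dict comprehension that assigns the same static value to every key
before = {k: None for k in keys}

# After: dict.fromkeys makes the shared static value explicit and is faster
after = dict.fromkeys(keys)  # equivalent to dict.fromkeys(keys, None)

assert before == after
```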
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637
Approved by: https://github.com/albanD
This replaces a bunch of unnecessary lambdas with the operator package. This is semantically equivalent, but the operator package is faster and arguably more readable. When the FURB rules are taken out of preview, I will enable them as ruff checks.
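A minimal sketch of the kind of rewrite involved (the example values are made up):
```python
import operator

items = [("b", 2), ("a", 1)]

# Before: a lambda that merely forwards to indexing
by_value = sorted(items, key=lambda item: item[1])

# After: operator.itemgetter is semantically equivalent and faster
by_value_op = sorted(items, key=operator.itemgetter(1))

assert by_value == by_value_op
```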
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116027
Approved by: https://github.com/malfet
Related to the Reproducible Testing BE project. The goal is to print out the sample input that failed an OpInfo test.
Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test.
This solves the problem that the test framework currently has no concept of which sample input is being operated on.
This PR contains the following changes:
* New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure (a minimal sketch of the idea follows after this list)
* The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput`
* To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()`
* The above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, or `OpInfo.conjugate_sample_inputs()`. This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well
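As a rough illustration of the tracking mechanism (a minimal sketch only; aside from `TrackedInputIter` and `TrackedInput`, the names are hypothetical and the real implementation in the test utilities differs in detail):
```python
from dataclasses import dataclass
from typing import Any, Callable, Iterator

@dataclass
class TrackedInput:
    index: int
    val: Any

class TrackedInputIter:
    """Wraps a sample inputs iterator and remembers the most recent item seen."""
    def __init__(self, child_iter: Iterator,
                 callback: Callable[[TrackedInput], None] = lambda tracked: None):
        self.child_iter = enumerate(child_iter)
        # callback that records the TrackedInput somewhere the test framework can
        # find it, e.g. a dict on the test function keyed by the full test ID
        self.callback = callback

    def __iter__(self):
        return self

    def __next__(self):
        idx, item = next(self.child_iter)
        self.callback(TrackedInput(index=idx, val=item))
        return item
```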
Example output when a sample input causes a failure:
```
======================================================================
ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper
return test(*args, **kwargs)
File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn
return fn(slf, *args, **kwargs)
File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo
self.fail('Example failure')
AssertionError: Example failure
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper
method(*args, **kwargs)
File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test
result = test(self, **param_kwargs)
File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper
raise Exception(
Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='')
To execute this test, run the following from the base repo dir:
python test/test_ops.py -k test_foo_add_cpu_uint8
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
----------------------------------------------------------------------
```
This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444
Approved by: https://github.com/janeyx99
#113340 was initially reverted due to a bad default parametrization name. The test looked like this:
```python
@common_utils.parametrize(
"type_fn",
[
type,
lambda obj: obj.__class__,
],
)
def test_access_class_method_from_user_class(self, type_fn):
```
This is a valid parametrization, but results in these default test names:
```bash
❯ pytest test/dynamo/test_export.py -k test_access_class_method_from_user_class --co -q
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn_<class 'type'>
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn_<function ExportTests_<lambda> at 0x7f3be5de0c10>
```
Ignoring the whitespace in the test names, which can lead to other issues down the line, the problem in #113340 was that the lambda parameter included a memory address. IIUC, internally the tests are not collected and run in the same process, meaning the address of the lambda, and in turn the test name, is no longer valid on the runner. This is fixed earlier in the stack by giving the parametrization an explicit name with `subtest`, but this PR is about preventing issues in the default case.
`pytest` solves this by simply using the name of the parameter plus its index as id in the test name:
```python
import pytest
class Foo:
def __repr__(self):
return str(id(self))
@pytest.mark.parametrize(
"bar",
[
pytest.param(type),
pytest.param(lambda obj: obj.__class__),
pytest.param(Foo()),
],
)
def test_foo(bar):
pass
```
```
❯ pytest main.py --co -q
main.py::test_foo[type]
main.py::test_foo[<lambda>]
main.py::test_foo[bar2]
```
`pytest` has better defaults for `type` and `lambda` than we do, but it has a safe default for custom objects.
This PR aligns our default test name with `pytest`. Using the parametrization from above again, we now collect
```bash
❯ pytest test/dynamo/test_export.py -k test_access_class_method_from_user_class --co -q
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn0
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn1
```
which might not be as expressive at first glance, but at least prevents bugs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113856
Approved by: https://github.com/malfet, https://github.com/huydhn
ghstack dependencies: #113855
Summary:
We are planning to lazily initialize CUPTI when profiling is actually performed. Therefore, we need to remove the profiler init dependency on CUPTI Callbacks' RESOURCE_CONTEXT_CREATED.
Instead, we can initialize the profilers during the torch profiler pybind, i.e. `THPAutograd_initExtension()`, and lazily in `profilerStep()`.
Test Plan:
CI and ran internally, see internal diff logs.
Differential Revision: D50894961
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112623
Approved by: https://github.com/albanD
Use [PEP-562](https://peps.python.org/pep-0562) to import `_dynamo` and `_inductor` only when needed.
- Remove redundant imports from tests
- Add `test_lazy_imports_are_lazy` to make sure they will not get imported by accident
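For reference, the module-level `__getattr__` mechanism from PEP 562 looks roughly like this (a simplified sketch, not the exact code added to `torch/__init__.py`):
```python
# torch/__init__.py (sketch)
import importlib

_lazy_submodules = {"_dynamo", "_inductor"}

def __getattr__(name):
    # PEP 562: invoked only when `torch.<name>` is not found via normal lookup,
    # so these submodules are imported on first access rather than at `import torch`.
    if name in _lazy_submodules:
        return importlib.import_module(f".{name}", __name__)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```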
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104368
Approved by: https://github.com/msaroufim, https://github.com/albanD
This changes codegen of `torch.prod` from:
```python
tl.reduce(tmp2, 1, _prod_accumulate)[:, None]
```
where `_prod_accumulate` is defined elsewhere, to
```python
triton_helpers.prod(tmp2, 1)[:, None]
```
A quirk I uncovered though is that `TritonCodeCache` breaks if you
define any new symbol beginning with `triton_`, since it assumes that
must be the kernel name. Instead, I've made the kernel name an
explicit argument to `async_compile.triton` so it doesn't have to guess.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99880
Approved by: https://github.com/ngimel
We had some minimal tests for `torch.testing.make_tensor` before, but nothing exhaustive. This led to quite a few edge cases going undetected. This PR adds comprehensive tests and leaves a few FIXMEs in there for behavior that needs to be fixed in `make_tensor`. This will happen in later commits of this stack, meaning that by the end of this stack, there shouldn't be any FIXMEs left in the tests added here.
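For reference, typical calls exercised by the new tests look roughly like this (illustrative shapes and values only):
```python
import torch
from torch.testing import make_tensor

# float tensor with values bounded by low/high
t = make_tensor((2, 3), dtype=torch.float32, device="cpu", low=-1.0, high=1.0)

# dtypes such as bool are among the edge cases the tests now cover
b = make_tensor((4,), dtype=torch.bool, device="cpu")
```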
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96331
Approved by: https://github.com/mruberry
Applies some more harmless pyupgrade fixes. This one gets rid of deprecated aliases in unit tests and upgrades more `yield` for-loops into `yield from` generators, which are more performant and propagate more information/exceptions from the original generator. This is the modern recommended way of forwarding generators.
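A minimal sketch of the `yield from` upgrade (made-up example):
```python
# Before: forwarding a generator element by element with a for-loop
def chain_old(a, b):
    for x in a:
        yield x
    for x in b:
        yield x

# After: `yield from` delegates directly and also forwards sent values,
# thrown exceptions, and return values from the underlying generators
def chain_new(a, b):
    yield from a
    yield from b

assert list(chain_new([1, 2], [3])) == list(chain_old([1, 2], [3]))
```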
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94309
Approved by: https://github.com/albanD
Continuation of #79979.
Fixes #79161
This PR does the following:
* Expands the `parametrize_fn()` signature from returning a 3-tuple of `(test, test_name, param_kwargs)` to returning a 4-tuple of `(test, test_name, param_kwargs, decorator_fn)`. The expected signature for the addition is `decorator_fn(param_kwargs) -> List[decorator]`, i.e. given the full set of test params, return a list of decorators to apply (a minimal sketch follows after this list).
* `modules`, `ops`, and `parametrize` now fit the new signature, returning `decorator_fn`s instead of applying decorators themselves.
* `instantiate_parametrized_tests()` and `instantiate_device_type_tests()` now call the returned `decorator_fn`, passing in the full set of `param_kwargs` (after composition + `device` / `dtype` additions) and applying the returned decorators.
* Composing multiple `parametrize_fn`s also composes the corresponding `decorator_fn`s; the composed `decorator_fn` simply concatenates the decorator lists returned by the constituents.
* Expands `DecorateInfo.is_active` to support callables:
```python
DecorateInfo(
unittest.expectedFailure, "TestOps", "test_python_ref_executor",
device_type='cuda', active_if=lambda params: params['executor'] == 'nvfuser'
),
```
* Adds several tests to `test/test_testing.py` ensuring proper decoration using `@parametrize`, `@modules`, and `@ops`.
* (minor) Fixes a couple `ModuleInfo` naming oddities uncovered during testing.
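A minimal sketch of the new 4-tuple contract (heavily simplified; the generator signature and helper names below are approximations, not the real machinery in `common_utils`):
```python
import unittest

def example_parametrize_fn(test, generic_cls, device_cls):
    for value in (1, 2):
        test_name = f"{test.__name__}_value_{value}"
        param_kwargs = {"value": value}

        def decorator_fn(params, _value=value):
            # given the full set of test params, return a list of decorators to apply
            return [unittest.expectedFailure] if _value == 2 else []

        yield test, test_name, param_kwargs, decorator_fn
```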
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91658
Approved by: https://github.com/malfet
There was a lot of strangeness in how AOTAutograd backends were previously defined. This refactor replaces the strangeness with something simple and straightforward. The improvements:
- There is no longer a footgun aot_autograd "backend" which doesn't actually work. No more mistyping `torch._dynamo.optimize("aot_autograd")` when you meant "aot_eager"
- Deleted aot_print because it's annoying and anyway there's no uses of it
- Instead of having BOTH the backend Subgraph and AotAutogradStrategy, there is now only an aot_autograd function which takes the kwargs to configure AOTAutograd, and then gives you a compiler function that does AOTAutograd given those kwargs (a rough sketch of this pattern follows below). Easy.
- The primary downside is that we are now eagerly populating all of the kwargs, and that can get us into import cycle shenanigans. Some cycles I resolved directly (e.g., we now no longer manually disable the forward function before passing it to aot_autograd; aot_autograd does it for us), but for getting inductor decompositions I had to make it take a lambda so I could lazily populate the decomps later.
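The pattern described above is roughly the following closure shape (an illustrative sketch, not the actual torch API; names are made up):
```python
# Sketch: configuration kwargs go in once, and out comes a Dynamo-style
# backend function of (graph_module, example_inputs).
def make_aot_autograd_backend(**aot_kwargs):
    def backend(gm, example_inputs):
        # ... run AOTAutograd on `gm` using `aot_kwargs` ...
        return gm.forward  # placeholder for the compiled callable
    return backend

# e.g. an "aot_eager"-like backend would be built by passing eager compilers as kwargs
```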
New code is 130 lines shorter!
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89736
Approved by: https://github.com/anjali411, https://github.com/albanD
Hybrid sparse CSR tensors currently cannot be compared to strided ones since `.to_dense` does not work:
```py
import torch
from torch.testing._internal.common_utils import TestCase
assertEqual = TestCase().assertEqual
actual = torch.sparse_csr_tensor([0, 2, 4], [0, 1, 0, 1], [[1, 11], [2, 12] ,[3, 13] ,[4, 14]])
expected = torch.stack([actual[0].to_dense(), actual[1].to_dense()])
assertEqual(actual, expected)
```
```
main.py:4: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at ../aten/src/ATen/SparseCsrTensorImpl.cpp:54.)
actual = torch.sparse_csr_tensor([0, 2, 4], [0, 1, 0, 1], [[1, 11], [2, 12] ,[3, 13] ,[4, 14]])
Traceback (most recent call last):
File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 1098, in assert_equal
pair.compare()
File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 619, in compare
actual, expected = self._equalize_attributes(actual, expected)
File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 706, in _equalize_attributes
actual = actual.to_dense() if actual.layout != torch.strided else actual
RuntimeError: sparse_compressed_to_dense: Hybrid tensors are not supported
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "main.py", line 10, in <module>
assertEqual(actual, expected)
File "/home/philip/git/pytorch/torch/torch/testing/_internal/common_utils.py", line 2503, in assertEqual
msg=(lambda generated_msg: f"{generated_msg}\n{msg}") if isinstance(msg, str) and self.longMessage else msg,
File "/home/philip/git/pytorch/torch/torch/testing/_comparison.py", line 1112, in assert_equal
) from error
RuntimeError: Comparing
TensorOrArrayPair(
id=(),
actual=tensor(crow_indices=tensor([0, 2, 4]),
col_indices=tensor([0, 1, 0, 1]),
values=tensor([[ 1, 11],
[ 2, 12],
[ 3, 13],
[ 4, 14]]), size=(2, 2, 2), nnz=4,
layout=torch.sparse_csr),
expected=tensor([[[ 1, 11],
[ 2, 12]],
[[ 3, 13],
[ 4, 14]]]),
rtol=0.0,
atol=0.0,
equal_nan=True,
check_device=False,
check_dtype=True,
check_layout=False,
check_stride=False,
check_is_coalesced=False,
)
resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```
This adds a temporary hack to `TestCase.assertEqual` to enable this. Basically, we go through the individual CSR subtensors, call `.to_dense()` on them, and stack everything back together. I opted not to do this in the common machinery, so that users are not affected by this (undocumented) hack.
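The shape of the workaround mirrors the `torch.stack([actual[0].to_dense(), actual[1].to_dense()])` trick from the snippet above (a simplified sketch; the real logic lives inside `TestCase.assertEqual` and handles more cases):
```python
import torch

def densify_hybrid_csr(t: torch.Tensor) -> torch.Tensor:
    # densify each sub-tensor along the first dimension, then stack them back together
    return torch.stack([t[i].to_dense() for i in range(t.shape[0])])
```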
I also added an xfailed test that will trigger as soon as the behavior is supported natively so we don't forget to remove the hack when it is no longer needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88749
Approved by: https://github.com/mruberry, https://github.com/pearu
`Sparsity` as a term doesn't reflect the tools that are developed by the AO team. The `torch/ao/sparsity` module also has utilities for structured pruning, which internally we have always referred to simply as "pruning". To avoid any confusion, we renamed `Sparsity` to `Prune`. We will not be introducing backwards compatibility, as so far this toolset has been kept under silent development.
This change will reflect the changes in the documentation as well.
**TODO:**
- [ ] Change the tutorials
- [ ] Confirm no bc-breakages
- [ ] Reflect the changes in the trackers and RFC docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84867
Approved by: https://github.com/supriyar
Splitting into a separate PR in case of bike shedding. We can't use
the normal fluent syntax `SampleInput(x).name("foo")` because `.name`
is already how the metadata is accessed. So instead, this adds a
single function where you pass keyword arguments to fill in the
metadata, e.g.
```
SampleInput(x).with_metadata(
name="foo", output_process_fn_grad=out_fn)
```
An alternative closer to the normal fluent style would be to adding a
prefix to the property's name, e.g.
```
(SampleInput(x)
.with_name("foo")
.with_output_process_fn_grad(out_fn))
```
However, I have a slight preference for the `with_metadata` style
because you don't need to add extra parentheses to break lines.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85890
Approved by: https://github.com/mruberry
Most SampleInput objects currently have no additional metadata,
meaning they have a 1:1 mapping with a normal function call. This adds
var arg forms of the `SampleInput` constructor such that you can just
call the `SampleInput` constructor as you would call the operator.
So, for example
```python
SampleInput(make_arg(shape), args=(2, 3), kwargs=dict(alpha=4))
```
becomes
```python
SampleInput(make_arg(shape), 2, 3, alpha=4)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85723
Approved by: https://github.com/mruberry
Ref #82518
Starting small to minimize merge conflicts, this moves the top-level
class definitions and some helper functions into the `opinfos` folder.
It also brings `common_methods_invocations.py` to just below 1MB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82540
Approved by: https://github.com/albanD
Lightning callback that enables post-training sparsity.
This callback aims to sparsify the model inside the lightning module after training.
**Note that the model is copied and then sparsified, so the existing model is not modified.**
The sparsified model can be used for comparison and can be accessed using `<callback_obj>.sparsified`.
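Conceptually, the callback does something like the following (a hypothetical, simplified sketch: it assumes a sparsifier object with `prepare`/`step`/`squash_mask` methods and a LightningModule that exposes its network as `pl_module.model`; the real callback under `torch/ao/sparsity/_experimental/data_sparsifier/lightning` differs in detail):
```python
import copy
import pytorch_lightning as pl

class PostTrainingSparsityCallback(pl.Callback):
    def __init__(self, sparsifier, sparsifier_config):
        self.sparsifier = sparsifier
        self.sparsifier_config = sparsifier_config
        self.sparsified = None

    def on_fit_end(self, trainer, pl_module):
        # copy the model so the trained module inside the lightning module is not modified
        self.sparsified = copy.deepcopy(pl_module.model)
        self.sparsifier.prepare(self.sparsified, config=self.sparsifier_config)
        self.sparsifier.step()
        self.sparsifier.squash_mask()
```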
Test Plan:
```
python torch/ao/sparsity/_experimental/data_sparsifier/lightning/tests/test_callbacks.py TestPostTrainingCallback
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80370
Approved by: https://github.com/z-a-f