Commit Graph

171 Commits

Author SHA1 Message Date
cyy
df458be4e5 [4/N] Apply py39 ruff and pyupgrade fixes (#143257)
```torch/fx/passes/annotate_getitem_nodes.py``` was changed to support the new type hinting annotations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257
Approved by: https://github.com/justinchuby, https://github.com/albanD
2025-01-04 10:47:51 +00:00
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
1a44f01beb [ci, 3.13] update test_testing.py usage of locals() for 3.13 (#141577)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141577
Approved by: https://github.com/StrongerXi, https://github.com/atalman
ghstack dependencies: #141409, #142003, #141572
2024-12-05 00:24:14 +00:00
23d590e518 More flexible test parametrization with @reparametrize (#138369)
**Background:** The `@parametrize` decorator enjoys widespread usage as a convenient tool for ensuring extensive test coverage. One particular feature that makes this easy is the ability to stack such decorators, testing over the cross-product of inputs. Example:
```python
class MyTestClass(TestCase):
    @parametrize("x", range(3))
    @parametrize("y", [False, True])
    def test_foo(self, x, y):
        # Invoked with:
        # x=0, y=False
        # x=1, y=False
        # x=2, y=False
        # x=0, y=True
        # x=1, y=True
        # x=2, y=True
        ...
```

Note that the `@ops` and `@modules` decorators employ the same underlying machinery for parametrizing over `OpInfo` / `ModuleInfo` entries. These decorators also parametrize over op-specific `device` / `dtype` info *according to what is supported for each op*.
```python
class MyTestClass(TestCase):
    @ops(op_db)
    def test_foo(self, op, device, dtype):
        # Invoked each OpInfo in the db along with each device / dtype that corresponds
        # with this op according to the OpInfo entry.
        ...
```

Note that this in contrast to the naive cross product between ops and devices / dtypes, which would generate too many tests. Certain use cases benefit from a similar type of flexible parametrization that is more intelligent than simple cross-product composition. It is expensive to generate / run too many tests, even if the unneeded ones are skipped appropriately.

This PR attempts to generalize such flexible parametrization and satisfy these use cases through the introduction of a `@reparametrize` decorator, which operates on an existing parametrizer and allows for customized on-the-fly parametrization through the use of an `adapter_fn`. Examples:
```python
# adapter_fn that adds a new arg
 def include_is_even_arg(test_name, param_kwargs):
    x = param_kwargs["x"]
    is_even = x % 2 == 0
    new_param_kwargs = dict(param_kwargs)
    new_param_kwargs["is_even"] = is_even
    is_even_suffix = "_even" if is_even else "_odd"
    new_test_name = f"{test_name}{is_even_suffix}"
    yield (new_test_name, new_param_kwargs)

# adapter_fn that excludes certain values
def exclude_odds(test_name, param_kwargs):
    x = param_kwargs["x"]
    is_even = x % 2 == 0
    yield None if not is_even else (test_name, param_kwargs)

class MyTestClass(TestCase):
    @reparametrize(parametrize("x", range(5)), include_is_even_arg)
    def test_foo(self, x, is_even):
        # Invoked with both the x value and the new is_even arg
        ...

    @reparametrize(parametrize("x", range(5)), exclude_odds)
    def test_bar(self, x):
        # Only invoked with even x values
        ...
```

For a more real-world use case, imagine you want to write a set of OpInfo tests that parametrize over additional op-specific things beyond `device` / `dtype` (in NJT's case, this includes contiguity type, whether to operate over the batch / ragged / other dims, etc.). The `@reparametrize` decorator allows you to customize the `@ops` parametrization to add in these additional args as they make sense on a per-op basis.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138369
Approved by: https://github.com/janeyx99
2024-10-29 22:14:38 +00:00
195d0a666b [BE][Ez]: Use interned hardcoded string FURB156 (#138330)
Uses string constants from string module.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138330
Approved by: https://github.com/albanD
2024-10-18 18:26:16 +00:00
8d869c9ec7 Skip test_circular_dependencies on ROCm (#138312)
The test is flaky on ROCm and has been disabled for quite a while https://github.com/pytorch/pytorch/issues/110040.  The disabled issue was opened and then closed several times, so it's better to close that issue and skip the test here.

(Not really fix the issue, I just want the test to be skipped on PR instead of being disabled, then close the issue)
Fixes https://github.com/pytorch/pytorch/issues/110040

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138312
Approved by: https://github.com/jithunnair-amd, https://github.com/clee2000
2024-10-18 18:17:48 +00:00
758a0a88a2 [BE][Easy] enable ruff rule PIE790: unnecessary pass statement (#133200)
This PR removes unnecessary `pass` statement. This is semanticly safe because the bytecode for the Python code does not change.

Note that if there is a docstring in the function, a empty function does not need a `pass` statement as placeholder.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133200
Approved by: https://github.com/malfet, https://github.com/eqy, https://github.com/kit1980
2024-08-15 15:50:19 +00:00
57d1ffc512 Ignore torch.onnx._internal in test_circular_dependencies (#133110)
Ignore the whole `_internal` module as code will depend on onnxscript and onnx.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133110
Approved by: https://github.com/titaiwangms, https://github.com/malfet
2024-08-15 15:37:24 +00:00
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
d2bd9acabd [BE] bump optree version to 0.12.1 (#130139)
0.12.0 Major Updates:

- Add context manager to temporarily set the dictionary sorting mode
- Add accessor APIs
- Use `stable` tag for `pybind11` for Python 3.13 support
- Fix potential segmentation fault for pickling support

0.12.1 Updates:

- Fix warning regression during import when launch with strict warning filters

Closes #130155
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130139
Approved by: https://github.com/zou3519
ghstack dependencies: #130895
2024-07-20 02:41:10 +00:00
074a5c0c9b Revert "[BE] bump optree version to 0.12.1 (#130139)"
This reverts commit 8fcb156e8b5697a8f292db6db2a1803c5f4ce2d7.

Reverted https://github.com/pytorch/pytorch/pull/130139 on behalf of https://github.com/clee2000 due to broke inductor/test_torchinductor_codegen_dynamic_shapes.py and test_sympy_utils.py 8fcb156e8b ([comment](https://github.com/pytorch/pytorch/pull/130139#issuecomment-2229248447))
2024-07-15 19:42:11 +00:00
8fcb156e8b [BE] bump optree version to 0.12.1 (#130139)
0.12.0 Major Updates:

- Add context manager to temporarily set the dictionary sorting mode
- Add accessor APIs
- Use `stable` tag for `pybind11` for Python 3.13 support
- Fix potential segmentation fault for pickling support

0.12.1 Updates:

- Fix warning regression during import when launch with strict warning filters

Closes #130155
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130139
Approved by: https://github.com/zou3519
2024-07-15 17:27:07 +00:00
973037be6a [BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): list() / tuple() / dict() (#130199)
This PR changes the empty collection factory call to Python literals:

- `list()` -> `[]`
- `tuple()` -> `()`
- `dict()` -> `{}`

The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary:

```bash
$ python3 -m dis - <<EOS
import collections

d1 = {}
d2 = dict()

dict = collections.OrderedDict
d3 = dict()
EOS
```

```text
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 (0)
              4 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (collections)
              8 STORE_NAME               0 (collections)

  3          10 BUILD_MAP                0
             12 STORE_NAME               1 (d1)

  4          14 PUSH_NULL
             16 LOAD_NAME                2 (dict)
             18 CALL                     0
             26 STORE_NAME               3 (d2)

  6          28 LOAD_NAME                0 (collections)
             30 LOAD_ATTR                8 (OrderedDict)
             50 STORE_NAME               2 (dict)

  7          52 PUSH_NULL
             54 LOAD_NAME                2 (dict)
             56 CALL                     0
             64 STORE_NAME               5 (d3)
             66 RETURN_CONST             1 (None)
```

The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199
Approved by: https://github.com/malfet
2024-07-11 17:30:28 +00:00
90d5a6f001 [inductor] Add lowering and codegen for aten.sort (#128458)
Closes #125633

Benchmarks:
| Shape       | dim | stable | compiled | eager   | speedup |
|-------------|-----|--------|----------|---------|---------|
| (256, 4096) | 0   | False  | 0.73 ms  | 1.26 ms | 1.7     |
| (256, 4096) | 0   | True   | 0.75 ms  | 1.27 ms | 1.7     |
| (4096, 256) | 1   | False  | 0.20 ms  | 0.73 ms | 3.7     |
| (4096, 256) | 1   | True   | 0.21 ms  | 0.73 ms | 3.5     |
| (255, 4096) | 0   | False  | 1.05 ms  | 1.48 ms | 1.4     |
| (255, 4096) | 0   | True   | 1.03 ms  | 1.47 ms | 1.4     |
| (4096, 255) | 1   | False  | 0.52 ms  | 0.98 ms | 1.9     |
| (4096, 255) | 1   | True   | 0.54 ms  | 1.00 ms | 1.9     |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128458
Approved by: https://github.com/lezcano, https://github.com/eellison
2024-06-26 01:36:39 +00:00
01601ebd41 Retire torch.distributed.pipeline (#127354)
Actually retiring module after deprecation warning for a while.
The new supported module is: torch.distributed.pipelining.
Please migrate.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127354
Approved by: https://github.com/wconstab
2024-06-07 08:11:58 +00:00
0ff60236ab Revert "Retire torch.distributed.pipeline (#127354)"
This reverts commit b9c058c203ee38032594f898f27cd8404f113a63.

Reverted https://github.com/pytorch/pytorch/pull/127354 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the doc build failure looks legit b9c058c203 ([comment](https://github.com/pytorch/pytorch/pull/127354#issuecomment-2148133982))
2024-06-04 18:19:31 +00:00
b9c058c203 Retire torch.distributed.pipeline (#127354)
Actually retiring module after deprecation warning for a while.
The new supported module is: torch.distributed.pipelining.
Please migrate.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127354
Approved by: https://github.com/wconstab
2024-06-04 07:03:26 +00:00
5a1216bb2e [BE]: Update ruff to 0.4.1 (#124549)
Update ruff to 0.4.1 .
This version fixes a lot false negatives/false positives, is 20-40% faster, and has various other bug fixes.

Below is a before and after table showing the execution time of ruff lint and ruff format in milliseconds courtesy of https://astral.sh/blog/ruff-v0.4.0

| Repository                                         | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7         | 251.8         | 351.1            | 274.9            |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
2024-04-21 14:06:23 +00:00
8aad72b0d3 Support all unsigned int sizes on unique (#123643)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123643
Approved by: https://github.com/albanD, https://github.com/kit1980
2024-04-11 06:50:12 +00:00
1562dae62c [BE]: Apply RUF025 dict.fromkeys preview rule (#118637)
Simplifies and optimizes dict construction using the `fromkeys` classmethod ctor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637
Approved by: https://github.com/albanD
2024-01-30 20:46:54 +00:00
a357a0f315 Back out "[Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623)" (#116201)
Summary:
This diff needs to be backed out because TorchBench llama_v2_7b_16h has a cublas init error.
https://github.com/pytorch/benchmark/actions/runs/7266269668/job/19797677485?pr=2095

Test Plan: CI

Differential Revision: D52339142

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116201
Approved by: https://github.com/xuzhao9
2023-12-21 16:32:19 +00:00
6de28e92d2 [BE]: Apply FURB118 (prev): replaces unnecessary lambdas with operator. (#116027)
This replaces a bunch of unnecessary lambdas with the operator package. This is semantically equivalent, but the operator package is faster, and arguably more readable. When the FURB rules are taken out of preview, I will enable it as a ruff check.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116027
Approved by: https://github.com/malfet
2023-12-20 19:35:08 +00:00
afdc528520 Print the index and summary of the SampleInput that failed an OpInfo test (#99444)
Related to the Reproducible Testing BE project. Goal is to print out the sample input that failed an OpInfo test.

Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test.

This solves the problem that the test framework currently has no concept of which sample input is being operated on.

This PR contains the following changes:
* New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure
    * The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput`
* To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()`
* Above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, and `OpInfo.conjugate_sample_inputs()`. This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well

Example output when a sample input causes a failure:
```
======================================================================
ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper
    return test(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn
    return fn(slf, *args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo
    self.fail('Example failure')
AssertionError: Example failure

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper
    method(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper
    raise Exception(
Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='')

To execute this test, run the following from the base repo dir:
     python test/test_ops.py -k test_foo_add_cpu_uint8

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
```

This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444
Approved by: https://github.com/janeyx99
2023-11-21 23:08:35 +00:00
5f0d72124e Revert "Print the index and summary of the SampleInput that failed an OpInfo test (#99444)"
This reverts commit e7f12b1eb0cedfd20dcb41ea35e21e9a71e3390a.

Reverted https://github.com/pytorch/pytorch/pull/99444 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause memory leak on CUDA job e7f12b1eb0 ([comment](https://github.com/pytorch/pytorch/pull/99444#issuecomment-1820491298))
2023-11-21 08:58:54 +00:00
e7f12b1eb0 Print the index and summary of the SampleInput that failed an OpInfo test (#99444)
Related to the Reproducible Testing BE project. Goal is to print out the sample input that failed an OpInfo test.

Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test.

This solves the problem that the test framework currently has no concept of which sample input is being operated on.

This PR contains the following changes:
* New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure
    * The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput`
* To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()`
* Above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, and `OpInfo.conjugate_sample_inputs()`. This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well

Example output when a sample input causes a failure:
```
======================================================================
ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper
    return test(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn
    return fn(slf, *args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo
    self.fail('Example failure')
AssertionError: Example failure

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper
    method(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper
    raise Exception(
Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='')

To execute this test, run the following from the base repo dir:
     python test/test_ops.py -k test_foo_add_cpu_uint8

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
```

This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444
Approved by: https://github.com/janeyx99
2023-11-21 00:11:20 +00:00
769f924bc6 robustify parametrize default name (#113856)
#113340 was reverted initially due to a bad default parametrization name. The test looked like

```python
@common_utils.parametrize(
    "type_fn",
    [
        type,
        lambda obj: obj.__class__,
    ],
)
def test_access_class_method_from_user_class(self, type_fn):
```

This is a valid parametrization, but results in these default test names:

```bash
❯ pytest test/dynamo/test_export.py -k test_access_class_method_from_user_class --co -q
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn_<class 'type'>
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn_<function ExportTests_<lambda> at 0x7f3be5de0c10>
```

Ignoring the whitespace in the test names, which can lead to other issues down the line, the problem in #113340 was that the lambda parameter included a memory address. IIUC, internally, the tests are not collected and run in the same process. Meaning, the address of the lambda and in turn the test name is no longer valid on the runner. This is fixed earlier in the stack by giving the parametrization an explicit name with `subtest`, but this PR is about preventing issues in the default case.

`pytest` solves this by simply using the name of the parameter plus its index as id in the test name:

```python
import pytest

class Foo:
    def __repr__(self):
        return str(id(self))

@pytest.mark.parametrize(
    "bar",
    [
        pytest.param(type),
        pytest.param(lambda obj: obj.__class__),
        pytest.param(Foo()),
    ],
)
def test_foo(bar):
    pass
```

```
❯ pytest main.py --co -q
main.py::test_foo[type]
main.py::test_foo[<lambda>]
main.py::test_foo[bar2]
```

`pytest` has better defaults for `type` and `lambda` than we do, but is has a safe default for custom objects.

This PR aligns our default test name with `pytest`. Using the parametrization from above again, we now collect

```bash
❯ pytest test/dynamo/test_export.py -k test_access_class_method_from_user_class --co -q
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn0
test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_type_fn1
```

which might not be as expressive at first glance, but at least prevents bugs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113856
Approved by: https://github.com/malfet, https://github.com/huydhn
ghstack dependencies: #113855
2023-11-16 23:25:04 +00:00
12b2dd16b0 [Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623)
Summary:
We are planning to lazily initialize CUPTI when profiling is actually performed. Therefore, we need to remove profiler init dependency on CUPTI Callbacks' RESOURCE_CONTEXT_CREATED.

Instead, we can initialize the profilers during torch profiler pybind, ie. THPAutograd_initExtension() and lazily in profilerStep().

Test Plan:
CI and ran internally, see internal diff logs.

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112623
Approved by: https://github.com/albanD
2023-11-15 20:26:13 +00:00
5465f2bb6c Revert "Improves comparison of state dicts for Checkpoint E2E Tests (#113181)"
This reverts commit 8f5fead86ea9a9eac85d20c6aee780e06ce04eb7.

Reverted https://github.com/pytorch/pytorch/pull/113181 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing distribute test in trunk 8f5fead86e with a not defined DTensor error ([comment](https://github.com/pytorch/pytorch/pull/113181#issuecomment-1810925052))
2023-11-14 18:42:40 +00:00
8f5fead86e Improves comparison of state dicts for Checkpoint E2E Tests (#113181)
Addresses the following comment - https://github.com/pytorch/pytorch/pull/112541#discussion_r1380197424

Changes the comparison of models in the checkpointing E2E test to compare a non-parallelized model against distribued model after training, saving, & loading.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113181
Approved by: https://github.com/fegin
2023-11-14 14:54:40 +00:00
4916a7e94f Revert "[Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623)"
This reverts commit a62a88bb84f633581242bd0107e01d2a075884a3.

Reverted https://github.com/pytorch/pytorch/pull/112623 on behalf of https://github.com/huydhn due to This break TestCuda::test_lazy_init on ROCm ([comment](https://github.com/pytorch/pytorch/pull/112623#issuecomment-1806597750))
2023-11-11 00:35:56 +00:00
a62a88bb84 [Kineto] Initialize libkineto profilers during torch init process during pybind set-up (#112623)
Summary:
We are planning to lazily initialize CUPTI when profiling is actually performed. Therefore, we need to remove profiler init dependency on CUPTI Callbacks' RESOURCE_CONTEXT_CREATED.

Instead, we can initialize the profilers during torch profiler pybind, ie. THPAutograd_initExtension() and lazily in profilerStep().

Test Plan:
CI and ran internally, see internal diff logs.

Differential Revision: D50894961

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112623
Approved by: https://github.com/albanD
2023-11-10 20:50:54 +00:00
328a4c5475 [BE] Enhance OpInfo.supported_dtype (#111995)
Current implementation is prone to errors, as it accepts any object, but does not print an error or something if device_type is not recognized.

Remediate it by accepting both device-type and device identifies (either `torch.device` instance or "{device_type}:{ordinal}" string

Fixes https://github.com/pytorch/pytorch/issues/111179

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111995
Approved by: https://github.com/albanD
2023-10-27 19:42:01 +00:00
acd02a60d5 Add a test making sure we are not importing SymPy when importing torch (#112038)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112038
Approved by: https://github.com/malfet, https://github.com/peterbell10
ghstack dependencies: #112035, #112036, #112037
2023-10-26 23:32:27 +00:00
42e4c648a2 New @decorateIf decorator for param-specific conditional decoration (#112033)
Adds a new decorator `@decorateIf(decorator, predicate_fn)`. Examples:
```python
from torch.testing._internal.common_utils import decorateIf
...

@decorateIf(unittest.skip, lambda params: params["x"] == 2)
@parametrize("x", range(5))
def test_foo(self, x):
    ...

@parametrize("x,y", [(1, 'foo'), (2, 'bar'), (3, 'baz')])
@decorateIf(
    unittest.expectedFailure,
    lambda params: params["x"] == 3 and params["y"] == "baz"
)
def test_bar(self, x, y):
    ...

@decorateIf(
    unittest.expectedFailure,
    lambda params: params["op"].name == "add" and params["dtype"] == torch.float16
)
@ops(op_db)
def test_op_foo(self, device, dtype, op):
    ...

@decorateIf(
    unittest.skip,
    lambda params: params["module_info"].module_cls is torch.nn.Linear and \
        params["device"] == "cpu"
)
@modules(module_db)
def test_module_foo(self, device, dtype, module_info):
    ...
```

Follow-up for per-param decoration based on https://github.com/pytorch/pytorch/issues/79161#issuecomment-1152487359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112033
Approved by: https://github.com/clee2000, https://github.com/pmeier
2023-10-26 14:39:59 +00:00
a14761b68a [Inductor CUTLASS backend] Step 1: Inductor config for cuda / cutlass, util functions. (#107802)
This is the step 1 to add cutlass as an alternative inductor backend.
Feature request: https://github.com/pytorch/pytorch/issues/106991.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107802
Approved by: https://github.com/jansel, https://github.com/aakhundov, https://github.com/kadeng
2023-09-12 17:44:32 +00:00
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00
fea683491e Make torch._dynamo lazy-importable (#104368)
Use [PEP-562](https://peps.python.org/pep-0562) to import `_dynamo` and `_inductor` only when needed.

- Remove redundant imports from tests
- Add `test_lazy_imports_are_lazy` to make sure they will not get imported by accident

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at bae8e90</samp>

> _Sing, O Muse, of the daring deeds of PyTorch, the swift and fiery_
> _framework of deep learning, that with skill and cunning wrought_
> _many wonders of dynamic compilation, using the hidden powers_
> _of `_dynamo` and `_inductor`, the secret modules of LLVM and MLIR._

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104368
Approved by: https://github.com/msaroufim, https://github.com/albanD
2023-06-29 00:51:59 +00:00
ed2eb13d76 [inductor] Create triton_helpers module for helper functions (#99880)
This changes codegen of `torch.prod` from:
```python
   tl.reduce(tmp2, 1, _prod_accumulate)[:, None]
```
where `_prod_accumulate` is defined elsewhere, to

```python
   triton_helpers.prod(tmp2, 1)[:, None]
```

A quirk I uncovered though is that `TritonCodeCache` breaks if you
define any new symbol beginning with `triton_`, since it assumes that
must be the kernel name. Instead, I've made the kernel name an
explicit argument to `async_compile.triton` so it doesn't have to guess.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99880
Approved by: https://github.com/ngimel
2023-04-27 15:10:50 +00:00
ecf08a0f8b [ROCm] Enable test_filtering_env_var (#84100)
The test "test_filtering_env_var" requires CPU device_type along with GPU.
Hence enable both device_types for ROCm, since the PYTORCH_TESTING_DEVICE_ONLY_FOR env var will have the same effect as the code being removed, making the latter redundant anyway:
9e81c0c3f4/.jenkins/pytorch/test.sh (L54-L59)
9e81c0c3f4/torch/testing/_internal/common_device_type.py (L626-L634)

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Enables the test disabled by #56178

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84100
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2023-04-04 21:49:53 +00:00
129e03905d disallow invalid value ranges in torch.testing.make_tensor (#96334)
Fixes #96179.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96334
Approved by: https://github.com/mruberry
2023-03-24 23:55:17 +00:00
47bfb192a7 deprecate low==high in torch.testing.make_tensor (#96333)
Addresses #96179.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96333
Approved by: https://github.com/mruberry
2023-03-24 23:55:17 +00:00
76fb9a1c7f fix low and high in torch.testing.make_tensor for integral inputs (#96124)
Fixes #96178.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96124
Approved by: https://github.com/mruberry
2023-03-24 23:55:17 +00:00
9029361f24 honor low and high for torch.bool in torch.testing.make_tensor (#96332)
Fixes #96101.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96332
Approved by: https://github.com/mruberry
2023-03-24 23:55:17 +00:00
303eb37e38 QoL improvements for torch.testing.make_tensor (#96125)
Per title. The major ones:

- Enforce keyword only parameters for `_modify_low_high`, which takes 7 parameters.
  28aa2efd14/torch/testing/_creation.py (L147)
  is just impossible to comprehend without multiple trips back and forth.
- Improve the error messages by including the offending values in the message

I'll highlight the smaller ones inline.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96125
Approved by: https://github.com/mruberry
2023-03-24 23:55:17 +00:00
090af4aa71 add proper tests for torch.testing.make_tensor (#96331)
We had some minimal tests for `torch.testing.make_tensor` before, but nothing exhaustive. This lead to quite few edge cases being undetected. This PR adds comprehensive tests and leaves a few FIXMEs in there for behavior that needs to be fixed in `make_tensor`. This will happen in later commits of this stack. Meaning, at the end of this stack, there shouldn't be any FIXME left in the tests added here.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96331
Approved by: https://github.com/mruberry
2023-03-24 23:55:17 +00:00
8aa34602f7 Jetson Update for CI Redo (#94549)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94549
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-02-21 17:13:38 +00:00
889a4640a0 [ONNX] Skip import test for experimental files (#94552)
`torch.onnx._internal.fx` is experimental and is not imported when `import torch`/`import torch.onnx`.
Need to skip it in this test as it depends on `onnx-script`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94552
Approved by: https://github.com/kit1980
2023-02-10 15:58:49 +00:00
748bac8757 [BE]: Apply pyupgrade yield from and unit test alias upgrades (#94309)
Applies some more harmless pyupgrades. This one gets rid of deprecated aliases in unit_tests and more upgrades yield for loops into yield from generators which are more performance and propagates more information / exceptions from original generator. This is the modern recommended way of forwarding generators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94309
Approved by: https://github.com/albanD
2023-02-07 20:08:58 +00:00
8612ec5b90 Implement hybrid sparse to/from dense conversions. (#90177)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90177
Approved by: https://github.com/cpuhrsch, https://github.com/pearu
2023-01-12 03:31:30 +00:00
1effabe257 Support per-parameter test decoration (#91658)
Continuation of #79979.

Fixes #79161

This PR does the following:
* Expands the `parametrize_fn()` signature from returning a 3-tuple of `(test, test_name, param_kwargs)` to returning a 4-tuple of `(test, test_name, param_kwargs, decorator_fn)`. Expected signature for the addition is `decorator_fn(param_kwargs) -> List[decorator]` i.e. given the full set of test params, return a list of decorators to apply.
    * `modules`, `ops`, and `parametrize` now fit the new signature, returning `decorator_fn`s instead of applying decorators themselves.
    * `instantiate_parametrized_tests()` and `instantiate_device_type_tests()` now call the returned `decorator_fn`, passing in the full set of `param_kwargs` (after composition + `device` / `dtype` additions) and applying the returned decorators.
    * Composing multiple `parametrize_fn`s also composes the corresponding `decorator_fn`s; the composed `decorator_fn` simply concatenates the decorator lists returned by the constituents.
* Expands `DecorateInfo.is_active` to support callables:
```python
DecorateInfo(
    unittest.expectedFailure, "TestOps", "test_python_ref_executor",
    device_type='cuda', active_if=lambda params: params['executor'] == 'nvfuser'
),
```
* Adds several tests to `test/test_testing.py` ensuring proper decoration using `@parametrize`, `@modules`, and `@ops`.
* (minor) Fixes a couple `ModuleInfo` naming oddities uncovered during testing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91658
Approved by: https://github.com/malfet
2023-01-04 21:08:32 +00:00