17687eb792
[BE][4/6] fix typos in test/ (test/inductor/) ( #157638 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157638
Approved by: https://github.com/yewentao256 , https://github.com/jansel
2025-07-06 06:34:25 +00:00
2344eca5eb
Revert "Fix skipIfXpu and skipIfHpu disables tests when used on class ( #151315 )"
...
This reverts commit ee096b89f63394b2c18826288783eef241f3959c.
Reverted https://github.com/pytorch/pytorch/pull/151315 on behalf of https://github.com/jeanschmidt due to Seems to have introduced internal regressions, see [D74668899](https://www.internalfb.com/diff/D74668899 ). @malfet may you help the author get this PR merged? ([comment](https://github.com/pytorch/pytorch/pull/151315#issuecomment-2880203323 ))
2025-05-14 13:15:03 +00:00
ee096b89f6
Fix skipIfXpu and skipIfHpu disables tests when used on class ( #151315 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151315
Approved by: https://github.com/Skylion007 , https://github.com/malfet
2025-05-13 14:44:17 +00:00
d8c8ba2440
Fix unused Python variables in test/[e-z]* ( #136964 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby , https://github.com/albanD
2024-12-18 23:02:30 +00:00
cc6c248919
[Inductor UT] Generalize newly introduced inductor UTs for intel GPU (Part 2) ( #136856 )
...
[Inductor UT] Generalize newly introduced inductor UTs for Intel GPU
reuse `test/inductor/test_inductor_freezing.py`
reuse `test/inductor/test_layout_optim.py`
reuse `test/inductor/test_loop_ordering.py`
reuse `test/inductor/test_memory_planning.py`
reuse `test/inductor/test_padding.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136856
Approved by: https://github.com/EikanWang , https://github.com/etaf , https://github.com/jansel
2024-10-18 03:58:00 +00:00
920f0426ae
Add None return type to init -- tests rest ( #132376 )
...
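For reference, this is the annotation style being applied (a minimal sketch; the class and attribute names are only illustrative):
```python
class Example:
    def __init__(self, value: int) -> None:  # explicit None return type on __init__
        self.value = value
```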
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376
Approved by: https://github.com/jamesjwu
ghstack dependencies: #132335 , #132351 , #132352
2024-08-01 15:44:51 +00:00
134bc4fc34
[BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ ( #129763 )
...
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501 . Most changes are auto-generated by the linter.
You can review these PRs via:
```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```
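As a rough illustration of the import-segment style being enforced (a sketch only; the exact grouping is decided by the linter), imports are separated into stdlib, third-party, and test-internal segments with a single blank line between segments:
```python
# stdlib imports
import os
import sys

# third-party imports
import torch

# project/test-internal imports
from torch.testing._internal.common_utils import run_tests
```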
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129763
Approved by: https://github.com/jansel
2024-07-18 07:49:19 +00:00
b732b52f1e
Revert "[BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ ( #129763 )"
...
This reverts commit aecc746fccc4495313167e3a7f94210daf457e1d.
Reverted https://github.com/pytorch/pytorch/pull/129763 on behalf of https://github.com/XuehaiPan due to need reland after rerunning lintrunner on main ([comment](https://github.com/pytorch/pytorch/pull/129763#issuecomment-2235736732 ))
2024-07-18 06:39:58 +00:00
aecc746fcc
[BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ ( #129763 )
...
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501 . Most changes are auto-generated by the linter.
You can review these PRs via:
```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129763
Approved by: https://github.com/jansel
2024-07-18 05:13:41 +00:00
4cd503c1f3
Enable FX graph cache for a batch of inductor tests ( #121696 )
...
Summary: Get more FX graph cache coverage by enabling it for these unit tests
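A minimal sketch of how the cache can be toggled for a compile, assuming the `fx_graph_cache` inductor config flag from this timeframe (the test harness may enable it differently, e.g. via an environment variable):
```python
import torch
import torch._inductor.config as inductor_config

inductor_config.fx_graph_cache = True  # assumption: flag name for the FX graph cache

@torch.compile
def f(x):
    return x.sin() + x.cos()

f(torch.randn(8))  # first compile populates the cache; identical recompiles can hit it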
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121696
Approved by: https://github.com/eellison
2024-03-14 03:39:59 +00:00
1ce5049692
[inductor] fix the layout problem for nll_loss2d_backward ( #121173 )
...
Fixes https://github.com/pytorch/pytorch/issues/120759 .
The CUDA implementation of nll_loss2d_backward.default requires the 'self' tensor to be contiguous. This implicit assumption may be broken by layout optimizations. The fix here is to add the constraint when we explicitly define the fallback for the op.
Not sure if we can improve the CUDA kernel to relax the constraint, though.
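An eager-mode sketch of the underlying requirement (illustrative only; it does not show the inductor fallback mechanism itself, and the shapes are hypothetical): layout optimization may hand the backward a channels-last tensor, and the constraint amounts to forcing a contiguous layout before the op runs.
```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 3, 4, 5, device="cuda", requires_grad=True)
# A channels-last (non-contiguous) log-probability tensor reaching
# nll_loss2d_backward is what the added constraint guards against.
x_cl = x.to(memory_format=torch.channels_last)
target = torch.randint(0, 3, (2, 4, 5), device="cuda")

# Forcing a contiguous layout before the op mirrors the constraint added in the PR.
loss = F.nll_loss(F.log_softmax(x_cl.contiguous(), dim=1), target)
loss.backward()
```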
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121173
Approved by: https://github.com/jansel , https://github.com/desertfire
2024-03-07 09:05:07 +00:00
86e6497c6f
[Inductor][cuDNN] Disable tf32 in test_mutate_view_for_conv_output ( #120953 )
...
Another disablement of TF32 to unblock #120642
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120953
Approved by: https://github.com/Skylion007
2024-03-01 05:51:29 +00:00
d1d50d2e4c
[Inductor][cuDNN] Disable tf32 in test_mutate_base_for_conv_output ( #120867 )
...
Looks like there is a sum(?) comparison where TF32 may not provide the necessary accuracy, leading to failures on sm86.
CC @Skylion007 , hopefully this unblocks #120642
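For context, disabling TF32 with the public backend flags looks roughly like this (a sketch; the test itself may use a decorator or context manager instead):
```python
import torch

# Force full fp32 precision for matmuls and cuDNN convolutions so small
# numerical differences do not exceed the test's tolerances on Ampere (sm86).
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False
```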
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120867
Approved by: https://github.com/Skylion007
2024-02-29 06:59:32 +00:00
1e60174891
Revert "[dynamo] Add run_inductor_tests entrypoint ( #113278 )"
...
This reverts commit b00311ce9e430cf1b98d2103e21ed2179450a424.
Reverted https://github.com/pytorch/pytorch/pull/113278 on behalf of https://github.com/huydhn due to Sorry for reverting your stack, but it is failing to list test internally with buck2 ([comment](https://github.com/pytorch/pytorch/pull/113278#issuecomment-1811646325 ))
2023-11-15 01:19:48 +00:00
b00311ce9e
[dynamo] Add run_inductor_tests entrypoint ( #113278 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113278
Approved by: https://github.com/yanboliang
2023-11-11 08:54:43 +00:00
c9a806be28
[ROCm] enable additional inductor/dynamo UTs ( #104624 )
...
Enables additional inductor UTs on ROCm and removes outdated skips.
I have also removed a group of failures in `test_torchinductor_opinfo` that now pass for both CUDA and ROCm:
```
- # The following 3 tests fail on CUDA with AssertionError: expected size 5==5, stride 5==1 at dim=0
- # linalg._svd's return value has different strides on CUDA vs CPU which causes this
- # In test_meta.py there is a mechanism to skipping strides checks for some ops
- # (including _linalg_svd), possibly we should have something similar here
- "linalg.cond": {f32, f64},
- "linalg.svdvals": {f32, f64},
- "linalg.matrix_rank": {f32, f64},
- "linalg.svd": {f32, f64},
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104624
Approved by: https://github.com/malfet
2023-07-11 20:44:02 +00:00
98f00f881f
[inductor] convert layout of conv weight ahead of time for inference ( #103642 )
...
This PR handles inference; a similar change for training will follow later.
Some manual testing results show this can improve inference perf by 2-3% (absolute improvement, not relative):
- convmixer: 4.285x -> 4.309x
- resnet50: 2.170x -> 2.203x
The PR is built upon freezing: without freezing, the weight input of a conv node may not be a parameter directly but rather the output of precision-converting ops, so it is much easier to implement this after freezing.
Command:
```
TORCHINDUCTOR_FREEZING=1 python benchmarks/dynamo/timm_models.py --backend inductor --amp --performance --only convmixer_768_32 --inference
```
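A rough sketch of exercising this path (the model and shapes are placeholders; freezing is enabled via the `TORCHINDUCTOR_FREEZING=1` environment variable shown in the command above):
```python
# Run with: TORCHINDUCTOR_FREEZING=1 python <script>.py
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()).eval()
compiled = torch.compile(model)

x = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    # With freezing, the conv weight becomes a constant, so its layout can be
    # converted to channels last once at compile time instead of at every call.
    out = compiled(x)
```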
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103642
Approved by: https://github.com/eellison
2023-06-28 17:42:32 +00:00
ec922efe3b
[inductor] fix a failed test for layout optimization ( #103984 )
...
Summary:
The test fails because a fixed port is used to initialize the process group, which does not work in stress tests when multiple instances of the test run concurrently.
Pick a random port and retry a few times.
Test Plan:
```
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/inductor:layout_optim -- --exact 'caffe2/test/inductor:layout_optim - test_mutate_view (caffe2.test.inductor.test_layout_optim.TestLayoutOptim)' --run-disabled --jobs 18 --stress-runs 10 --record-results
```
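A minimal sketch of the port-selection strategy (the helper name and backend here are illustrative, not the code actually added by the PR):
```python
import random
import torch.distributed as dist

def init_pg_with_retry(rank: int, world_size: int, retries: int = 3) -> None:
    """Illustrative helper: pick a random port and retry a few times so
    concurrent stress-test runs do not collide on a fixed port."""
    for attempt in range(retries):
        port = random.randint(20000, 65000)
        try:
            dist.init_process_group(
                backend="gloo",
                init_method=f"tcp://127.0.0.1:{port}",
                rank=rank,
                world_size=world_size,
            )
            return
        except RuntimeError:
            if attempt == retries - 1:
                raise
```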
Differential Revision: D46908114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103984
Approved by: https://github.com/williamwen42
2023-06-22 19:34:10 +00:00
7a2a006c9e
Remove dynamic_shapes test for inductor static weights ( #103377 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103377
Approved by: https://github.com/anijain2305
2023-06-14 15:00:34 +00:00
a980b19be7
Revert "Remove dynamic_shapes test for inductor static weights ( #103377 )"
...
This reverts commit 53cb1a7d15804fef6eb25cbad8a0380a29f53e8b.
Reverted https://github.com/pytorch/pytorch/pull/103377 on behalf of https://github.com/malfet due to broke lint ([comment](https://github.com/pytorch/pytorch/pull/103377#issuecomment-1591356769 ))
2023-06-14 14:41:13 +00:00
53cb1a7d15
Remove dynamic_shapes test for inductor static weights ( #103377 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103377
Approved by: https://github.com/anijain2305
2023-06-14 13:32:24 +00:00
daf75c0759
[AOTAutograd] compare with stride hints ( #103342 )
...
We previously compared a FakeTensor's strides with the real tensor's strides. This causes dynamic dimensions of the FakeTensor to be specialized to static ints, which may lead to a graph specialized for one shape being used for another shape, which is wrong.
Use stride hints for the comparison instead.
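To illustrate why the comparison matters (a plain eager-mode sketch; it does not show the FakeTensor/hint machinery itself): two tensors with identical sizes can have different strides, and a graph specialized for one layout must not be reused for the other.
```python
import torch

a = torch.randn(2, 3, 4, 5)                   # contiguous: strides (60, 20, 5, 1)
b = a.to(memory_format=torch.channels_last)   # channels last: strides (60, 1, 15, 3)
assert a.shape == b.shape and a.stride() != b.stride()
```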
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103342
Approved by: https://github.com/malfet
2023-06-10 06:51:54 +00:00
bf312f2d9d
[inductor] add a few tests to verify view_to_reshape pass is safe ( #103034 )
...
This PR follows up on issue https://github.com/pytorch/pytorch/issues/102229 . I added 2 unit tests and verified that autograd/functionalization already handles view properly; the view_to_reshape pass does not cause any issues.
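A small eager-mode illustration of why the rewrite is benign when views are handled correctly (sketch only): on contiguous tensors `view` and `reshape` agree, while `reshape` additionally copies where `view` would fail.
```python
import torch

x = torch.randn(4, 6)
assert torch.equal(x.view(2, 12), x.reshape(2, 12))  # identical on a contiguous tensor

y = x.t()            # non-contiguous transpose
z = y.reshape(24)    # reshape copies here; y.view(24) would raise a RuntimeError
```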
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103034
Approved by: https://github.com/ezyang
2023-06-06 22:32:51 +00:00
86c7652503
[inductor] layout optimization for conv ( #99773 )
...
A convolution kernel with channels-last inputs runs much faster than a kernel with contiguous inputs. The PR leverages that to optimize tensor layouts so we provide channels-last inputs to convolutions. Some care needs to be taken not to convert tensor layouts back and forth between contiguous and channels last, since those extra copies hurt performance quite a bit.
Latest perf number [here](https://hud.pytorch.org/benchmark/compilers?startTime=Wed%2C%2024%20May%202023%2023%3A40%3A37%20GMT&stopTime=Wed%2C%2031%20May%202023%2023%3A40%3A37%20GMT&granularity=hour&suite=torchbench&mode=training&dtype=amp&lBranch=shunting-layout-opt-19&lCommit=baa797fc100688dfb044fbcbdebcfd2591710f78&rBranch=main&rCommit=999bae0f54108ffc5b7cf2524a02a83901554b16 )
- TB: 1.64x -> 1.69x
- HF: 1.79x -> 1.78x (random noise)
- TIMM: 1.51x -> 1.65x
Right now we disable layout optimization for dynamic shapes since there is a perf loss in that combination. Here is a GH issue to follow up: https://github.com/pytorch/pytorch/issues/102670
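A minimal eager-mode sketch of the layout being targeted (illustrative only; inductor applies this automatically as a graph-level optimization): keep the conv weight and its inputs in channels last so no contiguous/channels-last copies are needed.
```python
import torch
import torch.nn as nn

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
conv = conv.to(memory_format=torch.channels_last)      # weight stored channels last

x = torch.randn(16, 64, 56, 56).to(memory_format=torch.channels_last)
out = conv(x)  # output stays channels last, avoiding back-and-forth layout copies
```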
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99773
Approved by: https://github.com/jansel
2023-06-02 21:08:18 +00:00