13 Commits

Author SHA1 Message Date
fa7ae6cdbc can't infer device on benchmarked function with no args or kwargs (#133290)
when we call benchmarker.benchmark(fn, (), {}) it attempts to infer the device from the args and kwargs, which are both empty. in this case the default behavior is to assume CPU, since `is_cpu_device` is implemented as `all([x.device == "cpu" for x in ... if x is Tensor])`, and `all([]) == True`. I've added a PR that makes this raise an error, but we should just fix this one callsite first

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133290
Approved by: https://github.com/eellison
2024-08-13 20:13:44 +00:00
5cb05a82b4 [BC breaking] move benchmarking + prefer inductor path (#132827)
move benchmarking out of `torch._inductor.runtime.runtime_utils` and into `torch._inductor.runtime.benchmarking`, and prefer this path over directly accessing Triton's benchmarking

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132827
Approved by: https://github.com/eellison
2024-08-08 00:47:45 +00:00
53f1f75061 [BE][Inductor] fix do_bench test (#131402)
The test fail internally [T195592444](https://www.internalfb.com/intern/tasks/?t=195592444) (This is meta internal link). But we don't see the failure in OSS.

It turns out that there are 2 issues:
1. `run_test('cuda')` is improperly handled since it tries to import a module named 'cuda' if cuda is available. Since the import fails, all tests in the file are skipped. This hides the failure in OSS. The failure is exposed in internal tests since the main block which runs `run_test('cuda')` is skipped sometimes.
2. fix the real issue that incompatible inputs are provided to `do_bench`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131402
Approved by: https://github.com/eellison
2024-07-23 21:52:35 +00:00
134bc4fc34 [BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ (#129763)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129763
Approved by: https://github.com/jansel
2024-07-18 07:49:19 +00:00
b732b52f1e Revert "[BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ (#129763)"
This reverts commit aecc746fccc4495313167e3a7f94210daf457e1d.

Reverted https://github.com/pytorch/pytorch/pull/129763 on behalf of https://github.com/XuehaiPan due to need reland after rerunning lintrunner on main ([comment](https://github.com/pytorch/pytorch/pull/129763#issuecomment-2235736732))
2024-07-18 06:39:58 +00:00
aecc746fcc [BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ (#129763)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129763
Approved by: https://github.com/jansel
2024-07-18 05:13:41 +00:00
7fd8870e6b [inductor] Refactor runtime files into torch._inductor.runtime (part 3) (#124557)
I am planning to make the compile_worker process not import torch so it can start up much faster.  This stack is prep for that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124557
Approved by: https://github.com/yanboliang
ghstack dependencies: #124552, #124553
2024-04-22 18:46:24 +00:00
0b90af0bf5 Revert "[inductor] Refactor runtime files into torch._inductor.runtime (part 3) (#124557)"
This reverts commit fcf28b0ad59b1912d5783688b0f25f18b46efeb3.

Reverted https://github.com/pytorch/pytorch/pull/124557 on behalf of https://github.com/jeanschmidt due to There are internal breakages, already discussed with author and he'll FF ([comment](https://github.com/pytorch/pytorch/pull/124552#issuecomment-2070548223))
2024-04-22 18:28:05 +00:00
fcf28b0ad5 [inductor] Refactor runtime files into torch._inductor.runtime (part 3) (#124557)
I am planning to make the compile_worker process not import torch so it can start up much faster.  This stack is prep for that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124557
Approved by: https://github.com/yanboliang
ghstack dependencies: #124552, #124553
2024-04-22 04:51:15 +00:00
4cd503c1f3 Enable FX graph cache for a batch of inductor tests (#121696)
Summary: Get more FX graph cache coverage by enabling it for these unit tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121696
Approved by: https://github.com/eellison
2024-03-14 03:39:59 +00:00
1e60174891 Revert "[dynamo] Add run_inductor_tests entrypoint (#113278)"
This reverts commit b00311ce9e430cf1b98d2103e21ed2179450a424.

Reverted https://github.com/pytorch/pytorch/pull/113278 on behalf of https://github.com/huydhn due to Sorry for reverting your stack, but it is failing to list test internally with buck2 ([comment](https://github.com/pytorch/pytorch/pull/113278#issuecomment-1811646325))
2023-11-15 01:19:48 +00:00
b00311ce9e [dynamo] Add run_inductor_tests entrypoint (#113278)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113278
Approved by: https://github.com/yanboliang
2023-11-11 08:54:43 +00:00
b2d764ece0 [Inductor CUTLASS backend] Step 3: autotune_process, and CUDABenchmarkRequest (#107901)
This is the step 3 to add cutlass as an alternative inductor backend.
Full tests can be found from the last PR in the stack.

Feature request: https://github.com/pytorch/pytorch/issues/106991.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107901
Approved by: https://github.com/jansel, https://github.com/aakhundov, https://github.com/kadeng
ghstack dependencies: #107802, #107847
2023-09-12 17:44:36 +00:00