16 Commits

Author SHA1 Message Date
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
0e4d42634e Port Inductor dataclasses to be kw_only (#137768)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137768
Approved by: https://github.com/ezyang
2024-10-14 10:33:43 +00:00
41977a0531 Revert "Port Inductor dataclasses to be kw_only (#137768)"
This reverts commit 65d665bae5b82a54b819c0c4527e7ccf88d19427.

Reverted https://github.com/pytorch/pytorch/pull/137768 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it seem to fail test_loop_ordering in trunk ([comment](https://github.com/pytorch/pytorch/pull/137768#issuecomment-2409203115))
2024-10-13 22:25:19 +00:00
65d665bae5 Port Inductor dataclasses to be kw_only (#137768)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137768
Approved by: https://github.com/ezyang
2024-10-13 14:55:45 +00:00
a968576777 Add lowering for aten.searchsorted (#135701)
Adds lowering for `aten.searchsorted`. This entails:

1. Adding support for multi-dimensional bucket tensors to `ops.bucketize`.
2. Adding support for striding to `ops.bucketize`.
3. Adding support for sorting tensors to `ops.bucketize`.
4. Adding a lowering for `aten.searchsorted.Tensor`.
5. Adding a basic decomposition for `aten.searchsorted.Scalar` that calls into the lowering for tensors.
6. Updating the meta-function for `aten.searchsorted` to properly check some of the sizing conditions.

Closes #135873

Differential Revision: [D63766514](https://our.internmc.facebook.com/intern/diff/D63766514)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135701
Approved by: https://github.com/amjames, https://github.com/eellison, https://github.com/davidberard98
2024-10-04 19:26:05 +00:00
134bc4fc34 [BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ (#129763)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129763
Approved by: https://github.com/jansel
2024-07-18 07:49:19 +00:00
b732b52f1e Revert "[BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ (#129763)"
This reverts commit aecc746fccc4495313167e3a7f94210daf457e1d.

Reverted https://github.com/pytorch/pytorch/pull/129763 on behalf of https://github.com/XuehaiPan due to need reland after rerunning lintrunner on main ([comment](https://github.com/pytorch/pytorch/pull/129763#issuecomment-2235736732))
2024-07-18 06:39:58 +00:00
aecc746fcc [BE][Easy][12/19] enforce style for empty lines in import segments in test/i*/ (#129763)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129763
Approved by: https://github.com/jansel
2024-07-18 05:13:41 +00:00
cc518ebd38 [Inductor Intel GPU backend Upstream] Reuse inductor test for Intel GPU (PART 2) (#124147)
Reuse Inductor test case for Intel GPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124147
Approved by: https://github.com/EikanWang, https://github.com/jansel
2024-06-16 08:07:05 +00:00
48b6c8dbc3 [Inductor] log fusion failure due to index mismatch (#124986)
The scheduler searches for fusion opportunities by looking for common memory access. Two memory access are considered common not only when the buffer name match, but it also requires more things
- index formula matches
- var_ranges matches

In this PR, I want to log all the fusion failures due to mismatch index formula or var_ranges. I also want to further categories the failures. Right now I found the following failure categories
- rand_seed: the index for rand seed access is an integer and different access uses different integer offset
- different numel: this happens for cat operation
- broadcast: e.g. kernel A write a buffer which is broadcasted and read by kernel B
- different loop orders: the major category we want inductor to be able to fuse
- different offset: happens when use a concatenated linear layer to project Q/K/V and then split the result. Each split will point to the same buffer with different offset.
- unknown

My hope is to make sure for the models I tested, there is no fusion failure falling in the unknown category so all the failures are well understood and categories. Right now it's true for BertForMaskedLM ( https://gist.github.com/shunting314/6dc2c903629d342fa63ba731a171adc2  ), DistillGPT2 ( https://gist.github.com/shunting314/145176f2e850103c7fad4ad72f0e200e ) and llm.c ( https://gist.github.com/shunting314/cfc64a326312a889ba55f79bd47b2082 )

For BertForMaskedLM, we found 82 instances of fusion failures and majority of them are due to different loop orders! Studying the log a bit more can help us figure out where all these loop order mismatch comes from in real models.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124986
Approved by: https://github.com/eellison, https://github.com/jansel
2024-05-07 02:29:00 +00:00
535bc71d03 Enable FX graph caching in another batch of inductor tests (#121697)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121697
Approved by: https://github.com/eellison
2024-03-15 19:38:51 +00:00
1e60174891 Revert "[dynamo] Add run_inductor_tests entrypoint (#113278)"
This reverts commit b00311ce9e430cf1b98d2103e21ed2179450a424.

Reverted https://github.com/pytorch/pytorch/pull/113278 on behalf of https://github.com/huydhn due to Sorry for reverting your stack, but it is failing to list test internally with buck2 ([comment](https://github.com/pytorch/pytorch/pull/113278#issuecomment-1811646325))
2023-11-15 01:19:48 +00:00
b00311ce9e [dynamo] Add run_inductor_tests entrypoint (#113278)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113278
Approved by: https://github.com/yanboliang
2023-11-11 08:54:43 +00:00
28d018dafd [inductor] Implement bucketize() for dependencies.py (#105102)
dependencies.py is used for tracking reads and writes, which is used for identifying dependencies between buffers: i.e. if buffer X reads buffer Y, then X depends on Y. ops.bucketize() reads from an offsets tensor, so we should track it in dependencies.py to correctly track dependencies. Since bucketize performs a binary search over the offsets tensor, the dependency is marked as a StarDep to indicate that the entire tensor is needed.

Use case: we find that jagged tensor dense_to_jagged ops - which use bucketize() to map jagged indices to dense indices - perform better if the bucketize() kernel is separated from the gather kernel. Previously, because bucketize() wasn't marked as reading anything, it would just get inlined.

Differential Revision: [D47422704](https://our.internmc.facebook.com/intern/diff/D47422704)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105102
Approved by: https://github.com/eellison
2023-07-17 19:15:00 +00:00
e68cf02420 Revert "[inductor] Implement bucketize() for dependencies.py (#105102)"
This reverts commit cff5d6a22c8326f3d9fcbed2f68c5aaae9f4523a.

Reverted https://github.com/pytorch/pytorch/pull/105102 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/105102#issuecomment-1637261924))
2023-07-17 01:22:19 +00:00
cff5d6a22c [inductor] Implement bucketize() for dependencies.py (#105102)
dependencies.py is used for tracking reads and writes, which is used for identifying dependencies between buffers: i.e. if buffer X reads buffer Y, then X depends on Y. ops.bucketize() reads from an offsets tensor, so we should track it in dependencies.py to correctly track dependencies. Since bucketize performs a binary search over the offsets tensor, the dependency is marked as a StarDep to indicate that the entire tensor is needed.

Use case: we find that jagged tensor dense_to_jagged ops - which use bucketize() to map jagged indices to dense indices - perform better if the bucketize() kernel is separated from the gather kernel. Previously, because bucketize() wasn't marked as reading anything, it would just get inlined.

Differential Revision: [D47422704](https://our.internmc.facebook.com/intern/diff/D47422704)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105102
Approved by: https://github.com/eellison
2023-07-14 19:54:06 +00:00