pytorch/torch
Boyuan Feng 771f369448 [Inductor] Improve RoPE (#161420)
This PR fuses RoPE from two kernels into a single kernel.

Shape:
```
q: [B, Hq, S, D]
k: [B, Hkv, S, D]
```

`Hq=32, Hkv=8, D=128`, following the Llama 3 setting.
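For reference, a minimal eager-mode sketch of RoPE applied to `q` and `k` with the shapes above. It assumes the common "rotate-half" formulation; the PR description does not say which variant was benchmarked. The helper names (`rotate_half`, `rope_tables`, `apply_rope`), the frequency base of `10000.0`, and the small `B` and `S` values are illustrative assumptions, not taken from the PR.

```python
import torch


def rotate_half(x):
    # Split the last dimension into two halves and map (x1, x2) -> (-x2, x1).
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def rope_tables(S, D, base=10000.0):
    # Hypothetical helper: builds cos/sin tables of shape [S, D].
    # base=10000.0 is an assumed value, not one stated in the PR.
    inv_freq = 1.0 / (base ** (torch.arange(0, D, 2, dtype=torch.float32) / D))
    angles = torch.outer(torch.arange(S, dtype=torch.float32), inv_freq)  # [S, D/2]
    angles = torch.cat((angles, angles), dim=-1)  # [S, D]
    return angles.cos(), angles.sin()


def apply_rope(q, k, cos, sin):
    # cos/sin are [S, D] and broadcast over the batch and head dimensions,
    # so the same tables serve q ([B, Hq, S, D]) and k ([B, Hkv, S, D]).
    q_out = q * cos + rotate_half(q) * sin
    k_out = k * cos + rotate_half(k) * sin
    return q_out, k_out


# Head counts and head dim follow the setting above; B and S are arbitrary.
B, Hq, Hkv, S, D = 2, 32, 8, 16, 128
q = torch.randn(B, Hq, S, D)
k = torch.randn(B, Hkv, S, D)
cos, sin = rope_tables(S, D)
q_rot, k_rot = apply_rope(q, k, cos, sin)
```

Wrapping `apply_rope` in `torch.compile` is how a function like this would reach Inductor, which is the default `torch.compile` backend. Whether the `q` and `k` updates end up in one generated kernel depends on the PyTorch version in use.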

<img width="980" height="624" alt="image" src="https://github.com/user-attachments/assets/652a8227-6f1d-465c-97fd-2b0af41f8ed9" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161420
Approved by: https://github.com/shunting314
2025-09-05 20:55:20 +00:00