Files
pytorch/test
Xinya Zhang 3cbfbbd691 [ROCm] Transformer/SDPA unit test parity (#163745)
## Major Changes

* Efficient Attention on ROCM requires last dimensions of input tensors align with 16 bytes.
  - Unlike FA, ME does not pad input tensors in `scaled_dot_product_attention` and hence this is required.
* Fix `atomic_counter` handling in varlen FA API
* Unskips a few unit tests.

Fixes #157120
Fixes #157121
Fixes #157122
Fixes #157167
Fixes #155217
Fixes #157043
Fixes #157060

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163745
Approved by: https://github.com/jeffdaily
2025-09-25 17:14:19 +00:00
..
2025-09-16 17:42:19 +00:00
2025-09-15 11:31:37 +00:00
2025-09-18 12:08:13 +00:00