Files
pytorch/torch/amp
Zhang, Jianyi b59f3d3ae0 [Intel GPU] skip a cuda api call in amp to save some host overhead on xpu (#151111)
This saves ~0.2 ms on non-CUDA devices by skipping the call to `amp_definitely_not_available()`. In the dynamo benchmarks it improves small TorchBench models such as lennard_jones by about 10% on XPU, for both eager and Inductor.
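The underlying pattern is plausibly just short-circuit ordering: evaluate the cheap device-string comparison before the potentially expensive CUDA availability probe, so the probe never runs on xpu/cpu. A minimal sketch of that idea follows, assuming the check lives in an autocast setup path; the helper name `_maybe_disable_cuda_autocast` and the warning text are illustrative, not the actual PyTorch source.

```python
import warnings

from torch.cuda.amp.common import amp_definitely_not_available


def _maybe_disable_cuda_autocast(device_type: str, enabled: bool) -> bool:
    # Illustrative helper (not a real torch API). Checking the cheap string
    # comparison first means amp_definitely_not_available(), which can touch
    # the CUDA runtime, is only evaluated when device_type is "cuda"; on
    # xpu/cpu the call is skipped entirely, saving host-side overhead.
    if device_type == "cuda" and amp_definitely_not_available():
        warnings.warn(
            "User provided device_type of 'cuda', but CUDA is not available. Disabling"
        )
        return False
    return enabled
```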

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151111
Approved by: https://github.com/soulitzer
2025-04-13 06:37:07 +00:00