DeepSpeed/deepspeed
Masahiro Tanaka 71d077da73 Enable grad scaler for ZeRO-0 + torch.autocast path (#7619)
Currently, the DeepSpeed engine does not enable the grad scaler for the
ZeRO-0 and `torch.autocast` path, even when dtype is set to `fp16`. This
leads to errors in tests when we replace our hard-coded tolerances with
PyTorch’s [standard
tolerances](https://docs.pytorch.org/docs/stable/testing.html#torch.testing.assert_close)
(Thank you, @stas00, for your suggestion regarding the previous PR.)

This PR enables the grad scaler for this path to improve accuracy, and
refactors the tests to simplify validation by using
`torch.testing.assert_close`. The tests now rely on PyTorch’s standard
(and stricter) tolerances, and they still pass.
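The refactored validation style boils down to calls like the following; the tensor values here are made up for illustration, but the defaults are real — `torch.testing.assert_close` derives `rtol`/`atol` from the dtype (e.g. `rtol=1.3e-6, atol=1e-5` for fp32, `rtol=1e-3, atol=1e-5` for fp16) instead of hand-coded bounds:

```python
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = expected + 1e-7  # tiny perturbation, well within fp32 defaults

# Passes: difference is below the default fp32 rtol/atol.
torch.testing.assert_close(actual, expected)
```

A mismatch beyond those tolerances raises an `AssertionError` with a detailed diff, so no per-test tolerance bookkeeping is needed.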

---------

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
2025-10-04 13:21:08 +00:00