d40a0f5de8
Add dependency for deepcompile test (#7558)
...
This PR adds a dependency to the CI tests for DeepCompile.
---------
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
2025-09-13 00:45:08 -07:00
b9bd03a2ec
Move modal tests to tests/v1 (#7557)
...
This PR moves active tests under `tests/unit/v1` to clarify which tests
are run on modal.
---------
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
2025-09-12 17:28:47 -04:00
889f0ead27
Enable non-ZeRO mode (#7515)
...
Enabled via `stage=0`, which corresponds to DDP.
Remove hardwired path to bf16_optimizer.
Enable `torch.autocast` for DDP training.
Enable native mixed precision DDP for bfloat16.
Update `torch.autocast` and native mixed precision unit tests.
<img width="976" height="184" alt="image"
src="https://github.com/user-attachments/assets/92904cdc-e312-46a4-943f-011eb5ab146a"
/>
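
As a rough illustration of the `stage=0` setting described above, the sketch below builds a DeepSpeed-style config dict with ZeRO disabled and bfloat16 enabled. The helper name `make_non_zero_config` and the batch size value are hypothetical and not taken from the PR; the `torch.autocast` settings are not shown here.

```python
import json


def make_non_zero_config(use_bf16: bool = True) -> dict:
    """Hypothetical helper: return a config dict that selects non-ZeRO (DDP-like) mode."""
    config = {
        # Placeholder value chosen for illustration only.
        "train_micro_batch_size_per_gpu": 1,
        # stage 0 disables ZeRO partitioning, which the PR describes as DDP.
        "zero_optimization": {"stage": 0},
    }
    if use_bf16:
        # Request bfloat16 mixed precision.
        config["bf16"] = {"enabled": True}
    return config


config = make_non_zero_config()
print(json.dumps(config, indent=2))
```

A dict like this would typically be passed as the `config` argument of `deepspeed.initialize`, assuming a model and optimizer are set up elsewhere.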
---------
Signed-off-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2025-08-27 14:07:29 -04:00
a12de38db6
Modal CI (#7289)
...
This is an initial effort to migrate CI onto Modal infrastructure. This PR
creates two new workflows that run on Modal:
1. modal-torch-latest: a subset of nv-torch-latest-v100 that includes
`tests/unit/runtime/zero/test_zero.py`.
2. modal-accelerate: a full copy of nv-accelerate-v100.
Follow-up PRs will selectively migrate relevant workflows onto Modal.
---------
Signed-off-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
Signed-off-by: Olatunji Ruwase <tjruwase@gmail.com>
Signed-off-by: Tunji Ruwase <tunji.ruwase@snowflake.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <tjruwase@gmail.com>
Co-authored-by: Stas Bekman <stas.bekman@snowflake.com>
2025-08-11 20:13:39 +00:00