16 Commits

Author SHA1 Message Date
6b731c5c96 scripts: Check .is_cuda only in non-C++ files (#7561)
Today, check-torchcuda.py searches the whole repository for occurrences
of .is_cuda even when a commit modifies only C++ headers and sources,
which I believe is not intended.

Check usage of .is_cuda only when a commit modifies any non-C++ file.
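
A minimal sketch of the rule described above, assuming the script receives the list of modified file paths (as pre-commit hooks typically do). The names `CPP_SUFFIXES`, `is_cpp_file`, and `should_check_is_cuda` are hypothetical and are not taken from check-torchcuda.py; the suffix list is likewise an assumption.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Assumed set of C++/CUDA suffixes; the real script may use a different list.
CPP_SUFFIXES = {".h", ".hpp", ".c", ".cc", ".cpp", ".cu", ".cuh"}


def is_cpp_file(path: str) -> bool:
    """Return True if the path looks like a C++/CUDA header or source file."""
    return PurePosixPath(path).suffix.lower() in CPP_SUFFIXES


def should_check_is_cuda(modified_files: Iterable[str]) -> bool:
    """Run the .is_cuda check only if at least one non-C++ file was modified."""
    return any(not is_cpp_file(f) for f in modified_files)
```

With this gate, a commit touching only `csrc/adam.cpp` would skip the `.is_cuda` search, while a commit that also touches a `.py` file would run it.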

Signed-off-by: Junjie Mao <junjie.mao@linux.alibaba.com>
2025-09-19 05:01:50 +00:00
4d00b38ada Ulysses SP for HF Integration (#7268)
This is the DeepSpeed counterpart of
https://github.com/snowflakedb/ArcticTraining/pull/45 - as the new
feature(s) require changes on both sides.


For PR reviewers: 

Readiness status:
- [x] Code
- [x] Tests
- [ ] Docs - working on it


Features:

- [x] add support for delaying grad addition via
`param.ds_grad_is_ready` flag (used when performing tiled compute in an
autograd function)
- [x] add light sp-only mpu version (Jeff Rasley)
- [x] improved debug
- [x] added `all_gather_object` to `dist`
- [x] `UlyssesSPAttentionHF` (port of UlyssesAttention from
Megatron-Deepspeed plus modern MHA-variations)
- [x] `UlyssesSPDataLoaderAdapter` - DL adapter to shard the normal DL
batches to be used by `UlyssesSPAttentionHF`
- [x] `SequenceTiledCompute` - generic autograd function to perform
compute after tiling on the sequence dimension
- [x] `TiledMLP` - a specific autograd function to perform tiled MLP
(it's much easier to understand before trying to grok
`SequenceTiledCompute`)
- [x] added a differentiable `_DimZeroAllToAll` (Samyam Rajbhandari)
- [x] torch-dist-check now allows `torch.distributed.nn` (which is
needed since deepspeed's dist is not up to date with
`torch.distributed.nn`)
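
To illustrate the idea behind tiling compute on the sequence dimension, here is a toy, forward-only sketch in plain Python. It is not the `TiledMLP` or `SequenceTiledCompute` implementation: those are autograd functions, whereas this sketch ignores gradients and tensors entirely. `tiled_apply` and `toy_positionwise_mlp` are hypothetical names. The point it shows is that a position-wise function can be applied one tile at a time and the concatenated result matches the untiled result.

```python
from typing import Callable, List, Sequence


def toy_positionwise_mlp(tile: Sequence[float]) -> List[float]:
    """Stand-in for an MLP: acts on each sequence position independently."""
    return [max(0.0, 2.0 * x - 1.0) for x in tile]


def tiled_apply(
    fn: Callable[[Sequence[float]], List[float]],
    seq: Sequence[float],
    num_tiles: int,
) -> List[float]:
    """Apply fn to seq one tile at a time and concatenate the outputs."""
    if num_tiles < 1:
        raise ValueError("num_tiles must be >= 1")
    # Ceiling division, so the last tile may be shorter than the others.
    tile_len = max(1, -(-len(seq) // num_tiles))
    out: List[float] = []
    for start in range(0, len(seq), tile_len):
        out.extend(fn(seq[start:start + tile_len]))
    return out
```

Because only one tile's intermediate values exist at a time, peak working memory for the intermediate computation scales with the tile length rather than the full sequence length, which is the motivation for tiling long sequences.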

---------

Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
Signed-off-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas.bekman@snowflake.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
2025-05-31 07:25:23 +00:00
e290bf580d disable license check until the new license situation has been sorted… (#7301)
Until we sort out the new license situation, disable this check so that
new code not owned by MSFT can be added.

---------

Signed-off-by: Stas Bekman <stas@stason.org>
2025-05-22 00:27:39 +00:00
930ab46e63 Fix issues XPU tests hit with extra-index-url (#7291)
cc: @Liangliang-Ma

---------

Signed-off-by: Logan Adams <loadams@microsoft.com>
2025-05-16 19:07:35 -07:00
5115df3f0b Add script to check for --extra-index-url (#5184)
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2024-02-26 20:37:32 +00:00
38b41dffa1 DeepSpeed-FastGen (#4604)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2023-11-03 15:07:35 -07:00
2c67b58b5f fix: check-license (#4432)
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2023-10-03 10:27:38 -07:00
807d1b5dfc scripts/check-torchcuda.py: add checking for tensor.is_cuda (#3843)
.cpp files are excluded

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-06-30 15:06:36 -07:00
c8d3f5eb19 fix typo in comments with deepspeed/ (#3537)
* fix spelling error with deepspeed/runtime/

* fix typo docs/

* fix typo in comments with deepspeed/

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2023-05-15 19:20:46 +00:00
1f85569e1c Fix copyright check, add copyright replace script (#3141)
* fix copyright script and add replace-copyright script
2023-04-06 17:09:23 -07:00
b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-03-30 17:14:38 -07:00
91d63e0228 update formatter version and style settings (#3098)
2023-03-27 07:55:19 -04:00
090d49e79f pre-commit check for torch.cuda in code (#2981)
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-03-23 20:29:54 -07:00
da84e60d98 add missing license info to top of all source code (#2889)
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-02-27 11:20:41 -08:00
316c4a43e0 Add flake8 to pre-commit checks (#2051)
2022-07-25 16:48:08 -07:00
36ad3119d5 DeepSpeed comm backend v1 (#1985)
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-06-10 16:47:33 -07:00