82 Commits

Author SHA1 Message Date
79caae1c04 Update email address (#7624)
Update contact address

Signed-off-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
2025-10-07 17:15:49 +00:00
da60a878ac fix: fix FileNotFoundError for build_win.bat (#7399)
fix FileNotFoundError for build_win.bat

`FileNotFoundError: [WinError 2] The system cannot find the file specified. ERROR Backend subprocess
exited when trying to invoke get_requires_for_build_wheel`

Signed-off-by: gjj2828 <gjj2828@sina.com>
Co-authored-by: gjj2828 <gjj2828@gmail.com>
2025-07-01 16:13:44 +00:00
227a60c0c4 DeepCompile for enhanced compiler integration (#7154)
This PR introduces *DeepCompile*, a new feature that efficiently
integrates compiler optimizations with other DeepSpeed features.
DeepCompile utilizes torch's dynamo to capture the computation graph and
modifies it to incorporate DeepSpeed’s optimizations seamlessly.

Currently, DeepCompile supports ZeRO-1 and ZeRO-3, with enhancements
such as proactive prefetching and selective unsharding to improve
performance.
(More details will be added later.)

---------

Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: zafarsadiq <zafarsadiq120@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2025-04-16 04:33:53 +00:00
1d30b58cba Replace calls to python setup.py sdist with python -m build --sdist (#7069)
With upcoming changes to pip and Python packaging, we should no longer
call `python setup.py ...` and should use the replacement described here:
https://packaging.python.org/en/latest/guides/modernize-setup-py-project/#should-setup-py-be-deleted


![image](https://github.com/user-attachments/assets/ea39ef7b-3cbe-4916-86f0-bc46a5fce96d)

This means we need to install the build package which is added here as
well.

Additionally, we pass the `--sdist` flag so that only the sdist is built,
rather than both the sdist and the wheel.

---------

Signed-off-by: Logan Adams <loadams@microsoft.com>
2025-02-24 20:40:24 +00:00
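The replacement described in #7069 can be sketched as below. This is a minimal illustration rather than the project's actual CI script: `sdist_command` is a hypothetical helper, and it assumes the `build` package is already installed (for example via `pip install build`).

```python
import sys


def sdist_command(python=sys.executable):
    """Return the argv list that builds only the source distribution.

    This stands in for the legacy `python setup.py sdist` call and
    requires the `build` package in the target environment.
    """
    return [python, "-m", "build", "--sdist"]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(sdist_command("python")))
```

Passing the resulting list to `subprocess.run(..., check=True)` from the project root would perform the build.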
7288e6198e Update setup.py handling of ROCm cupy (#7051) 2025-02-18 16:51:34 -08:00
fd40516923 Update GH org references (#6998)
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Co-authored-by: Fabien Dupont <fabiendupont@fabiendupont.fr>
2025-02-05 00:56:50 +00:00
8bb4d442ad Update recommended Windows whl building versions (#6983) 2025-01-30 04:53:16 +00:00
6628127a37 Update python version classifiers (#6933)
Update python version classifiers in setup.py to reflect python versions
currently supported.
2025-01-08 18:43:06 +00:00
065398d5de Fix setup.py bash cmd generation to correctly extract git info (#6762)
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-11-19 13:54:53 -08:00
a5400974df DeepNVMe perf tuning (#6560)
Add performance tuning utilities: `ds_nvme_tune` and `ds_io`.  
Update tutorial with tuning section.

---------

Co-authored-by: Ubuntu <jomayeri@microsoft.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
2024-09-26 13:07:19 +00:00
659f6be105 Avoid security issues of subprocess shell (#6498)
Avoid security issues of `shell=True` in subprocess

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-09-11 20:07:06 +00:00
662a421b05 Safe usage of popen (#6490)
Avoid shell=True security issues with Popen
2024-09-04 21:06:04 +00:00
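The idea behind #6498 and #6490 can be illustrated with a short sketch. It is not the code from those commits: `command_output` is a hypothetical helper that shows the general pattern of passing an argument list to `subprocess` instead of a shell string.

```python
import subprocess
import sys


def command_output(args):
    """Run a command given as an argv list and return its stripped stdout.

    No shell is involved, so arguments are never re-parsed for
    metacharacters such as `;` or `&&`. Returns None if the command
    cannot be started or exits with a non-zero status.
    """
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # The argument containing "; echo injected" reaches the child verbatim.
    script = "import sys; print(sys.argv[1])"
    print(command_output([sys.executable, "-c", script, "a; echo injected"]))
```

Because the arguments are never interpreted by a shell, untrusted text in one of them cannot start a second command.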
5d1a30c033 DS_BUILD_OPS should build only compatible ops (#6489)
Currently DS_BUILD_OPS=1 fails on incompatible ops. This is a deviation
from
[documentation](https://www.deepspeed.ai/tutorials/advanced-install/#pre-install-deepspeed-ops)
which states that only compatible ops are built.

<img width="614" alt="image"
src="https://github.com/user-attachments/assets/0f1a184e-b568-4d25-9e9b-e394fb047df2">
2024-09-04 20:30:56 +00:00
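The behavior requested in #6489 (build only compatible ops when everything is requested) can be sketched as follows. `FakeOpBuilder`, `ops_to_build`, and the op names are hypothetical and exist only for illustration; they are not DeepSpeed's real classes.

```python
class FakeOpBuilder:
    """Hypothetical stand-in for an op builder; not DeepSpeed's real class."""

    def __init__(self, name, compatible):
        self.name = name
        self._compatible = compatible

    def is_compatible(self):
        return self._compatible


def ops_to_build(builders, build_all):
    """Select op names to pre-build.

    When build_all is true (think DS_BUILD_OPS=1), incompatible ops are
    skipped instead of making the whole build fail.
    """
    if not build_all:
        return []
    return [b.name for b in builders if b.is_compatible()]


if __name__ == "__main__":
    builders = [FakeOpBuilder("op_a", True), FakeOpBuilder("op_b", False)]
    print(ops_to_build(builders, build_all=True))
```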
30428d0318 move pynvml install to setup.py (#5840)
Only install pynvml on nvidia gpus; not all accelerators
2024-08-15 16:27:10 +00:00
3c490f9cf4 Use accelerator to replace cuda in setup and runner (#5769)
Use accelerator apis to select device in setup.py and set visible
devices env in runner.py

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-08-01 13:28:55 -07:00
74f3dcab62 Add Windows scripts (deepspeed, ds_report). (#5699)
Co-authored-by: Costin Eseanu <costineseanu@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-07-09 01:05:09 +00:00
b3767d01d4 Fixed Windows inference build. (#5609)
Fix #2427

---------

Co-authored-by: Costin Eseanu <costineseanu@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2024-06-24 13:39:18 -07:00
e7dd28a23d Fixed the Windows build. (#5596)
Fixed the Windows build.

Fixes applied:
- Remove some more ops that don't build on Windows.
- Remove the use of symlinks that didn't work correctly and replace with
`shutil.copytree()`.
- Small fixes to make the C++ code compile.

Tested with Python 3.9 and CUDA 12.1.

---------

Co-authored-by: Costin Eseanu <costineseanu@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-05-31 22:11:10 +00:00
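The symlink replacement mentioned in #5596 can be sketched with `shutil.copytree()`. This is a minimal illustration under stated assumptions: `copy_dir` is a hypothetical helper, and `dirs_exist_ok=True` requires Python 3.8 or newer.

```python
import os
import shutil
import tempfile


def copy_dir(src, dest):
    """Copy the directory tree at src into dest as real files.

    Unlike a symlink, the copy needs no special privileges, at the cost
    of duplicating the files on disk.
    """
    return shutil.copytree(src, dest, dirs_exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "csrc")
        os.makedirs(src)
        with open(os.path.join(src, "kernel.cpp"), "w") as f:
            f.write("// placeholder\n")
        dest = copy_dir(src, os.path.join(tmp, "copy"))
        print(sorted(os.listdir(dest)))
```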
059bb2085c fix: swapping order of parameters in create_dir_symlink method. (#5465)
The order of parameters in the create_dir_symlink method looks wrong.
Because of this, we get the error "PermissionError: [WinError 5] Denied access:
'.\\deepspeed\\ops\\csrc'" when installing deepspeed >= 0.4.0 in a Windows
environment.

Please check this out @eltonzheng and @jeffra.

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-04-29 17:37:54 +00:00
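The argument-order problem described in #5465 is easy to reproduce with `os.symlink`, whose first argument is the existing target and whose second is the link to create. The sketch below borrows the function name from the commit title, but the body and the keyword-only signature are assumptions made for illustration. Creating symlinks on Windows may also require extra privileges.

```python
import os
import tempfile


def create_dir_symlink(*, src, dest):
    """Create a link at dest that points to the existing directory src.

    Keyword-only parameters make it impossible to swap the two paths
    silently at the call site.
    """
    if os.path.islink(dest):
        os.remove(dest)
    # os.symlink(target, link_name): the existing directory comes first.
    os.symlink(src, dest, target_is_directory=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "csrc")
        os.mkdir(src)
        dest = os.path.join(tmp, "link_to_csrc")
        create_dir_symlink(src=src, dest=dest)
        print(os.path.islink(dest))
```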
c08e69f212 Make op builder detection adapt to accelerator change (#5206)
This is a WIP PR that makes op builder detection adapt to accelerator
changes. It is a follow-up to
https://github.com/microsoft/DeepSpeed/issues/5173
Currently, DeepSpeed generates `installed_ops` and `compatible_ops` at
setup time. If the system changes to a different accelerator at DeepSpeed
launch time, these two lists would contain incorrect information.

This PR intends to solve this problem with more flexible ops detection.

* For `installed_ops`, DeepSpeed should disable all installed ops if the
accelerator detected at setup time differs from the one detected at launch time.
* For `compatible_ops`, DeepSpeed should refresh the list on each
launch to avoid the impact of an accelerator change.

As a first step, the nv-inference workflow is temporarily changed to emulate
the scenario in which the system is set up with CPU_Accelerator and then
launched with CUDA_Accelerator. CPU_Accelerator is also modified so that Intel
Extension for PyTorch and oneCCL binding for PyTorch are no longer mandatory.

From here we can restructure installed_ops and compatible_ops
to follow the design above.

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-03-12 20:48:29 +00:00
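The two rules proposed in #5206 can be sketched as plain functions. This is an illustration of the stated design, not the implementation from the PR: the function names, the dictionary layout, and the accelerator strings are all hypothetical.

```python
def usable_installed_ops(installed_ops, setup_accelerator, launch_accelerator):
    """Return the installed-op table to trust at launch time.

    If the accelerator changed between setup and launch, every
    pre-installed op is treated as disabled.
    """
    if setup_accelerator != launch_accelerator:
        return {name: False for name in installed_ops}
    return dict(installed_ops)


def refresh_compatible_ops(compat_checks):
    """Recompute compatibility at launch by calling each check again."""
    return {name: bool(check()) for name, check in compat_checks.items()}


if __name__ == "__main__":
    installed = {"op_a": True, "op_b": False}
    print(usable_installed_ops(installed, "cpu", "cuda"))
    print(refresh_compatible_ops({"op_a": lambda: True, "op_b": lambda: False}))
```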
48749411b8 Disable ninja by default (#5194)
#5192 reports an issue with the latest DeepSpeed release (0.13.3)
related to pre-compilation and the recently re-enabled `ninja` support
in #5088. Reverting to disabling `ninja` by default, but users can still
enable it with `DS_ENABLE_NINJA=1` until we can further debug to
understand the problem.
2024-02-26 11:41:09 -08:00
b00533e479 Use ninja to speed up build (#5088)
DeepSpeed has too many ops now, and it takes too much time to pre-build
all of them.
I noticed that DeepSpeed disabled `ninja` 4 years ago
(https://github.com/microsoft/DeepSpeed/pull/298), and I think we should
consider enabling it now.
The issue mentioned in https://github.com/microsoft/DeepSpeed/pull/298
can be solved by resolving `include_dirs` to absolute paths.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2024-02-21 02:20:11 +00:00
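Resolving `include_dirs` to absolute paths, as proposed in #5088, can be sketched as follows. `absolute_include_dirs` and its `base_dir` parameter are hypothetical; the PR itself may resolve paths differently.

```python
import os


def absolute_include_dirs(include_dirs, base_dir=None):
    """Resolve each include directory against base_dir (default: cwd).

    Entries that are already absolute are kept, only normalized.
    """
    base = os.path.abspath(base_dir or os.getcwd())
    return [os.path.normpath(os.path.join(base, d)) for d in include_dirs]


if __name__ == "__main__":
    print(absolute_include_dirs(["csrc/includes", "csrc/../third_party"]))
```

Absolute paths stay valid regardless of the working directory from which the compiler is eventually invoked.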
8fb111c08d Treat empty environment variables as unset in (#4185) 2023-08-21 22:32:31 +00:00
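Treating an empty environment variable as unset, as the title of #4185 describes, can be sketched with a small helper. The helper name `env_or_default` and the variable name `DS_SKETCH_FLAG` are hypothetical and used only for illustration.

```python
import os


def env_or_default(key, default=""):
    """Return the value of key, or default when it is missing or empty."""
    # os.environ.get returns None when unset; "" is falsy as well,
    # so both cases fall through to the default.
    return os.environ.get(key) or default


if __name__ == "__main__":
    os.environ["DS_SKETCH_FLAG"] = ""
    print(repr(env_or_default("DS_SKETCH_FLAG", "0")))
```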
7290aace9b [CPU] Skip CPU support unimplemented error (#3633)
* skip cpu support unimplemented error and update cpu inference workflow

* add torch.bfloat16 to cuda_accelerator

* remove UtilsBuilder skip

* fused adam can build

* use cpu adam to implement fused adam

* enable zero stage 1 and 2 for synchronized accelerator (a.k.a. CPU)

* remove unused parameters

* remove skip FusedAdamBuilder; add suported_dtypes

* fix format

* Revert "fix format"

Revert "remove skip FusedAdamBuilder; add suported_dtypes"

Revert "remove unused parameters"

Revert "enable zero stage 1 and 2 for synchronized accelerator (a.k.a. CPU)"

Revert "use cpu adam to implement fused adam"

Revert "fused adam can build"

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Ma, Guokai <guokai.ma@intel.com>
2023-07-19 19:58:38 +00:00
69d1b9f978 DeepSpeed-Triton for Inference (#3748)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-06-23 14:30:49 -07:00
6938c449de Add snip_momentum structured pruning which can support higher sparse ratio with minor accuracy loss (#3300)
Signed-off-by: Tian, Feng <feng.tian@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-05-10 10:33:48 -07:00
f3f4c44959 Build: Update license in setup (#3484) 2023-05-08 17:16:05 +00:00
bcccee4d85 Fix cupy install version detection (#3276)
* updated cupy install

* do non-isolated pip install

* Update action.yml
2023-04-18 17:13:35 +00:00
adc15e1c17 Update curriculum-learning.md (#3031)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-04-07 08:51:03 -07:00
b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-03-30 17:14:38 -07:00
91d63e0228 update formatter version and style settings (#3098) 2023-03-27 07:55:19 -04:00
bbfd0a6a3e update email info 2023-03-15 14:16:26 -07:00
b4d40e357b Fix example command when building wheel with dev version specified (#2815) 2023-02-21 18:16:35 +00:00
0b549ad70a [install] only add deepspeed pkg at install (#2714)
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-01-18 10:26:26 -08:00
cd271a4aa6 exclude benchmarks during install (#2698) 2023-01-13 14:24:30 -08:00
9548d48f48 Abstract accelerator (step 2) (#2560)
* Abstract accelerator (step 2)

* more flex op_builder path for both installation and runtime

* add SpatialInferenceBuilder into cuda_accelerator.py

* use reflection to make cuda_accelerator adapt to CUDA op builder change automatically

* clean up deepspeed/__init__.py

* add comments in cuda_accelerator for no torch path

* Update deepspeed/env_report.py

Change env_report.py according to suggestion

Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>

* reduce the range of try...except for better code clarity

* Add porting for deepspeed/ops/random_ltd/dropping_utils.py

* move accelerator to top directory and create symlink under deepspeed

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-01-06 23:40:58 -05:00
35eabb0a33 Fix issues w. python 3.6 + add py-version checks to CI (#2589) 2022-12-09 21:53:58 +00:00
521d329b97 Fix CI issues related to cupy install (#2483)
* remove any cupy install when setting up environments

* revert previous changes to run on cu111 runners

* fix for when no cupy is installed

* remove cupy uninstall for workflows not using latest torch version

* update to cu116 for inference tests

* fix pip uninstall line

* move python environment list to after DS install

* remove cupy uninstall

* re-add --forked

* fix how we get cupy version (should be based on nvcc version)
2022-11-08 10:17:03 -08:00
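The last bullet of #2483 (derive the cupy version from the nvcc version) can be sketched as below. Both helpers are hypothetical. The package naming scheme follows CuPy's published wheel names as best understood at the time of writing and is an assumption here, not something stated in the commit.

```python
import re


def cuda_version_from_nvcc(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cupy_package_for(cuda_version):
    """Map a CUDA (major, minor) pair to an assumed cupy wheel name."""
    major, minor = cuda_version
    if major >= 11:
        # Assumed scheme: one wheel per CUDA major version, e.g. cupy-cuda12x.
        return f"cupy-cuda{major}x"
    # Assumed scheme for older toolkits: major and minor concatenated.
    return f"cupy-cuda{major}{minor}"


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.6, V11.6.124"
    print(cupy_package_for(cuda_version_from_nvcc(sample)))
```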
b85eb3b979 Fix build issues on Windows (#2428)
* Fix build issues on Windows

* small fix to compile with new version of Microsoft C++ Build Tools

Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
2022-10-26 00:14:43 +00:00
1b7c6791d5 only add deps if extra is explicitly called (#2432) 2022-10-18 13:57:02 -07:00
316c4a43e0 Add flake8 to pre-commit checks (#2051) 2022-07-25 16:48:08 -07:00
3540ce74d9 Check for bf16 support only if CUDA is available (#2049)
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2022-07-06 17:17:31 -06:00
9b70ce56e7 Comms Benchmarks (#2040)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-06-29 10:49:20 -07:00
7c3344e215 DeepSpeed examples refresh (#2021) 2022-06-15 18:46:30 -07:00
b666d5cd73 [inference] test suite for ds-kernels (bert, roberta, gpt2, gpt-neo, gpt-j) (#1992)
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
2022-06-15 14:21:19 -07:00
7fc3065074 Add torch-latest and torch-nightly CI workflows (#1990)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-06-06 16:19:00 -07:00
350d74ca39 Invoke hipify from op builder for JIT extension builds (#1807)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-03-07 18:59:14 +00:00
d7684f4e81 add GitHub URL for PyPi (#1812)
* add GitHub URL for PyPi

* add GitHub URL for PyPi fix formatting
2022-03-06 04:42:03 +00:00
c3c8d5dd93 AMD support (#1430)
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jithun Nair <jithun.nair@amd.com>
Co-authored-by: rraminen <rraminen@amd.com>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
Co-authored-by: okakarpa <okakarpa@amd.com>
Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
2022-03-03 01:53:35 +00:00
9351266f78 Multi-node save pid support + allow sparse-attn extra (#1728) 2022-01-27 12:35:18 -08:00