This PR introduces *DeepCompile*, a new feature that efficiently
integrates compiler optimizations with other DeepSpeed features.
DeepCompile uses PyTorch Dynamo to capture the computation graph and
rewrites it to seamlessly incorporate DeepSpeed's optimizations.
Currently, DeepCompile supports ZeRO-1 and ZeRO-3, with enhancements
such as proactive prefetching and selective unsharding to improve
performance.
(More details will be added later.)
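As a rough illustration, enabling DeepCompile might look like the sketch
below; the `"compile"` config section and the `engine.compile()` call are
assumptions inferred from this PR's description, not a settled public API.
```python
# Hypothetical sketch of enabling DeepCompile with ZeRO-3; the "compile"
# config key and engine.compile() are assumptions, not a confirmed API.
import torch
import deepspeed

ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 3},  # DeepCompile supports ZeRO-1 and ZeRO-3
    "compile": {"deepcompile": True},   # assumed enablement flag
}

model = torch.nn.Linear(1024, 1024)
engine, _, _, _ = deepspeed.initialize(model=model,
                                       model_parameters=model.parameters(),
                                       config=ds_config)
engine.compile()  # capture the graph with Dynamo and apply DeepSpeed passes
```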
---------
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: zafarsadiq <zafarsadiq120@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Use accelerator APIs to select the device in setup.py and set the visible
devices env var in runner.py.
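Roughly, the accelerator-based selection could look like this sketch;
`get_accelerator()` is DeepSpeed's abstraction entry point, but treating
`device_name()` and `visible_devices_envs()` as the exact hooks used here
is an assumption.
```python
# Sketch of accelerator-agnostic device selection; which accelerator
# methods setup.py and runner.py actually call is assumed here.
import os
from deepspeed.accelerator import get_accelerator

accel = get_accelerator()
print(accel.device_name(0))  # e.g. "cuda:0", "xpu:0", or "cpu"

# Instead of hard-coding CUDA_VISIBLE_DEVICES, the runner can ask the
# accelerator which env var controls device visibility:
for env in accel.visible_devices_envs():  # e.g. ["CUDA_VISIBLE_DEVICES"]
    os.environ[env] = "0,1"
```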
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Fixed the Windows build.
Fixes applied:
- Remove some more ops that don't build on Windows.
- Remove the use of symlinks that didn't work correctly and replace them
with `shutil.copytree()` (a sketch follows below).
- Small fixes to make the C++ code compile.
Tested with Python 3.9 and CUDA 12.1.
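A minimal sketch of the symlink-to-copy change, with illustrative paths
(the `csrc` path appears in a later fix in this log; the actual setup.py
code differs):
```python
# Illustrative only: the real setup.py paths and surrounding logic differ.
import shutil

# Before (broken on Windows without admin/developer mode):
#   os.symlink("csrc", r"deepspeed\ops\csrc", target_is_directory=True)
# After: copy the tree instead of linking it.
shutil.copytree("csrc", r"deepspeed\ops\csrc", dirs_exist_ok=True)
```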
---------
Co-authored-by: Costin Eseanu <costineseanu@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
The order of parameters in the `create_dir_symlink` method looks wrong.
Because of this, we get the error "PermissionError: [WinError 5] Denied
access: '.\\deepspeed\\ops\\csrc'" when installing deepspeed >= 0.4.0 in a
Windows environment.
Please check this out @eltonzheng and @jeffra.
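For context, `os.symlink(src, dst)` creates the link at `dst` pointing to
`src`; swapping the arguments makes the call try to create a link at the
already-existing source directory, which surfaces on Windows as WinError 5.
A hypothetical illustration (the actual `create_dir_symlink` code differs):
```python
# Hypothetical illustration of the argument-order bug; paths are taken
# from the error message above.
import os

src = "csrc"                   # existing source directory
dst = r".\deepspeed\ops\csrc"  # link that should be created

# Wrong order: tries to create the link *at* the existing directory,
# which fails with "PermissionError: [WinError 5] ...".
# os.symlink(dst, src, target_is_directory=True)

# Correct order: create dst as a link pointing to src.
os.symlink(src, dst, target_is_directory=True)
```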
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
This is a WIP PR that makes op builder detection adapt to accelerator
changes. It is a follow-up to
https://github.com/microsoft/DeepSpeed/issues/5173
Currently, DeepSpeed generates `installed_ops` and `compatible_ops` at
setup time. If the system changes to a different accelerator at DeepSpeed
launch time, these two lists contain incorrect information.
This PR intends to solve this problem with more flexible op detection.
* For `installed_ops`, DeepSpeed should disable all installed ops if the
accelerator detected at setup time differs from the one detected at
launch time.
* For `compatible_ops`, DeepSpeed should refresh the list on each launch
to avoid the impact of an accelerator change.
As a first step, the nv-inference workflow is temporarily changed to
emulate the scenario in which the system is set up with CPU_Accelerator
and then launched with CUDA_Accelerator, and CPU_Accelerator is modified
so that Intel Extension for PyTorch and the oneCCL binding for PyTorch
are no longer mandatory. Starting from here, we can reconstruct
`installed_ops` and `compatible_ops` to follow the design above.
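A sketch of that design (names such as `SETUP_TIME_ACCELERATOR` and the
helper functions are hypothetical; `is_compatible()` and `NAME` mirror
DeepSpeed's op builder interface):
```python
# Hypothetical sketch of the detection design above; these names are
# illustrative, not DeepSpeed's actual code.
from deepspeed.accelerator import get_accelerator

SETUP_TIME_ACCELERATOR = "cpu"  # recorded when setup.py ran

def effective_installed_ops(installed_ops):
    # Disable every precompiled op if the accelerator changed since setup.
    if get_accelerator().device_name() != SETUP_TIME_ACCELERATOR:
        return {op: False for op in installed_ops}
    return installed_ops

def refreshed_compatible_ops(op_builders):
    # Recompute compatibility on every launch instead of trusting the
    # list frozen at setup time.
    return {b.NAME: b().is_compatible() for b in op_builders}
```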
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
#5192 reports an issue with the latest DeepSpeed release (0.13.3)
related to pre-compilation and the recently re-enabled `ninja` support
in #5088. This reverts to disabling `ninja` by default, but users can
still enable it with `DS_ENABLE_NINJA=1` until we can debug further and
understand the problem.
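The opt-in gate might look like this sketch (where exactly the guard
lives in DeepSpeed's build path is an assumption;
`BuildExtension.with_options(use_ninja=...)` is PyTorch's real knob):
```python
# Sketch of the opt-in gate: ninja stays off unless DS_ENABLE_NINJA=1.
import os
from torch.utils.cpp_extension import BuildExtension

use_ninja = os.environ.get("DS_ENABLE_NINJA", "0") == "1"
cmdclass = {"build_ext": BuildExtension.with_options(use_ninja=use_ninja)}
```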
* skip cpu support unimplemented error and update cpu inference workflow
* add torch.bfloat16 to cuda_accelerator
* remove UtilsBuilder skip
* fused adam can build
* use cpu adam to implement fused adam
* enable zero stage 1 and 2 for synchronized accelerator (a.k.a. CPU)
* remove unused parameters
* remove skip FusedAdamBuilder; add suported_dtypes
* fix format
* Revert "fix format"
Revert "remove skip FusedAdamBuilder; add suported_dtypes"
Revert "remove unused parameters"
Revert "enable zero stage 1 and 2 for synchronized accelerator (a.k.a. CPU)"
Revert "use cpu adam to implement fused adam"
Revert "fused adam can build"
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Ma, Guokai <guokai.ma@intel.com>
* Abstract accelerator (step 2)
* more flex op_builder path for both installation and runtime
* add SpatialInferenceBuilder into cuda_accelerator.py
* use reflection to make cuda_accelerator adapt to CUDA op builder changes automatically (see the sketch after this list)
* clean up deepspeed/__init__.py
* add comments in cuda_accelerator for no torch path
* Update deepspeed/env_report.py
Change env_report.py according to suggestion
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
* reduce the range of try...except for better code clarity
* Add porting for deepspeed/ops/random_ltd/dropping_utils.py
* move accelerator to top directory and create symlink under deepspeed
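The reflection bullet above might translate to something like this
sketch; the `op_builder` module path and the `Builder`-suffix convention
match DeepSpeed's layout, but the discovery code itself is illustrative:
```python
# Illustrative sketch of reflection-based op builder discovery; the
# filtering convention here is assumed, not DeepSpeed's exact code.
import importlib
import inspect

def discover_op_builders(pkg_name="op_builder"):
    pkg = importlib.import_module(pkg_name)
    builders = {}
    for name, obj in inspect.getmembers(pkg, inspect.isclass):
        # Pick up anything the package exports that follows the
        # FooBuilder naming convention.
        if name.endswith("Builder") and obj.__module__.startswith(pkg_name):
            builders[name] = obj
    return builders

# With this, adding a new builder to op_builder requires no edit to
# cuda_accelerator.py -- it is picked up automatically.
```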
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* remove any cupy install when setting up environments
* revert previous changes to run on cu111 runners
* fix for when no cupy is installed
* remove cupy uninstall for workflows not using latest torch version
* update to cu116 for inference tests
* fix pip uninstall line
* move python environment list to after DS install
* remove cupy uninstall
* re-add --forked
* fix how we get cupy version (should be based on nvcc version)
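Deriving the wheel name from `nvcc` might look like the sketch below; the
regex and the `cupy-cudaXYZ` naming (e.g. `cupy-cuda116` for CUDA 11.6)
follow cupy's historical wheel scheme:
```python
# Sketch: pick the cupy wheel from nvcc's reported CUDA release rather
# than torch's CUDA version; regex and naming scheme assumed from
# cupy's historical cupy-cudaXYZ wheels.
import re
import subprocess

out = subprocess.check_output(["nvcc", "--version"], text=True)
release = re.search(r"release (\d+)\.(\d+)", out)
major, minor = release.groups()
print(f"cupy-cuda{major}{minor}")  # e.g. "cupy-cuda116"
```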