mirror of
https://github.com/deepspeedai/DeepSpeed.git
synced 2025-10-20 15:33:51 +08:00
This PR introduces *DeepCompile*, a new feature that efficiently integrates compiler optimizations with other DeepSpeed features. DeepCompile utilizes torch's dynamo to capture the computation graph and modifies it to incorporate DeepSpeed’s optimizations seamlessly.

Currently, DeepCompile supports ZeRO-1 and ZeRO-3, with enhancements such as proactive prefetching and selective unsharding to improve performance. (More details will be added later.)

---------

Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: zafarsadiq <zafarsadiq120@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
18 lines
319 B
Batchfile
@echo off

set CUDA_HOME=%CUDA_PATH%
set DISTUTILS_USE_SDK=1

set DS_BUILD_AIO=0
set DS_BUILD_CUTLASS_OPS=0
set DS_BUILD_EVOFORMER_ATTN=0
set DS_BUILD_FP_QUANTIZER=0
set DS_BUILD_GDS=0
set DS_BUILD_RAGGED_DEVICE_OPS=0
set DS_BUILD_SPARSE_ATTN=0
set DS_BUILD_DEEP_COMPILE=0

python -m build --wheel --no-isolation

:end
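
For readers who want to drive the same build from Python (for example in a CI step), the script above can be approximated as follows. This is a minimal sketch, not part of the repository: the names `DISABLED_OPS`, `build_env`, `build_command`, and `main` are hypothetical, and the handling of a missing `CUDA_PATH` assumes the usual batch-file behavior in which an undefined variable expands to an empty string and `set NAME=` removes `NAME`.

```python
import os
import subprocess
import sys

# Ops that the batch script turns off with DS_BUILD_<OP>=0.
DISABLED_OPS = [
    "AIO",
    "CUTLASS_OPS",
    "EVOFORMER_ATTN",
    "FP_QUANTIZER",
    "GDS",
    "RAGGED_DEVICE_OPS",
    "SPARSE_ATTN",
    "DEEP_COMPILE",
]


def build_env(base_env):
    """Return a copy of base_env with the variables the batch script sets."""
    env = dict(base_env)
    cuda_path = env.get("CUDA_PATH")
    if cuda_path:
        # Mirrors: set CUDA_HOME=%CUDA_PATH%
        env["CUDA_HOME"] = cuda_path
    else:
        # Assumption: with CUDA_PATH undefined, the batch line clears CUDA_HOME.
        env.pop("CUDA_HOME", None)
    env["DISTUTILS_USE_SDK"] = "1"
    for op in DISABLED_OPS:
        env[f"DS_BUILD_{op}"] = "0"
    return env


def build_command(python=sys.executable):
    """Return the argv equivalent of: python -m build --wheel --no-isolation"""
    return [python, "-m", "build", "--wheel", "--no-isolation"]


def main():
    """Run the wheel build with the adjusted environment; return its exit code."""
    return subprocess.call(build_command(), env=build_env(os.environ))
```

Calling `main()` from the repository root would start the wheel build. Keeping `build_env` separate from the subprocess call makes it possible to inspect or override individual `DS_BUILD_*` values before anything is executed.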