4f803852ac
Op_builder->is_compatible quiet warning ( #6093 )
...
Set the default value of verbose in op_builder/xxx.py/is_compatible() to
False to quiet warnings.
Add a verbose check before each
op_builder/xxx.py/is_compatible()/self.warning(...) call.
Otherwise the verbose arg has no effect.
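A minimal sketch of the pattern described above, not the actual DeepSpeed code. The class name OpBuilderSketch, the cuda_major attribute, the messages list, and the warning text are all hypothetical, introduced only for illustration; the source confirms only that is_compatible() takes a verbose argument defaulting to False and that self.warning(...) calls are guarded by it.

```python
class OpBuilderSketch:
    """Hypothetical stand-in for an op builder; not DeepSpeed's real class."""

    def __init__(self, cuda_major):
        # Assumed input for the example: the detected CUDA major version.
        self.cuda_major = cuda_major
        # Collect warnings in a list so the behaviour is easy to inspect.
        self.messages = []

    def warning(self, msg):
        # A real builder would log here; this sketch just records the message.
        self.messages.append(msg)

    def is_compatible(self, verbose=False):
        # verbose defaults to False, so callers get no warnings unless they opt in.
        compatible = True
        if self.cuda_major < 11:
            # Guard the warning: without this check the verbose arg has no effect.
            if verbose:
                self.warning("CUDA 11+ is required (illustrative message)")
            compatible = False
        return compatible


builder = OpBuilderSketch(cuda_major=10)
quiet_result = builder.is_compatible()            # no warning recorded
quiet_count = len(builder.messages)
loud_result = builder.is_compatible(verbose=True)  # warning recorded
loud_count = len(builder.messages)
```

The compatibility result is the same in both calls; only the logging differs, which is the point of guarding each warning rather than skipping the check itself.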
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com >
2024-09-05 17:03:34 +00:00
0f2d485c27
Log operator warnings only in verbose mode ( #5917 )
2024-08-14 01:10:17 +00:00
0a61d5d664
Hybrid Engine Refactor and Llama Inference Support ( #3425 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
2023-05-03 17:20:07 -07:00
b361c72761
Update DeepSpeed copyright license to Apache 2.0 ( #3111 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
2023-03-30 17:14:38 -07:00
91d63e0228
update formatter version and style settings ( #3098 )
2023-03-27 07:55:19 -04:00
da84e60d98
add missing license info to top of all source code ( #2889 )
...
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com >
Co-authored-by: Conglong Li <conglong.li@gmail.com >
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com >
2023-02-27 11:20:41 -08:00
b841628207
Drop Maxwell Support ( #2574 )
...
* Officially drop Maxwell support
* Formatting
* Comparison mismatch fix
2022-12-06 10:42:32 -08:00
e7e7595502
Stable Diffusion Enhancements ( #2491 )
...
Co-authored-by: cmikeh2 <connorholmes@microsoft.com >
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
Co-authored-by: Reza Yazdani <reyazda@microsoft.com >
2022-11-09 17:40:59 -08:00
c84bca37b1
Memory Access Utility ( #2276 )
...
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com >
2022-09-01 12:49:51 -07:00
8b2a63717a
Add support of OPT models ( #2205 )
...
* add opt replace policy
* simplify inf. api
* fix opt replace policy
* fix use-cash & add relu
* Add support of custom MLP act. function
* Revert "simplify inf. api"
This reverts commit 9e910fcbd5471dec9b3c92008426f5ba590bf0b6.
* fix the inference API (temp. solution)
* fix code formatting
* add unit tests for OPT models.
* refactor pre-attention layer norm configuration
* add support of opt-350m model
* refactor the HF model config initialization
* fix hf model config issue
Co-authored-by: Reza Yazdani <reyazda@microsoft.com >
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com >
2022-08-15 07:31:51 -07:00
776e36988d
delay torch import for inference compatibility check ( #2167 )
2022-08-02 10:18:04 -07:00
46401b3884
[zero-3] shutdown zero.Init from within ds.init ( #2150 )
2022-07-29 17:24:46 -07:00
63f470eeb6
prevent cuda 10 builds of inference kernels on ampere ( #2157 )
2022-07-29 15:00:12 -07:00
8164ea9e6d
Fixing several bugs in the inference-api and the kernels ( #1951 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
2022-05-24 13:27:50 -07:00
b4fcd98ff0
Inference PP changes for neox ( #1899 )
...
Co-authored-by: Reza Yazdani <reyazda@microsoft.com >
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com >
2022-04-26 11:50:38 -07:00
d3cad05105
fixing the inference build path when pre-building the inference op ( #1755 )
2022-02-11 06:56:53 +00:00
289c3f9ba4
GPT-J inference support ( #1670 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
2022-01-08 02:40:31 +00:00
da1fe2f82c
Remove hard torch dependency at install ( #1166 )
2021-06-16 14:18:37 -07:00
71ecf7e625
Add Windows support in README, use c++17 on Windows to support latest VC & cuda build tool ( #1151 )
...
* Add Windows support in README, use c++17 on Windows to support latest vc build tool
* Add detailed cpp build tools version in README
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
2021-06-09 17:24:43 -07:00
ed3de0c21b
Quantization + inference release ( #1091 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com >
Co-authored-by: eltonzheng <eltonz@microsoft.com >
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com >
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com >
Co-authored-by: Reza Yazdani <reyazda@microsoft.com >
Co-authored-by: Elton Zheng <eltonz@microsoft.com >
Co-authored-by: Arash Ashari <arashari@microsoft.com >
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com >
Co-authored-by: niumanar <60243342+niumanar@users.noreply.github.com >
2021-05-24 01:10:39 -07:00