20 Commits

Author SHA1 Message Date
4f803852ac Op_builder->is_compatible quiet warning (#6093)
Set the default value of verbose in op_builder/xxx.py/is_compatible() to
False to quiet warnings.
Add a verbose check before
op_builder/xxx.py/is_compatible()/self.warning(...);
otherwise the verbose arg has no effect.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2024-09-05 17:03:34 +00:00
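The pattern described in 4f803852ac can be sketched roughly as follows. This is a minimal illustration, not the actual DeepSpeed code: the class name ExampleOpBuilder, the toolchain_ok flag, and the warning text are hypothetical; only is_compatible(), its verbose argument, and self.warning(...) are taken from the commit message.

```python
class ExampleOpBuilder:
    """Hypothetical stand-in for an op builder (assumption, not DeepSpeed's class)."""

    def __init__(self, toolchain_ok):
        # Hypothetical flag standing in for whatever environment checks a real builder runs.
        self.toolchain_ok = toolchain_ok

    def warning(self, msg):
        # Simplified warning sink; a real builder would likely use a logger.
        print(f"[WARNING] {msg}")

    def is_compatible(self, verbose=False):
        # verbose defaults to False, so callers get no warnings unless they ask for them.
        if not self.toolchain_ok:
            # The warning is guarded by verbose; without this guard the
            # verbose argument would have no effect.
            if verbose:
                self.warning("toolchain requirement not met")
            return False
        return True
```

With this shape, ExampleOpBuilder(toolchain_ok=False).is_compatible() returns False silently, while passing verbose=True also emits the warning.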
0f2d485c27 Log operator warnings only in verbose mode (#5917) 2024-08-14 01:10:17 +00:00
0a61d5d664 Hybrid Engine Refactor and Llama Inference Support (#3425)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-05-03 17:20:07 -07:00
b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-03-30 17:14:38 -07:00
91d63e0228 update formatter version and style settings (#3098) 2023-03-27 07:55:19 -04:00
da84e60d98 add missing license info to top of all source code (#2889)
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2023-02-27 11:20:41 -08:00
b841628207 Drop Maxwell Support (#2574)
* Officially drop Maxwell support

* Formatting

* Comparison mismatch fix
2022-12-06 10:42:32 -08:00
e7e7595502 Stable Diffusion Enhancements (#2491)
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
2022-11-09 17:40:59 -08:00
c84bca37b1 Memory Access Utility (#2276)
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
2022-09-01 12:49:51 -07:00
8b2a63717a Add support of OPT models (#2205)
* add opt replace policy

* simplify inf. api

* fix opt replace policy

* fix use-cache & add relu

* Add support of custom MLP act. function

* Revert "simplify inf. api"

This reverts commit 9e910fcbd5471dec9b3c92008426f5ba590bf0b6.

* fix the inference API (temp. solution)

* fix code formatting

* add unit tests for OPT models.

* refactor pre-attention layer norm configuration

* add support of opt-350m model

* refactor the HF model config initialization

* fix hf model config issue

Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
2022-08-15 07:31:51 -07:00
776e36988d delay torch import for inference compatibility check (#2167) 2022-08-02 10:18:04 -07:00
46401b3884 [zero-3] shutdown zero.Init from within ds.init (#2150) 2022-07-29 17:24:46 -07:00
63f470eeb6 prevent cuda 10 builds of inference kernels on ampere (#2157) 2022-07-29 15:00:12 -07:00
8164ea9e6d Fixing several bugs in the inference-api and the kernels (#1951)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-05-24 13:27:50 -07:00
b4fcd98ff0 Inference PP changes for neox (#1899)
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
2022-04-26 11:50:38 -07:00
d3cad05105 fixing the inference build path when pre-building the inference op (#1755) 2022-02-11 06:56:53 +00:00
289c3f9ba4 GPT-J inference support (#1670)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-01-08 02:40:31 +00:00
da1fe2f82c Remove hard torch dependency at install (#1166) 2021-06-16 14:18:37 -07:00
71ecf7e625 Add Windows support in README, use c++17 on Windows to support latest VC & cuda build tool (#1151)
* Add Windows support in README, use c++17 on Windows to support latest vc build tool

* Add detailed cpp build tools version in README

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2021-06-09 17:24:43 -07:00
ed3de0c21b Quantization + inference release (#1091)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Elton Zheng <eltonz@microsoft.com>
Co-authored-by: Arash Ashari <arashari@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: niumanar <60243342+niumanar@users.noreply.github.com>
2021-05-24 01:10:39 -07:00