Reorganize MoE news and tutorials. (#1708)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
@@ -13,16 +13,16 @@ Remove until pypi issue is resolved: https://status.python.org/incidents/2jj696s
 [](https://pepy.tech/project/deepspeed)
 -->
 ## Latest News
-* [2021/12/09] [DeepSpeed-MoE for NLG: Reducing the training cost of language models by 5 times](https://www.deepspeed.ai/news/2021/12/09/deepspeed-moe-nlg.html)
-* [2021/08/18] [DeepSpeed powers 8x larger MoE model training with high performance](https://www.microsoft.com/en-us/research/blog/deepspeed-powers-8x-larger-moe-model-training-with-high-performance/)
-* [Mixture of Experts (MoE) tutorial](https://www.deepspeed.ai/tutorials/mixture-of-experts/).
+* [2022/1/19] [DeepSpeed: Advancing MoE inference and training to power next-generation AI scale](https://www.microsoft.com/en-us/research/blog/deepspeed-advancing-moe-inference-and-training-to-power-next-generation-ai-scale/)
+* [Mixture of Experts (MoE) for NLG tutorial](https://www.deepspeed.ai/tutorials/mixture-of-experts-nlg/).
+* [Mixture of Experts (MoE) Inference tutorial](https://www.deepspeed.ai/tutorials/moe-inference-tutorial).
 * [2021/11/15] [Autotuning: Automatically discover the optimal DeepSpeed configuration that delivers good training speed](https://www.deepspeed.ai/news/2021/11/15/autotuning.html)
 * [2021/10/11] [Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model](https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/)
 * Read more on how to [train large models with DeepSpeed](https://www.deepspeed.ai/tutorials/large-models-w-deepspeed/)
+* [2021/08/18] [DeepSpeed powers 8x larger MoE model training with high performance](https://www.microsoft.com/en-us/research/blog/deepspeed-powers-8x-larger-moe-model-training-with-high-performance/)
 * [2021/08/16] [Curriculum learning: a regularization method for stable and 3.3x faster GPT-2 pre-training with 8x/4x larger batch size/learning rate](https://www.deepspeed.ai/tutorials/curriculum-learning/)

-### DeepSpeed is hiring, [come join us!](https://careers.microsoft.com/us/en/search-results?keywords=deepspeed%20open%20source)
+### DeepSpeed is hiring, [come join us!](https://careers.microsoft.com/us/en/search-results?keywords=http:%2F%2Fdeepspeed.ai)
 ---

 [DeepSpeed](https://www.deepspeed.ai/) is a deep learning optimization
@@ -91,4 +91,4 @@ breadcrumbs: true

 press_release_v3: https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/
 press_release_v5: https://www.microsoft.com/en-us/research/blog/deepspeed-powers-8x-larger-moe-model-training-with-high-performance/
-press_release_v6: https://www.microsoft.com
+press_release_v6: https://www.microsoft.com/en-us/research/blog/deepspeed-advancing-moe-inference-and-training-to-power-next-generation-ai-scale/
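These `press_release_*` entries are plain keys in the Jekyll site config, so docs pages and layouts can reference them as `site.*` variables instead of hard-coding the blog URLs. A minimal sketch of that usage, assuming a hypothetical docs page (the Liquid below is illustrative and not part of this commit):

```liquid
{% comment %} Illustrative only: press_release_* are site-config keys exposed as site.* variables {% endcomment %}
Read the [MoE inference and training announcement]({{ site.press_release_v6 }})
and the earlier [8x larger MoE training post]({{ site.press_release_v5 }}).
```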
docs/_posts/2022-01-19-moe-inference.md (new file, 9 lines)
@@ -0,0 +1,9 @@
+---
+layout: single
+title: "DeepSpeed: Advancing MoE inference and training to power next-generation AI scale"
+excerpt: ""
+categories: news
+link: https://www.microsoft.com/en-us/research/blog/deepspeed-advancing-moe-inference-and-training-to-power-next-generation-ai-scale/
+new_post: true
+date: 2022-01-19 00:00:00
+---
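The front matter above is all this post contains: `categories: news` files it under the news category, `link` points at the external blog post, and `new_post`/`date` presumably flag it for the news listing and set its ordering. A hedged sketch of how a Jekyll layout could render such posts (the template below is illustrative, not taken from this repository):

```liquid
{% comment %} Illustrative only: list news posts, preferring the external link when one is set {% endcomment %}
{% for post in site.categories.news %}
  {% if post.new_post %}
* [{{ post.date | date: "%Y/%m/%d" }}] [{{ post.title }}]({{ post.link | default: post.url }})
  {% endif %}
{% endfor %}
```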
@@ -6,7 +6,7 @@ toc_label: "Contents"

 <b> DeepSpeed+Megatron trained the world's most powerful language model: [MT-530B](https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/) <b>

-<b> DeepSpeed is hiring, [come join us!](https://careers.microsoft.com/us/en/search-results?keywords=deepspeed%20open%20source) </b>
+<b> DeepSpeed is hiring, [come join us!](https://careers.microsoft.com/us/en/search-results?keywords=http:%2F%2Fdeepspeed.ai) </b>

 DeepSpeed is a deep learning optimization library that makes distributed training easy,
 efficient, and effective.
@@ -32,7 +32,9 @@ initiative to enable next-generation AI capabilities at scale, where you can fin
 information [here](https://innovation.microsoft.com/en-us/exploring-ai-at-scale).

 # What's New?
-* [2021/12/09] [DeepSpeed-MoE for NLG: Reducing the training cost of language models by 5 times](https://www.deepspeed.ai/news/2021/12/09/deepspeed-moe-nlg.html)
+* [2022/1/19] [DeepSpeed: Advancing MoE inference and training to power next-generation AI scale](https://www.microsoft.com/en-us/research/blog/deepspeed-advancing-moe-inference-and-training-to-power-next-generation-ai-scale/)
+* [Mixture of Experts (MoE) for NLG tutorial](https://www.deepspeed.ai/tutorials/mixture-of-experts-nlg/).
+* [Mixture of Experts (MoE) Inference tutorial](https://www.deepspeed.ai/tutorials/moe-inference-tutorial).
 * [2021/11/15] [Autotuning: Automatically discover the optimal DeepSpeed configuration that delivers good training speed](https://www.deepspeed.ai/news/2021/11/15/autotuning.html)
 * [2021/10/11] [Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model](https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/)
 * Read more on how to [train large models with DeepSpeed](https://www.deepspeed.ai/tutorials/large-models-w-deepspeed/)