A high-throughput and memory-efficient inference and serving engine for LLMs
Updated 2025-10-20 03:47:19 +08:00
Train transformer language models with reinforcement learning.
Updated 2025-10-20 01:27:03 +08:00
Community-maintained hardware plugin for vLLM on Ascend
Updated 2025-10-19 17:06:05 +08:00
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Updated 2025-10-15 09:58:53 +08:00
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Updated 2025-10-14 20:11:32 +08:00
A high-throughput and memory-efficient inference and serving engine for LLMs
Updated 2025-10-11 16:48:30 +08:00