A high-throughput and memory-efficient inference and serving engine for LLMs
Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt, gpt-oss, inference, kimi, llama, llm, llm-serving, model-serving, moe, openai, pytorch, qwen, qwen3, tpu, transformer
Updated 2025-10-20 03:47:19 +08:00
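The entry above is vLLM, whose memory efficiency rests on paged KV-cache management (PagedAttention): token slots are grouped into fixed-size blocks, and each sequence keeps a table of block indices instead of one contiguous buffer. A minimal pure-Python sketch of that bookkeeping idea — class and method names are illustrative, not vLLM's actual API:

```python
BLOCK_SIZE = 4  # token slots per block (hypothetical value)

class PagedKVCache:
    """Toy block-table allocator in the spirit of vLLM's paged KV cache."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}  # sequence id -> list of block indices
        self.lengths = {}       # sequence id -> tokens stored so far

    def append_token(self, seq_id):
        """Reserve a slot for one new token, allocating a block if needed."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:      # current block full (or none yet)
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = length + 1
        # Return (block index, offset within block) for this token's slot.
        return table[length // BLOCK_SIZE], length % BLOCK_SIZE

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8)
for _ in range(6):                        # 6 tokens -> ceil(6/4) = 2 blocks
    cache.append_token("seq-0")
print(len(cache.block_tables["seq-0"]))   # -> 2
```

Because blocks are allocated on demand and recycled on completion, short and long sequences can share one pool without per-sequence worst-case reservations — the core of the engine's "memory-efficient" claim.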
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Updated 2025-10-20 03:20:45 +08:00
Train transformer language models with reinforcement learning.
Updated 2025-10-20 01:27:03 +08:00
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Topics: bert, deep-learning, flax, hacktoberfest, jax, language-model, language-models, machine-learning, model-hub, natural-language-processing, nlp, nlp-library, pretrained-models, python, pytorch, pytorch-transformers, seq2seq, speech-recognition, tensorflow, transformer
Updated 2025-10-19 19:36:25 +08:00
Community maintained hardware plugin for vLLM on Ascend
Updated 2025-10-19 17:06:05 +08:00
oneAPI Deep Neural Network Library (oneDNN)
Topics: aarch64, amx, avx512, bfloat16, cpp, deep-learning, deep-neural-networks, library, oneapi, onednn, openmp, performance, sycl, tbb, vnni, x64, x86-64, xe-architecture
Updated 2025-10-19 13:16:23 +08:00
verl: Volcano Engine Reinforcement Learning for LLMs
Updated 2025-10-19 08:50:34 +08:00
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Topics: agent, ai, chatglm, fine-tuning, gpt, instruction-tuning, language-model, large-language-models, llama, llama3, llm, lora, mistral, moe, peft, qlora, quantization, qwen, rlhf, transformers
Updated 2025-10-18 18:02:14 +08:00
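The "lora"/"qlora"/"peft" tags on the fine-tuning entry above refer to low-rank adaptation: the frozen pretrained weight W is augmented with a trainable low-rank product B·A, scaled by alpha/r, so y = Wx + (alpha/r)·B(Ax). A pure-Python sketch of that forward pass (toy matrices for illustration; real implementations use framework tensors):

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    base = matvec(W, x)               # frozen pretrained path: W x
    delta = matvec(B, matvec(A, x))   # trainable low-rank path: B (A x)
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

# 2x2 frozen weight with a rank-1 adapter (r = 1).
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 1.0]]        # shape: r x in_features
B = [[0.5], [0.5]]      # shape: out_features x r

y = lora_forward(W, A, B, x=[2.0, 4.0], alpha=2, r=1)
print(y)  # -> [8.0, 10.0]
```

Only A and B are trained, so the number of trainable parameters scales with r rather than with the full weight matrix — which is why a single framework can fine-tune 100+ large models on modest hardware.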
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Updated 2025-10-18 11:58:07 +08:00
Load compute kernels from the Hub
Updated 2025-10-17 23:26:19 +08:00
AlphaFold 3 inference pipeline.
Updated 2025-10-17 23:06:15 +08:00
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Updated 2025-10-17 22:24:46 +08:00
Official git repository for Biopython (originally converted from CVS)
Topics: bioinformatics, biopython, dna, genomics, phylogenetics, protein, protein-structure, python, sequence-alignment
Updated 2025-10-15 21:49:15 +08:00
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Topics: billion-parameters, compression, data-parallelism, deep-learning, gpu, inference, machine-learning, mixture-of-experts, model-parallelism, pipeline-parallelism, pytorch, trillion-parameters, zero
Updated 2025-10-15 09:58:53 +08:00
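The "zero" tag on the DeepSpeed entry above refers to ZeRO, which shards optimizer state (and at higher stages, gradients and parameters) across data-parallel ranks instead of replicating it on every one. A minimal sketch of the stage-1-style partitioning decision — assigning parameter tensors to ranks while balancing the element count each rank owns (illustrative only, not DeepSpeed's actual algorithm or API):

```python
def partition_params(param_sizes, world_size):
    """Greedily assign parameter tensors to ranks, balancing total elements.

    Returns (shards, loads): shards[r] is the list of tensor indices owned
    by rank r, loads[r] the total element count that rank is responsible for.
    """
    shards = [[] for _ in range(world_size)]
    loads = [0] * world_size
    # Place the largest tensors first so the greedy balance stays tight.
    for idx, size in sorted(enumerate(param_sizes), key=lambda p: -p[1]):
        rank = loads.index(min(loads))    # least-loaded rank owns this tensor
        shards[rank].append(idx)
        loads[rank] += size
    return shards, loads

sizes = [400, 100, 300, 200]   # element counts of four parameter tensors
shards, loads = partition_params(sizes, world_size=2)
print(loads)  # -> [500, 500]
```

Each rank then holds optimizer state only for its shard and broadcasts updated parameters after the step, cutting per-rank optimizer memory roughly by the data-parallel degree.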
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Updated 2025-10-14 20:11:32 +08:00
A high-throughput and memory-efficient inference and serving engine for LLMs
Topics: amd, cuda, deepseek, gpt, hpu, inference, inferentia, llama, llm, llm-serving, llmops, mlops, model-serving, pytorch, qwen, rocm, tpu, trainium, transformer, xpu
Updated 2025-10-11 16:48:30 +08:00