|
67c14906aa
|
Update PyTorch to 2.8.0 (#20358)
Signed-off-by: Huy Do <huydhn@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-08-29 18:57:35 +08:00 |
|
|
644d57d531
|
[Model] Add Ernie4.5 VL Model Support (#22514)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2025-08-26 21:02:55 -07:00 |
|
|
6dab89b8ec
|
[Docs] Fix math rendering in docs (#23676)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-26 18:47:08 -07:00 |
|
|
379f828fba
|
[Docs] Reduce requirements for docs build (#23651)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-26 15:43:28 +00:00 |
|
|
39971db3aa
|
Frontend: Adding LM Format Enforcer support to V1 engine (#22564)
Signed-off-by: Noam Gat <noamgat@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-08-24 19:31:22 -07:00 |
|
|
fa78de9dc3
|
Quantization: support FP4 quantized models on AMD CDNA2/CDNA3 GPUs (#22527)
Signed-off-by: feng <fengli1702@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-08-22 20:53:21 -06:00 |
|
|
613a23b57f
|
[Bugfix]: Installing dev environment due to pydantic incompatible version (#23353)
Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
|
2025-08-22 16:22:29 +00:00 |
|
|
51a215300b
|
[Fix] Bump triton version in rocm-build requirements (#21630)
Signed-off-by: Burkhard Ringlein <ngl@zurich.ibm.com>
|
2025-08-22 15:13:39 +00:00 |
|
|
8896eb72eb
|
[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-22 10:56:57 +08:00 |
|
|
c32e6ad1f6
|
[Quantization] Bump Compressed Tensors Version (#23202)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-08-20 00:39:28 +00:00 |
|
|
d6a1a20973
|
[CI/Build] Update transformers to v4.55.2 (#23093)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-19 10:06:17 -07:00 |
|
|
78dba404ad
|
[Hardware][IBM Z]Enable v1 for s390x and s390x dockerfile fixes (#22725)
Signed-off-by: Nikhil Suryawanshi <suryawanshin74@gmail.com>
|
2025-08-19 04:40:37 +00:00 |
|
|
498259ccce
|
Install tpu_info==0.4.0 to fix core dump for TPU (#23135)
|
2025-08-18 16:23:33 -07:00 |
|
|
ac6eb49de3
|
fix: OpenAI SDK compat (ResponseTextConfig) (#23126)
Signed-off-by: breno.skuk <breno.skuk@hcompany.ai>
Signed-off-by: Breno Baldas Skuk <breno.skuk@hcompany.ai>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-08-18 15:22:59 -07:00 |
|
|
bf756321c7
|
[CI Bugfix] Pin openai<1.100 to unblock CI (#23118)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-08-18 12:14:01 -07:00 |
|
|
bc1d02ac85
|
[Docs] Add comprehensive CLI reference for all large vllm subcommands (#22601)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-11 00:13:33 -07:00 |
|
|
5a16fa614c
|
[Model] Gemma3n MM (#20495)
Signed-off-by: ShriKode <shrikode@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: ShriKode <shrikode@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
|
2025-08-09 09:56:25 -07:00 |
|
|
8a0ffd6285
|
Remove mamba_ssm from vLLM requirements; install inside test container using --no-build-isolation (#22541)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-08 23:05:32 -07:00 |
|
|
81c57f60a2
|
[XPU] upgrade torch 2.8 on for XPU (#22300)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-08-08 17:03:45 -07:00 |
|
|
904063907c
|
[Misc] fix openai version (#22485)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-08-08 01:12:54 -07:00 |
|
|
434d2f3f7a
|
[Docs] Add missing dependency for docs build (#22435)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-07 00:22:07 -07:00 |
|
|
ecbea55ca2
|
Update hf_xet pin to resolve hangs (#22356)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-06 20:31:41 -07:00 |
|
|
134a8ee8fd
|
[gpt-oss] Add openai-harmony as default dependency (#22332)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: Minseok Lee <47620120+minseokl@users.noreply.github.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
|
2025-08-06 00:10:14 -07:00 |
|
|
796bae07c5
|
Update transformers to v4.55 (#21931)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-05 22:56:14 -07:00 |
|
|
dd16bdc798
|
Increase openai-python version (#22316)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-05 21:43:21 -07:00 |
|
|
2ff46b8826
|
[Misc] Bump ray to 2.48.0 (#22123)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
|
2025-08-02 19:42:00 -07:00 |
|
|
0f81b310db
|
[Misc] Remove upper bound in openai package version (#22060)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-01 02:11:40 -07:00 |
|
|
da31f6ad3d
|
Revert precompile wheel changes (#22055)
|
2025-08-01 08:26:24 +00:00 |
|
|
0bd409cf01
|
Move flashinfer-python to optional extra vllm[flashinfer] (#21959)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-31 18:02:11 -07:00 |
|
|
53c21e492e
|
Update torch_xla pin to 20250730 (#21956)
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
|
2025-07-31 17:26:43 +00:00 |
|
|
58bb902186
|
fix(setup): improve precompiled wheel setup for Docker builds (#22025)
Signed-off-by: dougbtv <dosmith@redhat.com>
|
2025-07-31 09:52:48 -07:00 |
|
|
88edf5994c
|
[Docs] Reduce the size of the built docs (#21920)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-30 07:35:08 -07:00 |
|
|
a33ea28b1b
|
Add flashinfer_python to CUDA wheel requirements (#21389)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-29 12:51:58 -07:00 |
|
|
31084b3b1f
|
[Bugfix][CI/Build] Update peft version in test requirement (#21729)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-28 06:17:43 -07:00 |
|
|
1395dd9c28
|
[Docs] Add revision date to rendered docs (#21752)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-28 06:12:46 -07:00 |
|
|
f1b286b2fb
|
[TPU] Update ptxla nightly version to 20250724 (#21555)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
|
2025-07-25 17:09:00 -07:00 |
|
|
396ee94180
|
[CI] Unifying Dockerfiles for ARM and X86 Builds (#21343)
Signed-off-by: Kebe <mail@kebe7jun.com>
|
2025-07-25 07:33:56 -07:00 |
|
|
6066284914
|
[P/D] Support CPU Transfer in NixlConnector (#18293)
Signed-off-by: Juncheng Gu <juncgu@gmail.com>
Signed-off-by: Richard Liu <ricliu@google.com>
Co-authored-by: Richard Liu <39319471+richardsliu@users.noreply.github.com>
Co-authored-by: Richard Liu <ricliu@google.com>
|
2025-07-24 17:58:42 +01:00 |
|
|
5a19a6c670
|
[Fix] Update mamba_ssm to 2.2.5 (#21421)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-07-24 03:25:41 -07:00 |
|
|
6da0078523
|
[Feat] Allow custom naming of vLLM processes (#21445)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-07-24 03:15:23 -07:00 |
|
|
6d8d0a24c0
|
Add think chunk (#21333)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-07-23 21:51:32 -07:00 |
|
|
8560a5b258
|
[Core][Model] PrithviMAE Enablement on vLLM v1 engine (#20577)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
|
2025-07-23 11:00:23 -07:00 |
|
|
e3a0e43d7f
|
[bugfix] Fix auto thread-binding when world_size > 1 in CPU backend and refactor code (#21032)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-19 05:13:55 -07:00 |
|
|
4de7146351
|
[V0 deprecation] Remove V0 HPU backend (#21131)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-07-17 16:37:36 -07:00 |
|
|
4ef00b5cac
|
[VLM] Add Nemotron-Nano-VL-8B-V1 support (#20349)
Signed-off-by: Kyle Huang <kylhuang@nvidia.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-07-17 03:07:55 -07:00 |
|
|
58760e12b1
|
[TPU] Start using python 3.12 (#21000)
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
|
2025-07-16 19:37:44 -07:00 |
|
|
4e7dfbe7b4
|
Update PyTorch to torch==2.7.1 for CUDA (#21011)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-17 02:30:44 +00:00 |
|
|
b5c3b68359
|
[Misc] bump xgrammar version to v0.1.21 (#20992)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-07-15 19:42:16 -07:00 |
|
|
b637e9dcb8
|
Add full serve CLI reference back to docs (#20978)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-15 17:42:30 +00:00 |
|
|
e7e3e6d263
|
Voxtral (#20970)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-07-15 07:35:30 -07:00 |
|