93 Commits

Author SHA1 Message Date
d31f7844f8 [Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-19 05:20:55 -07:00
8a297115e2 [Chore] Separate out hashing utilities from vllm.utils (#27151)
Signed-off-by: dongbo910220 <1275604947@qq.com>
2025-10-19 11:09:38 +08:00
30a33b92ee [Misc] Rev DeepEP (#27122)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-10-18 14:54:29 +08:00
6c9fdbf725 [Docs] Replace rst style double-backtick with md single-backtick (#27091)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:47:34 -07:00
71557a5f7c [CI] Fix mypy for vllm/executor (#26845)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-10-15 01:23:33 -07:00
314285d4f2 [CI] Fix mypy for vllm/distributed (#26593)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-13 16:02:24 -04:00
7ef6052804 [CI/Build] Add tool to build vllm-tpu wheel (#19165)
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-10-12 16:25:40 -06:00
8fcaaf6a16 Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
27ed39a347 [XPU] Upgrade NIXL to remove CUDA dependency (#26570)
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
2025-10-11 05:15:23 +00:00
eef921f45e AOT Compilation for torch.compile (Bundled) (#24274)
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
2025-10-10 19:02:11 -04:00
e09d1753ec Remove Python 3.9 support ahead of PyTorch 2.9 in v0.11.1 (#26416)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-08 10:40:42 -07:00
c1b06fc182 [CI Failure] Fix pre-commit issue for install_nixl_from_source_ubuntu.py (#26424)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-10-08 07:55:43 -07:00
9fc983c707 [NIXL][non-cuda] Add install script for nixl with non-cuda ucx (#25959)
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
2025-10-08 14:19:53 +00:00
b893d661b1 Fix per file ruff ignores related to simplification (#26259)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 20:31:53 +00:00
d6953beb91 Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
96ebcaa3ad [Misc] Make EP kernels install script support uv (#25785)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-09-30 23:38:34 +00:00
6a113d9aed [V0 Deprecation] Remove vllm.worker and update according imports (#25901)
2025-09-29 23:26:11 +00:00
9360d34fa1 update to latest deepgemm for dsv3.2 (#25871)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-09-29 17:51:43 +08:00
92da847cf5 Add flashinfer-build.sh and register precompiled cu128 wheel in Dockerfile (#25782)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-26 18:54:09 -07:00
3d2c56b7a9 Make mypy behave like a proper pre-commit hook (#25313)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-22 12:23:45 +00:00
64c824cd78 Make pickle import check fast (#25379)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-22 04:08:25 -07:00
3ed1ec4af2 Fix validate-config pre-commit check (#25157)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-18 12:06:28 +00:00
3059b9cc6b [Doc] Add --force-overwrite option to generate_cmake_presets.py (#24375)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2025-09-16 18:45:29 -07:00
64ad551878 Removes source compilation of nixl dependency (#24874)
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Daniele <36171005+dtrifiro@users.noreply.github.com>
2025-09-17 01:33:18 +00:00
a5b84f1cbf [Core] Shared memory based object store for Multimodal data caching and IPC (#20452)
Signed-off-by: donglu <donglu@cohere.com>
2025-09-12 07:54:17 -07:00
1aa427fdc1 [Kernels] Add Flash Linear Attention Kernels (#24518)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-09-10 00:04:41 +08:00
02d411fdb2 [Doc]: fix typos in Python comments (#24115)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-09-02 21:14:07 -07:00
42dc59dbac Update release pipeline post PyTorch 2.8.0 update (#24073)
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Huy Do <huydhn@gmail.com>
2025-09-03 10:09:19 +08:00
dc1a53186d [Kernel] Update DeepGEMM to latest commit (#23915)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-09-01 02:38:04 -07:00
906e461ed6 [CI Fix] Pin deepep and pplx tags in tools/ep_kernels/, gate multigpu tests (#23568)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-25 18:29:00 -07:00
f6818a92cb [UX] Move Dockerfile DeepGEMM install to tools/install_deepgemm.sh (#23360)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-22 20:52:50 -06:00
0313cf854d [PERF] PyTorch Symmetric Memory All-Reduce (#20759)
Signed-off-by: ilmarkov <imarkov@redhat.com>
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: ilmarkov <imarkov@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-08-22 15:39:08 -06:00
01a08739e0 [misc] split engine_model into json file for nsys profile tool (#23117)
Signed-off-by: Grace Ho <grho@nvidia.com>
Signed-off-by: Grace Ho <146482179+gracehonv@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-19 15:44:53 +08:00
1f83e7d849 [misc] nsys profile output kernel classifier and visualizer (#22971)
Signed-off-by: Grace Ho <grho@nvidia.com>
2025-08-16 02:52:51 +00:00
ebf7605b0d [Misc] Move tensor schema tests (#22612)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-11 00:15:27 -07:00
59a0b8554b [bugfix] fix blackwell deepep installation (#22255)
2025-08-06 01:26:09 +08:00
f12d9256b3 [Misc] Use dracut on CentOS and skip clone if repo exists for EP kernel installation (#21635)
Signed-off-by: Ming Yang <minos.future@gmail.com>
2025-07-30 13:15:06 -07:00
ba5c5e5404 [Docs] Switch to better markdown linting pre-commit hook (#21851)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 19:45:08 -07:00
37efc63b64 [V0 deprecation] Guided decoding (#21347)
Signed-off-by: Reza Barazesh <rezabarazesh@meta.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 03:15:30 -07:00
82ec66f514 [V0 Deprecation] Remove Prompt Adapters (#20588)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-23 16:36:48 -07:00
dd572c0ab3 [V0 Deprecation] Remove V0 Spec Decode workers (#21152)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-18 21:47:50 -07:00
217937221b Elastic Expert Parallel Initial Support (#20775)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2025-07-18 17:46:09 -07:00
1caca5a589 [Misc] Add SPDX-FileCopyrightText (#20428)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-04 07:40:42 +00:00
e81fbefe8a [Refactor] Refactor import utils (#20269)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-07-01 18:05:42 -07:00
c05596f1a3 [Perf] Validate @config in pre-commit instead of dynamically (#20200)
Signed-off-by: Lionel Villard <villard@us.ibm.com>
2025-07-01 05:10:28 -04:00
8b64c895c0 [CI] Sync test dependency with test.in for torch nightly (#19632)
Signed-off-by: Yang Wang <elainewy@meta.com>
Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Concurrensee <yida.wu@amd.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-06-26 20:55:25 -07:00
bf5181583f [Doc] Guide for Incremental Compilation Workflow (#19109)
2025-06-25 22:06:46 +09:00
c4cf260677 [Perf][CLI] Improve overall startup time (#19941)
2025-06-22 23:11:22 +00:00
16c16301c8 Disable "Forbid direct 'import triton'" check for vllm/triton_utils/importing.py in an extensible way (#19783)
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
2025-06-18 15:08:00 -07:00
4f6c42fa0a [Security] Prevent new imports of (cloud)pickle (#18018)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Aaron Pham <Aaronpham0103@gmail.com>
2025-06-12 10:30:17 +00:00