Files
vllm/docs/features
2025-10-02 10:35:51 -07:00
..

Features

Compatibility Matrix

The tables below show mutually exclusive features and the support on some hardware.

The symbols used have the following meanings:

  • = Full compatibility
  • 🟠 = Partial compatibility
  • = No compatibility
  • = Unknown or TBD

!!! note Check the or 🟠 with links to see tracking issue for unsupported feature/hardware combination.

Feature x Feature

<style> td:not(:first-child) { text-align: center !important; } td { padding: 0.5rem !important; white-space: nowrap; } th { padding: 0.5rem !important; min-width: 0 !important; } th:not(:first-child) { writing-mode: vertical-lr; transform: rotate(180deg) } </style>
Feature [CP][chunked-prefill] APC LoRA SD CUDA graph pooling enc-dec logP prmpt logP async output multi-step mm best-of beam-search prompt-embeds
[CP][chunked-prefill]
APC
LoRA
SD
CUDA graph
pooling 🟠* 🟠*
enc-dec
logP
prmpt logP
async output
multi-step
mm 🟠^
best-of
beam-search
prompt-embeds ? ? ? ? ?

* Chunked prefill and prefix caching are only applicable to last-token pooling.
^ LoRA is only applicable to the language backbone of multimodal models.

{ #feature-x-hardware }

Feature x Hardware

Feature Volta Turing Ampere Ada Hopper CPU AMD TPU
[CP][chunked-prefill]
APC
LoRA
SD
CUDA graph
pooling
enc-dec
mm
logP
prmpt logP
async output
multi-step
best-of
beam-search
prompt-embeds ?