Support to serve vLLM on Kubernetes with LWS (#4829)

Signed-off-by: kerthcet <kerthcet@gmail.com>
Author: Kante Yin
Date: 2024-05-17 07:37:29 +08:00
Committed by: GitHub
Parent: 9a31a817a8
Commit: 8e7fb5d43a
2 changed files with 13 additions and 0 deletions


@@ -0,0 +1,12 @@
.. _deploying_with_lws:

Deploying with LWS
==================

LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
A major use case is for multi-host/multi-node distributed inference.

vLLM can be deployed with `LWS <https://github.com/kubernetes-sigs/lws>`_ on Kubernetes for distributed model serving.

Please see `this guide <https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm>`_ for more details on
deploying vLLM on Kubernetes using LWS.
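
As a rough sketch, a ``LeaderWorkerSet`` resource groups one leader pod with a fixed number of worker pods and scales the group as a unit. The manifest below is a hypothetical, minimal example: the image, model name, startup commands, and GPU counts are placeholders, and the ``LWS_LEADER_ADDRESS`` environment variable is assumed to be injected by LWS. The guide linked above contains the authoritative configuration.

.. code-block:: yaml

    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    metadata:
      name: vllm
    spec:
      replicas: 1                  # number of leader/worker groups (model replicas)
      leaderWorkerTemplate:
        size: 2                    # pods per group: one leader plus one worker
        leaderTemplate:
          spec:
            containers:
              - name: vllm-leader
                image: <your-vllm-image>          # placeholder image
                command: ["/bin/sh", "-c"]
                args:
                  # Placeholder: start the Ray head, then launch the OpenAI-compatible
                  # server with tensor parallelism spanning the leader and worker GPUs.
                  - >-
                    ray start --head --port=6379 &&
                    python -m vllm.entrypoints.openai.api_server
                    --model <model-name> --tensor-parallel-size 2
                resources:
                  limits:
                    nvidia.com/gpu: "1"
        workerTemplate:
          spec:
            containers:
              - name: vllm-worker
                image: <your-vllm-image>          # placeholder image
                command: ["/bin/sh", "-c"]
                args:
                  # Placeholder: join the leader's Ray cluster and block.
                  - >-
                    ray start --address=${LWS_LEADER_ADDRESS}:6379 --block
                resources:
                  limits:
                    nvidia.com/gpu: "1"

Applying such a manifest with ``kubectl apply -f`` and exposing the leader pod through a Service would make the OpenAI-compatible endpoint reachable; the linked guide walks through the exact images, commands, and Service definition.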


@@ -8,4 +8,5 @@ Integrations
deploying_with_kserve
deploying_with_triton
deploying_with_bentoml
deploying_with_lws
serving_with_langchain