bugfix for mooncake (#3535)

### What this PR does / why we need it?
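Bugfix for the Mooncake connector: remove a redundant check. `MooncakeConnectorWorker` no longer tests `self.use_sparse` when deciding whether to set `self.num_need_pulls = 1`; only `is_deepseek_mla` is checked.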
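
The effect of dropping the `self.use_sparse` test from the condition can be sketched as a small truth-table comparison. This is only an illustration: `takes_single_pull_branch` and its `patched` flag are hypothetical names, not part of the connector, and the sketch models only which branch is taken, not what the `else` branch computes.

```python
from itertools import product


def takes_single_pull_branch(is_deepseek_mla: bool, use_sparse: bool, *, patched: bool) -> bool:
    """Hypothetical helper: True when the `num_need_pulls = 1` branch is taken."""
    if patched:
        # Condition after this commit: only the MLA flag matters.
        return is_deepseek_mla
    # Condition before this commit: the sparse flag also selected this branch.
    return is_deepseek_mla or use_sparse


# Collect the (is_deepseek_mla, use_sparse) combinations whose branch changes.
changed = [
    (mla, sparse)
    for mla, sparse in product([False, True], repeat=2)
    if takes_single_pull_branch(mla, sparse, patched=False)
    != takes_single_pull_branch(mla, sparse, patched=True)
]
print(changed)  # [(False, True)]
```

Under these assumptions, the only configuration whose behavior changes is a non-MLA model with `use_sparse` set: it now falls through to the `else` branch.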

### How was this patch tested?
Covered by CI.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: zouyida2052 <zouyida2002@gmail.com>
zouyida2052
2025-10-19 17:06:05 +08:00
committed by GitHub
parent 1e78ecbad6
commit 58a37ce189

@@ -943,7 +943,7 @@ class MooncakeConnectorWorker:
         # kv_transfer variables
         self.vllm_config = vllm_config
         self.block_size = vllm_config.cache_config.block_size
-        if self.vllm_config.model_config.is_deepseek_mla or self.use_sparse:
+        if self.vllm_config.model_config.is_deepseek_mla:
             self.num_need_pulls = 1
         else:
             num_d_block_heads = max(1,