vllm-ascend/vllm_ascend
rjg-lyh 47eaf622fe [v0.9.1][bugfix] disable the chunked prefill feature in Non-MLA LLMs (#2659)
### What this PR does / why we need it?
This PR forcibly disables the chunked prefill feature in Non-MLA models,
because the operators that support it currently perform poorly.
In addition, in engine v1 mode the ascend scheduler is forcibly enabled,
and any `enable_chunked_prefill` value the user sets in
`additional_config` is overridden to disabled.
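
The override described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `SketchConfig` and `apply_non_mla_overrides` are hypothetical names, and the `ascend_scheduler_config` key layout inside `additional_config` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the real config objects."""
    use_mla: bool = False        # True for MLA models
    use_v1_engine: bool = True   # True when running in engine v1 mode
    additional_config: Dict[str, Any] = field(default_factory=dict)


def apply_non_mla_overrides(cfg: SketchConfig) -> SketchConfig:
    """Force the ascend scheduler on and chunked prefill off for Non-MLA models."""
    if cfg.use_mla or not cfg.use_v1_engine:
        # Cases outside "Non-MLA + engine v1" are left untouched in this sketch.
        return cfg
    # Assumed key layout; the real structure may differ.
    sched = cfg.additional_config.setdefault("ascend_scheduler_config", {})
    sched["enabled"] = True                  # scheduler is forcibly enabled
    sched["enable_chunked_prefill"] = False  # user-supplied value is overridden
    return cfg
```

With this sketch, a Non-MLA config whose `additional_config` requests `enable_chunked_prefill: True` comes back with the flag set to `False` and the scheduler enabled, while an MLA config is returned unchanged.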

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with newly added and existing tests.

Signed-off-by: rjg-lyh <1318825571@qq.com>
2025-09-03 15:27:43 +08:00