mirror of
https://github.com/vllm-project/vllm.git
synced 2025-10-20 14:53:52 +08:00
[Doc] Update notes for H2O-VL and Gemma3 (#17219)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@@ -1118,11 +1118,6 @@ See [this page](#generative-models) for more information on how to use generative models.
<sup>E</sup> Pre-computed embeddings can be inputted for this modality.
<sup>+</sup> Multiple items can be inputted per text prompt for this modality.
:::{important}
Pan-and-scan image pre-processing is currently supported on V0 (but not V1).
You can enable it by passing `--mm-processor-kwargs '{"do_pan_and_scan": true}'`.
:::
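As a sketch of the flag above, pan-and-scan could be enabled at serve time by passing the processor option as JSON. The model name and the `VLLM_USE_V1=0` environment variable are assumptions for illustration, not part of this diff:

```shell
# Hypothetical invocation: select the V0 engine and enable Gemma3
# pan-and-scan pre-processing via JSON-encoded processor kwargs.
VLLM_USE_V1=0 vllm serve google/gemma-3-4b-it \
    --mm-processor-kwargs '{"do_pan_and_scan": true}'
```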
:::{warning}
Both V0 and V1 support `Gemma3ForConditionalGeneration` for text-only inputs.
However, there are differences in how they handle text + image inputs:
@@ -1142,7 +1137,7 @@ This limitation exists because the model's mixed attention pattern (bidirectional
:::
:::{note}
-`h2oai/h2ovl-mississippi-2b` will be available in V1 once we support backends other than FlashAttention.
+`h2oai/h2ovl-mississippi-2b` will be available in V1 once we support head size 80.
:::
:::{note}