Commit 37e3806

[Bugfix] Make Gemma3 MM V0 only for now (#14971)

Signed-off-by: Roger Wang <[email protected]>

1 parent c0efdd6

2 files changed: +6 −3 lines changed

docs/source/models/supported_models.md

+4 −1
```diff
@@ -763,7 +763,7 @@ See [this page](#generative-models) for more information on how to use generativ
   * `google/gemma-3-4b-it`, `google/gemma-3-27b-it`, etc.
   * ✅︎
   * ✅︎
-  * ⚠️
+  *
 - * `GLM4VForCausalLM`<sup>^</sup>
   * GLM-4V
   * T + I
@@ -948,8 +948,11 @@ V1 currently uses a simplified attention pattern:
 - Uses causal attention for all tokens, including image tokens
 - Generates reasonable outputs but does not match the original model's attention for text + image inputs
 - Will be updated in the future to support the correct behavior
+- Does not support `"do_pan_and_scan": True`
 
 This limitation exists because the model's mixed attention pattern (bidirectional for images, causal otherwise) is not yet supported by vLLM's attention backends.
+
+For these reasons, `Gemma3ForConditionalGeneration` is supported only on V0 at the moment.
 :::
 
 :::{note}
```
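The newly documented `"do_pan_and_scan": True` limitation only constrains V1; on V0 the flag can still be forwarded to the Hugging Face processor through `mm_processor_kwargs`. A minimal sketch under those assumptions — the image path and prompt layout are illustrative placeholders, and V0 must be the active engine (which this commit enforces for Gemma3):

```python
# Sketch: Gemma3 multimodal inference with pan-and-scan preprocessing.
# Assumes the V0 engine is in use (this commit marks the model
# SupportsV0Only, so vLLM should route it to V0).
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-3-4b-it",
    # Forwarded to the HF processor; per the docs change above,
    # this flag is not supported on V1.
    mm_processor_kwargs={"do_pan_and_scan": True},
)

image = Image.open("example.jpg")  # placeholder path for any local image

outputs = llm.generate(
    {
        # Illustrative prompt; check the model's chat template for the
        # exact image-token layout rather than relying on this string.
        "prompt": "<start_of_image>Describe this image.",
        "multi_modal_data": {"image": image},
    },
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```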

vllm/model_executor/models/gemma3_mm.py

+2 −2
```diff
@@ -25,7 +25,7 @@
 from vllm.sequence import IntermediateTensors
 
 from .interfaces import (MultiModalEmbeddings, SupportsLoRA,
-                         SupportsMultiModal, SupportsPP)
+                         SupportsMultiModal, SupportsPP, SupportsV0Only)
 from .siglip import SiglipVisionModel
 from .utils import (AutoWeightsLoader, flatten_bn, init_vllm_registered_model,
                     maybe_prefix, merge_multimodal_embeddings)
@@ -374,7 +374,7 @@ def forward(self, vision_outputs: torch.Tensor):
                                         info=Gemma3ProcessingInfo,
                                         dummy_inputs=Gemma3DummyInputsBuilder)
 class Gemma3ForConditionalGeneration(nn.Module, SupportsMultiModal, SupportsPP,
-                                     SupportsLoRA):
+                                     SupportsLoRA, SupportsV0Only):
     packed_modules_mapping = {
         "qkv_proj": [
             "q_proj",
```
