[Model][VLM] Add Qwen2.5-Omni model support (thinker only) #15130
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 …
Sorry I don't have time to review in detail tonight, but from a quick glance, can you add this model to the following pages?
OK, I will add them tomorrow.
@fyabc Qwen/Qwen2.5-Omni-7B?
Sorry for the delay - going to take a look at this PR tonight!
Thank you for the contribution! I have left some comments!
Hi @ywang96 @DarkLight1337, I updated some other examples here; please check the code.
Signed-off-by: Roger Wang <[email protected]>
Looks like this PR doesn't work with huggingface/transformers#36752 yet.
I will take a look at it.
Signed-off-by: fyabc <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Many thanks for making this contribution to vLLM!
I did a few fixes and code changes and confirmed that the examples for this model now work on both V1 and V0 (with use_audio_in_video supported by V0 only), so the only blocker is waiting for huggingface/transformers#36752 to be merged!
Signed-off-by: Roger Wang <[email protected]>
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: fyabc <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
```python
# Fuse the separate q/k/v projections into a single column-parallel linear layer.
self.qkv = MergedColumnParallelLinear(
    input_size=embed_dim,
    output_sizes=[projection_size] * 3,
    bias=True,
    quant_config=quant_config,
    prefix=f"{prefix}.qkv",
)
```
After some investigation, it was discovered that this change actually introduced a regression for Qwen2.5-VL inference, so I'm blocking this until we resolve the issue.
I found that it works well when the tensor parallel size (tp) is 1, but the results are not quite right when tp > 1. I am currently investigating further.
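To illustrate the kind of failure mode being discussed, here is a toy sketch (not vLLM internals; all names are hypothetical) of why a merged QKV weight must be sharded per projection under tensor parallelism. If a loader splits the concatenated [q; k; v] weight as one flat block, each rank ends up with rows from the wrong projections, and the bug only surfaces when tp > 1:

```python
# Toy sketch (not vLLM internals): sharding a merged QKV weight across
# tensor-parallel ranks. All function names here are hypothetical.
import numpy as np

def shard_per_projection(weight, output_sizes, tp_size, rank):
    """Correct: split each projection's rows across ranks separately."""
    shards, offset = [], 0
    for size in output_sizes:
        chunk = weight[offset:offset + size]      # rows of one projection
        per_rank = size // tp_size
        shards.append(chunk[rank * per_rank:(rank + 1) * per_rank])
        offset += size
    return np.concatenate(shards)

def shard_flat(weight, tp_size, rank):
    """Buggy: split the concatenated [q; k; v] weight as one flat block."""
    per_rank = weight.shape[0] // tp_size
    return weight[rank * per_rank:(rank + 1) * per_rank]

embed_dim = projection_size = 4
tp_size = 2
qkv_weight = np.arange(3 * projection_size * embed_dim, dtype=np.float32)
qkv_weight = qkv_weight.reshape(3 * projection_size, embed_dim)

for rank in range(tp_size):
    correct = shard_per_projection(qkv_weight, [projection_size] * 3, tp_size, rank)
    flat = shard_flat(qkv_weight, tp_size, rank)
    # With tp_size == 1 both layouts coincide; with tp_size > 1 they diverge.
    print(f"rank {rank}: layouts match = {np.array_equal(correct, flat)}")
```

This only shows why correctness can depend on tp; the actual cause of the Qwen2.5-VL regression may be different.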
This PR adds support for the Qwen2.5-Omni model (thinker only).
Requirements
This PR requires the corresponding transformers PR (huggingface/transformers#36752).
Note: you need to install transformers from source from that branch.
Example Usage
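A minimal sketch of what offline inference for the thinker might look like with vLLM's LLM API. The checkpoint name Qwen/Qwen2.5-Omni-7B is taken from the discussion above; the prompt template is borrowed from the Qwen2-VL convention and is an assumption here, as is passing the image via multi_modal_data. See the PR's shipped examples for the authoritative version (including the use_audio_in_video option mentioned above):

```python
# Hedged sketch of offline inference with the thinker. The prompt string is
# illustrative only; the real template comes from the model's chat template.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(model="Qwen/Qwen2.5-Omni-7B")

# Assumed Qwen2-VL-style prompt with an image placeholder.
prompt = (
    "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
)

image = Image.open("example.jpg")  # hypothetical local image file

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```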
Notes
The whole Qwen2.5-Omni model includes three parts:
- `thinker`: multimodal inputs -> text responses & hidden states
- `talker`: text responses & hidden states from the thinker -> speech codes
- `code2wav` (streaming codec decoder): speech codes -> speech

This PR only implements the `thinker` part for now; it accepts multimodal inputs (images / videos / audios) and generates text responses, similar to other common VLMs. We have also developed an end-to-end implementation (to be released soon), but due to its significant impact on the vLLM framework architecture, we will not create the related pull request for now.
FIX #15563