[VLM] Qwen2.5-VL #12604
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
I have updated it for you.
I will verify LoRA ASAP.
This pull request has merge conflicts that must be resolved before it can be merged.
I have built vLLM from this branch from source, and I get the following error:

I am serving the model as follows:

```
vllm serve Qwen/Qwen2.5-VL-72B-Instruct --quantization bitsandbytes --load-format bitsandbytes --pipeline_parallel_size 2 --max_model_len 10000
```
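(For anyone querying a server started like the above: a minimal client sketch against vLLM's OpenAI-compatible endpoint. The `localhost:8000` base URL is the default, and the image URL is a hypothetical placeholder, not taken from this thread.)

```python
# Sketch: send one image + text request to a running `vllm serve` endpoint.
from openai import OpenAI

# Assumes the server is listening on the default host/port.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",  # must match the served model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            # Hypothetical image URL used purely for illustration.
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
        ],
    }],
    max_tokens=128,
)
print(response.choices[0].message.content)
```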
You need to install the latest code of `transformers` from source.
After doing this, is the branch already usable?
I am on
I was able to make it work in a fresh EC2 instance with NVIDIA drivers with the following:
@jjovalle99 Thanks for testing this branch! I also strongly encourage you to try out our V1 re-arch (by simply specifying `VLLM_USE_V1=1`).
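(A minimal sketch of what that opt-in can look like from Python, assuming the `VLLM_USE_V1` environment variable is read when vLLM is imported; the 7B checkpoint is a stand-in chosen for illustration.)

```python
import os

# Assumption: opting into the vLLM V1 engine via the VLLM_USE_V1 env var,
# which must be set before vllm is imported.
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

# Hypothetical smaller checkpoint, used here purely for illustration.
llm = LLM(model="Qwen/Qwen2.5-VL-7B-Instruct")
```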
There are still some minor issues with the LoRA part, which can be resolved in a separate PR later.
@jeejeelee Are these the kind of issues that prevent LoRA from working?
Going to merge main again and kick off a fresh CI run to make sure everything looks good; then we should be able to merge this PR!
Will update the doc to indicate this.
I can run inference with both of the following.

Usage 1: command line

```
vllm serve Qwen/Qwen2.5-VL-72B-Instruct --port 8000 --host 0.0.0.0 --dtype bfloat16 --tensor-parallel-size 4
```

Usage 2: pure Python function call

```python
llm = LLM(
    model=model_dir,
    limit_mm_per_prompt={"image": 10, "video": 10},
    tensor_parallel_size=4,
)
```
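(Building on Usage 2, a minimal sketch of one image request through the offline `LLM` API. The prompt template follows the Qwen2-VL-style vision placeholder tokens, and the checkpoint and image path are assumptions for illustration, not taken from this thread.)

```python
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # hypothetical stand-in checkpoint
    limit_mm_per_prompt={"image": 10, "video": 10},
)

# Chat-style prompt with a single image placeholder (assumed Qwen2-VL format).
prompt = (
    "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
)

image = Image.open("example.jpg")  # hypothetical local file
outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```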
@PkuDavidGuan Can you share the error message you get? Edit: FWIW, I was able to run
I am not able to run this on my machine (A100-80G):

```
paperspace@psv5mz18va6k:~/venv$ uv venv --python 3.12.8 qwen25
paperspace@psv5mz18va6k:~/venv$ source qwen25/bin/activate
(qwen25) paperspace@psv5mz18va6k:~/venv$ uv pip install "git+https://github.com/huggingface/transformers"
(qwen25) paperspace@psv5mz18va6k:~/venv$ uv pip install "git+https://github.com/ywang96/vllm@qwen2_5_vl"
error: The build backend returned an error
[stderr]
hint: This usually indicates a problem with the package or the build environment.
```

I am able to run inference with Qwen2.5-VL 7B using HF, but not with the vLLM branch above.
@rstone3017 I don't think your issue is with this PR; rather, something seems wrong with your local environment in particular.
FIXES: #12486, #12532
TODO:
To run this model before the transformers 4.49 release, install transformers from source:

```
pip install git+https://github.com/huggingface/transformers
```
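(A quick sanity check, as a sketch: confirm the source build actually ships the Qwen2.5-VL model class. The class name is taken from the transformers codebase, not from this thread.)

```python
# Sketch: verify the installed transformers build includes Qwen2.5-VL support.
import transformers

print(transformers.__version__)  # a source install should report a 4.49 dev version

# An ImportError here means the installed build predates Qwen2.5-VL support.
from transformers import Qwen2_5_VLForConditionalGeneration
```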
Co-authored-by: @yixqiao (UC Berkeley) and @wulipc (Qwen Team)