
vllm serve takes a very long time to load MiniCPM-V-2_6 with 100% GPU usage #12402

Answered by yp05327
yp05327 asked this question in Q&A

Thanks for your response!
I finally found that some options are required; otherwise you will run into this issue.
The required options are listed in the official documentation: https://modelbest.feishu.cn/wiki/C2BWw4ZP0iCDy7kkCPCcX2BHnOf

After adding --dtype auto --max-model-len 2048, it worked!
I don't know why (maybe one of them, maybe both), but if you need more information, I can help :)
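
For reference, here is a minimal sketch of the full launch command. The model id (`openbmb/MiniCPM-V-2_6`) and the `--trust-remote-code` flag are assumptions not stated in this thread; the two flags that resolved the issue are the ones quoted above.

```bash
# Sketch of the serve command discussed in this thread.
# Assumptions: the Hugging Face model id and --trust-remote-code;
# --dtype auto and --max-model-len 2048 are the flags from the answer.
vllm serve openbmb/MiniCPM-V-2_6 \
    --dtype auto \
    --max-model-len 2048 \
    --trust-remote-code
```

A plausible explanation is that capping --max-model-len reduces the work vLLM does at startup (memory profiling and cache sizing for the full context window), but as noted above it is not confirmed which of the two flags actually mattered.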
