Is it normal that Qwen/Qwen2.5-VL-7B-Instruct takes nearly 24 GB of GPU memory? #16133
Closed · ChenZhongPu announced in Q&A · Replies: 1 comment, 1 reply
-
You should factor in the KV cache.
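The bf16 weights alone are on the order of 14–16 GB, but by default vLLM also pre-allocates most of the remaining GPU memory for the KV cache: `gpu_memory_utilization` defaults to 0.9, so on a 24 GB card the process will reserve close to 24 GB regardless of the weight size. A minimal sketch of how you could shrink that reservation (the flag values below are illustrative, not tuned recommendations):

```bash
# Cap the fraction of GPU memory vLLM reserves and limit the maximum
# context length, so less space is set aside for the KV cache.
vllm serve Qwen/Qwen2.5-VL-7B-Instruct \
  --gpu-memory-utilization 0.7 \
  --max-model-len 8192
```

Lowering `--gpu-memory-utilization` caps how much memory vLLM grabs up front, and a smaller `--max-model-len` reduces the KV-cache space needed per sequence; the trade-off is fewer concurrent requests and shorter contexts.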
-
I used the command `vllm serve Qwen/Qwen2.5-VL-7B-Instruct`, whose config has `"torch_dtype": "bfloat16"`. But it takes nearly 24 GB of GPU memory. Is that normal? In my understanding, it should only take about 14 GB (roughly 7B parameters × 2 bytes each in bf16).