
OOM error with Hunyuan Video GGUF #187

Open
JorgeR81 opened this issue Dec 22, 2024 · 5 comments

JorgeR81 commented Dec 22, 2024

I get an OOM error when trying to run Hunyuan Video GGUF Q6_K:

torch.cuda.OutOfMemoryError: Allocation on device

Got an OOM, unloading all loaded models.

I tried the ComfyUI example workflow, with the llava FP8 encoder.
I just replaced the native loader with the GGUF loader. 

I haven't tried the regular Hunyuan Video model. 

I have an old system, but I can run LTX-Video and Flux, even the full-size models.


city96 (Owner) commented Dec 22, 2024

What resolution/frame count? Might just be flat-out OOMing on runtime costs alone, considering it says it has 7GB free (so barely any part of the model is loaded; most of it is probably in lowvram mode).

Could try playing with the reserved VRAM amount (launch flag) or closing VRAM-intensive stuff / switching the desktop to the iGPU, though idk what res/frames you can get on an 8GB card; it's borderline unusable for me on a 10GB one lol
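
For context, a minimal sketch of the launch flags being discussed here, assuming a standard ComfyUI install (the 1.0 GB value is purely illustrative):

```
# Illustrative only: reserve ~1 GB of VRAM for the desktop and other
# applications, and force ComfyUI's low-VRAM mode, which streams model
# weights from system RAM as needed.
python main.py --reserve-vram 1.0 --lowvram
```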

JorgeR81 (Author) commented Dec 22, 2024

I tried the default settings in the example workflow.
Reserve VRAM didn't help.

I'm downloading the Q4_K_M versions of Hunyuan Video and the llava encoder, to see if it works.

LTX-Video worked so well that I thought I could give this a shot :)



city96 (Owner) commented Dec 22, 2024

Pushed a commit that might fix it. Keyword might lol.

JorgeR81 (Author) commented Dec 22, 2024

Did some more testing:

  • Hunyuan Q4_K_M
  • llava Q4_K_M

It OOMs right away while loading the llava model, at the CLIPTextEncode stage, even before loading the Hunyuan model.
This happened before and after the latest updates.



I also tried:

  • Hunyuan Q4_K_M
  • llava FP8

It loads all the models and starts the generation process.
But it stalls on the first step for a while, until it goes OOM.
Reserve VRAM also didn't help.


So maybe it's just too much for my old GPU.

But there may also be a separate issue with the GGUF llava encoder.
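
As an aside (not something suggested in this thread): OOMs that hit while the device still reports free VRAM can be fragmentation-related, and PyTorch's allocator config sometimes helps. A hedged sketch, assuming PyTorch 2.x:

```
# General PyTorch 2.x allocator option, not specific to ComfyUI-GGUF:
# expandable segments can reduce fragmentation-driven CUDA OOMs.
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python main.py --lowvram
```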


doogyhatts commented Jan 17, 2025

> I tried the default settings in the example workflow.

I am using the GGUF Q8 model, 640x480 resolution, frame count 65, tile_size 128, overlap 32, fps 16.
I added the FastVideo LoRA and sage attention.
I also have only 8 GB VRAM, so I turned on the system memory fallback.
