
HIP out of memory #208

Open
Frozen-byte opened this issue Jan 22, 2025 · 4 comments
@Frozen-byte

Trying to run flux1-dev-Q8_0.gguf on my 6900 XT with 16 GB of VRAM.
This should work in theory, right? But somehow I run out of memory.
Do I have a false assumption here, or is something wrong with my workflow?
oom_workflow.json

HIP out of memory. Tried to allocate 18.00 MiB. GPU 0 has a total capacity of 15.98 GiB of which 636.00 MiB is free. Of the allocated memory 14.77 GiB is allocated by PyTorch, and 282.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
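The last sentence of that error points at an allocator setting; it is an environment variable, so it has to be in place before PyTorch initializes (for ComfyUI, export it in the shell before running main.py). A minimal sketch, not part of the original report, for setting it and reading back the same numbers the error reports:

```python
# Minimal sketch (not from the original report): apply the allocator hint from
# the error message *before* torch is imported, then inspect VRAM usage.
import os
os.environ["PYTORCH_HIP_ALLOC_CONF"] = "expandable_segments:True"

import torch

if torch.cuda.is_available():  # ROCm GPUs appear under the cuda namespace
    free_b, total_b = torch.cuda.mem_get_info(0)
    print(f"free      : {free_b / 2**30:.2f} GiB")
    print(f"total     : {total_b / 2**30:.2f} GiB")
    print(f"allocated : {torch.cuda.memory_allocated(0) / 2**30:.2f} GiB")
    print(f"reserved  : {torch.cuda.memory_reserved(0) / 2**30:.2f} GiB")
```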

System Information

  • ComfyUI Version: 0.3.12
  • Arguments: main.py --use-pytorch-cross-attention --gpu-only
  • OS: posix
  • Python Version: 3.12.3 (main, Nov 6 2024, 18:32:19) [GCC 13.2.0]
  • Embedded Python: false
  • PyTorch Version: 2.7.0.dev20250112+rocm6.3

Devices

  • Name: cuda:0 AMD Radeon RX 6900 XT : native
    • Type: cuda
    • VRAM Total: 17163091968
    • VRAM Free: 959880192
    • Torch VRAM Total: 16152264704
    • Torch VRAM Free: 292985856
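For readability, the raw byte counts above convert to the units used in the error message; a small sketch of the arithmetic (values copied from the report):

```python
# Convert the raw byte counts from the Devices report above to GiB / MiB.
GIB, MIB = 2**30, 2**20

vram_total     = 17163091968  # "VRAM Total"       -> ~15.98 GiB, matching the error message
vram_free      = 959880192    # "VRAM Free"        -> ~915 MiB at the time of this snapshot
torch_reserved = 16152264704  # "Torch VRAM Total" -> ~15.04 GiB
torch_free     = 292985856    # "Torch VRAM Free"  -> ~279 MiB

print(f"VRAM total     : {vram_total / GIB:.2f} GiB")
print(f"VRAM free      : {vram_free / MIB:.0f} MiB")
print(f"Torch reserved : {torch_reserved / GIB:.2f} GiB")
print(f"Torch free     : {torch_free / MIB:.0f} MiB")
```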
@al-swaiti commented Feb 11, 2025

Remove --gpu-only from your command.

@Frozen-byte (Author)

> Remove --gpu-only from your command.

This does work, but it slows down generation significantly and is not the point of my question.

I also tried loading the CLIP encoder into RAM to give the flux model all of the available VRAM, but without success; I still get OOM errors.

@al-swaiti commented Feb 11, 2025

How did you compare the speed? In any case, if you insist on using --gpu-only, your card must be able to hold all of the models you load at once.
Normally a model is loaded onto your GPU when it is smaller than your GPU's memory; when it is larger, ComfyUI falls back to CPU offload, which starts using system RAM.
You wrote, "I also tried loading the CLIP encoder into RAM to give the flux model all of the available VRAM, but without success; I still get OOM errors." How did you do that with --gpu-only?

The GPU runs the text encoders and converts the prompt to embeddings, the embeddings go to RAM, the GPU unloads the text encoders, and only then is the flux model loaded.
Give it a try with a small prompt.
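In plain PyTorch terms, the sequence described above looks roughly like the following. This is an illustrative sketch, not ComfyUI's actual code; the tiny nn.Linear modules are hypothetical stand-ins for the real text encoders and flux model so the ordering is runnable end to end.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-ins (hypothetical, for illustration only) for the CLIP/T5 text
# encoders and the tokenized prompt.
text_encoder = nn.Linear(77, 4096).to(device)
tokens = torch.randn(1, 77, device=device)

# 1. The text encoders run on the GPU and turn the prompt into embeddings.
with torch.no_grad():
    embeddings = text_encoder(tokens)

# 2. The embeddings move to system RAM and the encoders are unloaded,
#    freeing their VRAM.
embeddings = embeddings.cpu()
del text_encoder
if device == "cuda":
    torch.cuda.empty_cache()

# 3. Only then is the (much larger) flux model loaded onto the GPU and
#    given the embeddings.
flux_model = nn.Linear(4096, 4096).to(device)  # stand-in for the real model
with torch.no_grad():
    out = flux_model(embeddings.to(device))
print(out.shape)
```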

@Frozen-byte (Author) commented Feb 12, 2025

The Q8_0 model (needs 16 GB) or at least the Q6 model (needs 12 GB) should fit into 16 GB of VRAM, so why do I get OOM errors?
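Following the earlier point that with --gpu-only everything has to fit at once, a rough budget sketch helps explain the gap. All of the sizes below are assumptions for illustration, not measurements from this system.

```python
# Rough VRAM budget under --gpu-only. Every size here is an assumed,
# approximate figure for illustration, not a measurement from this system.
budget_gib = {
    "flux1-dev Q8_0 weights": 13.0,          # assumed quantized model size
    "text encoders (CLIP-L + T5-XXL)": 5.0,  # assumed; larger if kept in fp16
    "VAE": 0.2,                              # assumed
    "activations / workspace": 1.5,          # assumed; grows with resolution
}

total = sum(budget_gib.values())
print(f"estimated need: {total:.1f} GiB vs. 16.0 GiB of VRAM")
# Even if the quantized weights alone fit, forcing every component onto the
# GPU leaves little headroom, which matches a ~18 MiB allocation failing.
```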
