
HIP out of memory #208

Open
Frozen-byte opened this issue Jan 22, 2025 · 4 comments
@Frozen-byte

Trying to run flux1-dev-Q8_0.gguf on my 6900 XT with 16 GB of VRAM.
This should work in theory, right? But somehow I run out of memory.
Do I have a false assumption here, or is something wrong with my workflow?
oom_workflow.json

HIP out of memory. Tried to allocate 18.00 MiB. GPU 0 has a total capacity of 15.98 GiB of which 636.00 MiB is free. Of the allocated memory 14.77 GiB is allocated by PyTorch, and 282.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
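The last sentence of that error points at an allocator setting; it is an environment variable, so it has to be in place before PyTorch initializes (for ComfyUI, export it in the shell before running main.py). A minimal sketch, not part of the original report, for setting it and reading back the same numbers the error reports:

```python
# Minimal sketch (not from the original report): apply the allocator hint from
# the error message *before* torch is imported, then inspect VRAM usage.
import os
os.environ["PYTORCH_HIP_ALLOC_CONF"] = "expandable_segments:True"

import torch

if torch.cuda.is_available():  # ROCm GPUs appear under the cuda namespace
    free_b, total_b = torch.cuda.mem_get_info(0)
    print(f"free      : {free_b / 2**30:.2f} GiB")
    print(f"total     : {total_b / 2**30:.2f} GiB")
    print(f"allocated : {torch.cuda.memory_allocated(0) / 2**30:.2f} GiB")
    print(f"reserved  : {torch.cuda.memory_reserved(0) / 2**30:.2f} GiB")
```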

System Information

  • ComfyUI Version: 0.3.12
  • Arguments: main.py --use-pytorch-cross-attention --gpu-only
  • OS: posix
  • Python Version: 3.12.3 (main, Nov 6 2024, 18:32:19) [GCC 13.2.0]
  • Embedded Python: false
  • PyTorch Version: 2.7.0.dev20250112+rocm6.3

Devices

  • Name: cuda:0 AMD Radeon RX 6900 XT : native
    • Type: cuda
    • VRAM Total: 17163091968
    • VRAM Free: 959880192
    • Torch VRAM Total: 16152264704
    • Torch VRAM Free: 292985856
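For readability, the raw byte counts above convert to the units used in the error message; a small sketch of the arithmetic (values copied from the report):

```python
# Convert the raw byte counts from the Devices report above to GiB / MiB.
GIB, MIB = 2**30, 2**20

vram_total     = 17163091968  # "VRAM Total"       -> ~15.98 GiB, matching the error message
vram_free      = 959880192    # "VRAM Free"        -> ~915 MiB at the time of this snapshot
torch_reserved = 16152264704  # "Torch VRAM Total" -> ~15.04 GiB
torch_free     = 292985856    # "Torch VRAM Free"  -> ~279 MiB

print(f"VRAM total     : {vram_total / GIB:.2f} GiB")
print(f"VRAM free      : {vram_free / MIB:.0f} MiB")
print(f"Torch reserved : {torch_reserved / GIB:.2f} GiB")
print(f"Torch free     : {torch_free / MIB:.0f} MiB")
```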
@al-swaiti commented Feb 11, 2025

Remove --gpu-only from your command.

@Frozen-byte (Author)

> Remove --gpu-only from your command.

This does work, but it slows down generation significantly and is not the point of my question.

I also tried loading the CLIP encoder into RAM to give the flux model all of the available VRAM, but without success; I still get OOM errors.

@al-swaiti commented Feb 11, 2025

How did you compare the speed? In any case, if you insist on using --gpu-only, your card must be able to hold all of the models you load at once.
Normally a model is loaded onto your GPU when it is smaller than your GPU's memory; when it is larger, ComfyUI falls back to CPU offload, which starts using system RAM.
You wrote, "I also tried loading the CLIP encoder into RAM to give the flux model all of the available VRAM, but without success; I still get OOM errors." How did you do that with --gpu-only?

The GPU runs the text encoders and converts the prompt to embeddings, the embeddings go to RAM, the GPU unloads the text encoders, and only then is the flux model loaded.
Give it a try with a small prompt.
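In plain PyTorch terms, the sequence described above looks roughly like the following. This is an illustrative sketch, not ComfyUI's actual code; the tiny nn.Linear modules are hypothetical stand-ins for the real text encoders and flux model so the ordering is runnable end to end.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-ins (hypothetical, for illustration only) for the CLIP/T5 text
# encoders and the tokenized prompt.
text_encoder = nn.Linear(77, 4096).to(device)
tokens = torch.randn(1, 77, device=device)

# 1. The text encoders run on the GPU and turn the prompt into embeddings.
with torch.no_grad():
    embeddings = text_encoder(tokens)

# 2. The embeddings move to system RAM and the encoders are unloaded,
#    freeing their VRAM.
embeddings = embeddings.cpu()
del text_encoder
if device == "cuda":
    torch.cuda.empty_cache()

# 3. Only then is the (much larger) flux model loaded onto the GPU and
#    given the embeddings.
flux_model = nn.Linear(4096, 4096).to(device)  # stand-in for the real model
with torch.no_grad():
    out = flux_model(embeddings.to(device))
print(out.shape)
```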

@Frozen-byte (Author) commented Feb 12, 2025

The Q8_0 model (needs 16 GB) or at least the Q6 model (needs 12 GB) should fit into 16 GB of VRAM, so why do I get OOM errors?
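Following the earlier point that with --gpu-only everything has to fit at once, a rough budget sketch helps explain the gap. All of the sizes below are assumptions for illustration, not measurements from this system.

```python
# Rough VRAM budget under --gpu-only. Every size here is an assumed,
# approximate figure for illustration, not a measurement from this system.
budget_gib = {
    "flux1-dev Q8_0 weights": 13.0,          # assumed quantized model size
    "text encoders (CLIP-L + T5-XXL)": 5.0,  # assumed; larger if kept in fp16
    "VAE": 0.2,                              # assumed
    "activations / workspace": 1.5,          # assumed; grows with resolution
}

total = sum(budget_gib.values())
print(f"estimated need: {total:.1f} GiB vs. 16.0 GiB of VRAM")
# Even if the quantized weights alone fit, forcing every component onto the
# GPU leaves little headroom, which matches a ~18 MiB allocation failing.
```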
