Labels
Potential Bug — User is reporting a bug. This should be tested.
Description
Custom Node Testing
- I have tried disabling custom nodes and the issue persists (see how to disable custom nodes if you need help)
Expected Behavior
See #10302 (comment)
ROCm 6.4 + flash-attn performs significantly better for image upscaling with cuDNN left enabled. This regressed after #10302.
cudnn enabled (previous default): ImageUpscaleWithModel(4x-UltraSharp.pth) 1.84s/it
Actual Behavior
cudnn = False: ImageUpscaleWithModel(4x-UltraSharp.pth) 11.95s/it
Steps to Reproduce
Run an image upscale workflow on ROCm 6.4 with flash-attn, then compare upscale speed with cuDNN enabled vs. disabled.
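For reference, the two code paths above can be reproduced outside ComfyUI with PyTorch's cuDNN backend flag (this is a minimal sketch of the toggle itself, not ComfyUI's internal logic from #10302; on ROCm this flag routes convolutions through MIOpen):

```python
import torch

# Fast path reported under "Expected Behavior": cuDNN/MIOpen kernels enabled.
torch.backends.cudnn.enabled = True
print(torch.backends.cudnn.enabled)   # True

# Regressed path reported under "Actual Behavior": convolutions fall back
# to the slower non-cuDNN implementations.
torch.backends.cudnn.enabled = False
print(torch.backends.cudnn.enabled)   # False
```

Timing an upscale model's forward pass under each setting should reproduce the ~6x gap reported above on the affected hardware.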
Debug Logs
Total VRAM 16368 MB, total RAM 64217 MB
pytorch version: 2.9.0+rocm6.4
AMD arch: gfx1100
ROCm version: (6, 4)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 7900 GRE : native
Using Flash Attention
Python version: 3.13.7 (main, Aug 15 2025, 12:34:02) [GCC 15.2.1 20250813]
ComfyUI version: 0.3.66
ComfyUI frontend version: 1.28.7
Other
No response