
[pull] master from comfyanonymous:master#298

Merged
pull[bot] merged 2 commits into code:master from Comfy-Org:master
Sep 17, 2025


pull bot commented Sep 17, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


rattus128 and others added 2 commits September 16, 2025 19:21
* flux: Do the xq and xk ropes one at a time

This was doing independent, interleaved tensor math on the q and k
tensors, which held more than the minimum set of intermediates in
VRAM. On a bad day, it would run out of VRAM (OOM) on the xk intermediates.

Do everything for q and then everything for k, so torch can garbage
collect all of q's intermediates before k allocates its own.

This reduces peak VRAM usage for some WAN2.2 inferences (at least).

* wan: Optimize qkv intermediates on attention

As commented. The former logic computed independent pieces of QKV in
parallel, which held more inference intermediates in VRAM and spiked
VRAM usage. Fully roping Q and garbage collecting its intermediates
before touching K reduces the peak inference VRAM usage.
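
The ordering change described in both commits can be sketched as follows. This is a minimal PyTorch illustration, not the code from these commits: the function names (`make_freqs`, `rope_one`, `rope_interleaved`, `rope_sequential`), the tensor layout, and the 2x2 rotation-matrix form of the frequencies are all assumptions made for the example.

```python
import torch


def make_freqs(seq_len, dim, theta=10000.0):
    # Hypothetical helper: one 2x2 rotation matrix per (position, channel pair).
    pos = torch.arange(seq_len, dtype=torch.float32)
    omega = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    angles = torch.outer(pos, omega)  # (seq_len, dim // 2)
    cos, sin = angles.cos(), angles.sin()
    # Rows of each matrix are [cos, -sin] and [sin, cos].
    return torch.stack([cos, -sin, sin, cos], dim=-1).reshape(seq_len, dim // 2, 2, 2)


def rope_one(x, freqs_cis):
    # Rotate a single tensor of shape (batch, heads, seq_len, dim).
    # All temporaries are local, so they are released when this returns.
    x_ = x.float().reshape(*x.shape[:-1], -1, 1, 2)
    out = freqs_cis[..., 0] * x_[..., 0] + freqs_cis[..., 1] * x_[..., 1]
    return out.reshape(*x.shape).type_as(x)


def rope_interleaved(xq, xk, freqs_cis):
    # Old-style ordering (assumed): q and k temporaries are alive together.
    # With half-precision inputs, both float32 copies exist at the same time.
    xq_ = xq.float().reshape(*xq.shape[:-1], -1, 1, 2)
    xk_ = xk.float().reshape(*xk.shape[:-1], -1, 1, 2)
    xq_out = freqs_cis[..., 0] * xq_[..., 0] + freqs_cis[..., 1] * xq_[..., 1]
    xk_out = freqs_cis[..., 0] * xk_[..., 0] + freqs_cis[..., 1] * xk_[..., 1]
    return (xq_out.reshape(*xq.shape).type_as(xq),
            xk_out.reshape(*xk.shape).type_as(xk))


def rope_sequential(xq, xk, freqs_cis):
    # New-style ordering: finish q completely, then start k.
    q_out = rope_one(xq, freqs_cis)
    k_out = rope_one(xk, freqs_cis)
    return q_out, k_out


if __name__ == "__main__":
    torch.manual_seed(0)
    q = torch.randn(1, 2, 8, 16, dtype=torch.float16)
    k = torch.randn(1, 2, 8, 16, dtype=torch.float16)
    freqs = make_freqs(seq_len=8, dim=16)
    old_q, old_k = rope_interleaved(q, k, freqs)
    new_q, new_k = rope_sequential(q, k, freqs)
    print(torch.allclose(old_q, new_q), torch.allclose(old_k, new_k))
```

Both orderings return the same values. The sequential form differs only in that q's temporaries go out of scope, and can be returned to the allocator, before k's temporaries are created. On a CUDA device, one way to compare the two would be to call `torch.cuda.reset_peak_memory_stats()` before each variant and read `torch.cuda.max_memory_allocated()` afterwards.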

pull bot locked and limited conversation to collaborators Sep 17, 2025
pull bot added the ⤵️ pull label Sep 17, 2025
pull bot merged commit 9288c78 into code:master Sep 17, 2025