-
What you're asking sounds to me like the …
-
We experimented with merging methods for adapters here: https://huggingface.co/blog/peft_merging. I would suggest experimenting with the functions from peft for now and reporting any performance-related improvements you see (if any). At the moment it is not clear to me whether modifications need to be made to either diffusers or peft for this.
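For reference, a minimal sketch of what I mean by experimenting with the PEFT merging utilities. The adapter paths, adapter names, weights, and the chosen `combination_type` below are placeholders, and the small base model is only for illustration:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Tiny base model purely for illustration.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# Load two previously trained LoRA adapters (paths are hypothetical).
model = PeftModel.from_pretrained(base, "path/to/lora_a", adapter_name="adapter_a")
model.load_adapter("path/to/lora_b", adapter_name="adapter_b")

# Merge the two adapters into a new adapter without touching the base weights.
# combination_type can be e.g. "linear", "cat", "ties", "dare_linear"; see the PEFT docs.
model.add_weighted_adapter(
    adapters=["adapter_a", "adapter_b"],
    weights=[0.7, 0.3],
    adapter_name="merged",
    combination_type="ties",
    density=0.5,  # used by the ties/dare methods
)
model.set_adapter("merged")
```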
-
Currently, using multiple LoRA modules at once can severely impact performance. While fusing them into the base module weights is an option, it is lossy in a way that cannot be undone short of duplicating the entire network into RAM.
I was discussing with @bghira and @sayakpaul the possibility of fusing similar modules, so that with two LoRAs A and B the computation could be

M + (AB)

rather than

M + A + B

every step, where (AB) denotes a single adapter pre-fused from A and B. You would still need to keep a copy of the original LoRA modules, but that is much more viable than copying something like a 20 GiB base model.
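For illustration, a rough numerical sketch of the idea; none of this is diffusers/peft API, and all shapes and names are made up:

```python
import torch

d, r = 64, 4
M = torch.randn(d, d)                            # frozen base weight
A1, B1 = torch.randn(r, d), torch.randn(d, r)    # LoRA "A" factors
A2, B2 = torch.randn(r, d), torch.randn(d, r)    # LoRA "B" factors
x = torch.randn(d)

# Current approach: every forward pass evaluates both adapters separately.
y_slow = M @ x + B1 @ (A1 @ x) + B2 @ (A2 @ x)

# Proposed approach: fuse the adapters with each other (not with M) once.
# Concatenating the factors keeps the result low rank (r + r), so nothing
# is lost; the original A/B factors are kept around to undo the fusion.
A_fused = torch.cat([A1, A2], dim=0)             # (2r, d)
B_fused = torch.cat([B1, B2], dim=1)             # (d, 2r)
y_fast = M @ x + B_fused @ (A_fused @ x)

torch.testing.assert_close(y_slow, y_fast)
```

The point is that the per-step cost becomes one fused module instead of N separate ones, while the base weights stay untouched and the original low-rank factors remain available for unfusing.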