Replies: 2 comments
-
This is because InstantStyle applies the IP Adapter to specific attention blocks; that configuration is for the whole IP Adapter and not for each image, while if you pass a list of scales it just scales the embeddings for each image. With the current implementation it's impossible to apply the embeddings of each image to specific attention blocks.
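To make the distinction concrete, here's a minimal sketch, assuming a pipe with a single IP Adapter already loaded (the block names follow the InstantStyle examples in the diffusers docs; the scale values are illustrative):

# InstantStyle-style dict: targets specific attention blocks,
# but applies to the whole IP Adapter, not to individual images
pipe.set_ip_adapter_scale({"up": {"block_0": [0.0, 1.0, 0.0]}})

# list of floats: one scale per input image, no per-block control
pipe.set_ip_adapter_scale([[0.6, 0.9]])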
Yes, you will use more RAM and it will be slower to generate. This is the reason we added the ability to load multiple masks with one IP Adapter. If you're interested in a more detailed discussion, you can read it here.
-
Thanks for the detailed pointer, @asomoza
-
Hi,
This question is related to #8626.
The "two girls" example in the IP Adapter documentation shows how to use two masks, and then inject characteristics of two different images into those masked areas via single ip adapter. We could also do this by using two ip adapters, one for each masked area.
Is there a recommendation for which would be right one?
My experience was that with the single ip adapter, the scale specification to use is a list of list of floats like:
pipe.set_ip_adapter_scale([[0.9,0.9]])
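For reference, here's a condensed sketch of that flow, loosely following the masking example in the diffusers docs; the image and mask paths are placeholders:

import torch
from diffusers import AutoPipelineForText2Image
from diffusers.image_processor import IPAdapterMaskProcessor
from diffusers.utils import load_image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=["ip-adapter-plus_sdxl_vit-h.safetensors"],
)

# placeholder inputs
mask1, mask2 = load_image("mask1.png"), load_image("mask2.png")
face_image1, face_image2 = load_image("girl1.png"), load_image("girl2.png")

processor = IPAdapterMaskProcessor()
masks = processor.preprocess([mask1, mask2], height=1024, width=1024)
# single adapter, two images: reshape to (1, num_images, H, W)
masks = [masks.reshape(1, masks.shape[0], masks.shape[2], masks.shape[3])]

pipe.set_ip_adapter_scale([[0.9, 0.9]])  # one float per image
image = pipe(
    prompt="2 girls",
    ip_adapter_image=[[face_image1, face_image2]],  # both images go to the one adapter
    num_inference_steps=30,
    cross_attention_kwargs={"ip_adapter_masks": masks},
).images[0]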
It seems we cannot use the InstantStyle specification here in a way that lets us control each masked area differently.
On the other hand, using two IP Adapters like:
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name=["ip-adapter-plus_sdxl_vit-h.safetensors", "ip-adapter-plus_sdxl_vit-h.safetensors"], use_safetensors=True)
we can specify a different scale config for each masked area, as sketched below.
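Here's a sketch of what I mean, reusing the placeholder images and the processor from the sketch above. Whether mixing an InstantStyle dict and a plain float in one set_ip_adapter_scale call is supported, and the exact per-adapter mask shapes, are my assumptions, not something I've verified:

# per-adapter scales: the first adapter gets an InstantStyle block config,
# the second a plain float; each entry controls one masked area
pipe.set_ip_adapter_scale([
    {"up": {"block_0": [0.0, 0.9, 0.0]}},  # style-only config for adapter 1
    0.6,                                   # uniform scale for adapter 2
])

masks = processor.preprocess([mask1, mask2], height=1024, width=1024)
image = pipe(
    prompt="2 girls",
    ip_adapter_image=[face_image1, face_image2],  # one image per adapter
    num_inference_steps=30,
    # one mask per adapter; slicing keeps each tensor at (1, 1, H, W),
    # which is my assumption about the expected per-adapter shape
    cross_attention_kwargs={"ip_adapter_masks": [masks[0:1], masks[1:2]]},
).images[0]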
This does seem more flexible. So, two questions: is one approach recommended over the other, and do they behave the same under the hood? The set_ip_adapter_scale method seems to do the same thing whether it receives a list of lists of scalars (single adapter) or a list of InstantStyle configs, both going through the _maybe_expand_lora_scales method.
Thanks,
Darshat