Replies: 2 comments
-
This is because InstantStyle applies the IP Adapter to specific attention blocks; that configuration is for the whole IP Adapter and not for each image, while if you pass a list of scales it just scales the embeddings for each image. With the current implementation it's impossible to apply the embeddings of each image to specific attention blocks.
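To make the distinction concrete, here's a minimal sketch, assuming a pipe with a single IP Adapter already loaded (the block names follow the InstantStyle examples in the diffusers docs; the scale values are illustrative):

# InstantStyle-style dict: targets specific attention blocks,
# but applies to the whole IP Adapter, not to individual images
pipe.set_ip_adapter_scale({"up": {"block_0": [0.0, 1.0, 0.0]}})

# list of floats: one scale per input image, no per-block control
pipe.set_ip_adapter_scale([[0.6, 0.9]])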
Yes, you will use more RAM and it will be slower to generate. This is the reason we added the ability to load multiple masks with one IP Adapter. If you're interested in a more detailed discussion, you can read it here.
-
Thanks for the detailed pointer, @asomoza
-
Hi,
This question is related to #8626.
The "two girls" example in the IP Adapter documentation shows how to use two masks, and then inject characteristics of two different images into those masked areas via single ip adapter. We could also do this by using two ip adapters, one for each masked area.
Is there a recommendation for which would be right one?
My experience was that with the single ip adapter, the scale specification to use is a list of list of floats like:
pipe.set_ip_adapter_scale([[0.9,0.9]])
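For reference, here's a condensed sketch of that flow, loosely following the masking example in the diffusers docs; the image and mask paths are placeholders:

import torch
from diffusers import AutoPipelineForText2Image
from diffusers.image_processor import IPAdapterMaskProcessor
from diffusers.utils import load_image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=["ip-adapter-plus_sdxl_vit-h.safetensors"],
)

# placeholder inputs
mask1, mask2 = load_image("mask1.png"), load_image("mask2.png")
face_image1, face_image2 = load_image("girl1.png"), load_image("girl2.png")

processor = IPAdapterMaskProcessor()
masks = processor.preprocess([mask1, mask2], height=1024, width=1024)
# single adapter, two images: reshape to (1, num_images, H, W)
masks = [masks.reshape(1, masks.shape[0], masks.shape[2], masks.shape[3])]

pipe.set_ip_adapter_scale([[0.9, 0.9]])  # one float per image
image = pipe(
    prompt="2 girls",
    ip_adapter_image=[[face_image1, face_image2]],  # both images go to the one adapter
    num_inference_steps=30,
    cross_attention_kwargs={"ip_adapter_masks": masks},
).images[0]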
It seems we cannot use the InstantStyle specification here in a way that lets us control each masked area differently.
On the other hand, using two IP Adapters like:
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name=["ip-adapter-plus_sdxl_vit-h.safetensors", "ip-adapter-plus_sdxl_vit-h.safetensors"], use_safetensors=True)
we can specify a different scale config for each masked area, as sketched below.
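Here's a sketch of what I mean, reusing the placeholder images and the processor from the sketch above. Whether mixing an InstantStyle dict and a plain float in one set_ip_adapter_scale call is supported, and the exact per-adapter mask shapes, are my assumptions, not something I've verified:

# per-adapter scales: the first adapter gets an InstantStyle block config,
# the second a plain float; each entry controls one masked area
pipe.set_ip_adapter_scale([
    {"up": {"block_0": [0.0, 0.9, 0.0]}},  # style-only config for adapter 1
    0.6,                                   # uniform scale for adapter 2
])

masks = processor.preprocess([mask1, mask2], height=1024, width=1024)
image = pipe(
    prompt="2 girls",
    ip_adapter_image=[face_image1, face_image2],  # one image per adapter
    num_inference_steps=30,
    # one mask per adapter; slicing keeps each tensor at (1, 1, H, W),
    # which is my assumption about the expected per-adapter shape
    cross_attention_kwargs={"ip_adapter_masks": [masks[0:1], masks[1:2]]},
).images[0]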
This does seem more flexible. So, two questions: is one approach recommended over the other, and do they behave the same under the hood? The set_ip_adapter_scale method seems to do the same thing whether it receives a list of lists of scalars (single adapter) or a list of InstantStyle configs, both going through the _maybe_expand_lora_scales method.
Thanks,
Darshat