Support for control-lora #10686
Conversation
My solution references https://github.com/Mikubill/sd-webui-controlnet/blob/main/scripts/controlnet_lora.py and https://github.com/HighCWu/control-lora-v2/blob/master/models/control_lora.py, but it differs from them in several ways. Here are my observations and solutions:
else:
    raise ValueError
config = ControlNetModel.load_config("xinsir/controlnet-canny-sdxl-1.0")
Can we load from https://huggingface.co/stabilityai/control-lora as well?
We cannot, because stabilityai/control-lora does not provide a config.json file.
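For context, a common workaround (a sketch, not part of this PR) is to derive the ControlNet configuration from the base UNet via ControlNetModel.from_unet and then load only the LoRA weights on top, so no config.json is needed:

import torch
from diffusers import ControlNetModel, UNet2DConditionModel

# Build the ControlNet config from the base SDXL UNet instead of reading a config.json
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet", torch_dtype=torch.bfloat16
)
controlnet = ControlNetModel.from_unet(unet)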
Thanks for starting this!
In order to get this PR ready for review, we would need to:
- Use peft for all things LoRA instead of relying on things like LinearWithLoRA.
- Run the LoRA conversion on the checkpoint during loading, the way it is done for other LoRA checkpoints. Here is an example.
- Ideally, let users call ControlNetModel.load_lora_adapter() (method reference) on a state dict; we run the conversion first if needed and then take the rest of the steps.
The higher-level design I am thinking of goes as follows:
controlnet = # initialize ControlNet model.
# load ControlNet-LoRA into `controlnet`
controlnet.load_lora_adapter("stabilityai/control-lora", weight_name="...")
pipeline = # initialize ControlNet pipeline.
...
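For concreteness, a minimal sketch of that flow (the weight file name is an assumption, borrowed from later in this thread):

import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, UNet2DConditionModel

base_id = "stabilityai/stable-diffusion-xl-base-1.0"
unet = UNet2DConditionModel.from_pretrained(base_id, subfolder="unet", torch_dtype=torch.bfloat16)
controlnet = ControlNetModel.from_unet(unet)

# load ControlNet-LoRA into `controlnet`
controlnet.load_lora_adapter(
    "stabilityai/control-lora",
    weight_name="control-LoRAs-rank128/control-lora-canny-rank128.safetensors",  # assumed file name
)

pipeline = StableDiffusionXLControlNetPipeline.from_pretrained(
    base_id, unet=unet, controlnet=controlnet, torch_dtype=torch.bfloat16
)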
LMK if this makes sense. Happy to elaborate further.
I have reservations, because I have observed that control-lora requires less memory than controlnet, yet running it in this manner requires at least as much memory as controlnet. I want control-lora to be not only a LoRA but also a memory-saving model. Of course, the existing code cannot handle this yet, and it will require future improvements.
If we do incorporate
I once observed, while running sd-webui-controlnet, that peak VRAM usage was 5.9GB with the SD1.5 controlnet and 4.7GB with the SD1.5 control-lora. Clearly, sd-webui-controlnet employs some method to reuse weights rather than simply merging the LoRA weights on top of the controlnet. Can loading controlnet in this manner provide such VRAM optimization?
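As a side note, those peak-VRAM figures can be reproduced with standard torch.cuda utilities; a minimal sketch, assuming pipe, prompt, and image are already set up:

import torch

torch.cuda.reset_peak_memory_stats()
_ = pipe(prompt, image=image).images
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")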
I am quite sure we can achieve those numbers without having to do too much given the recent set of optimizations we have shipped and are going to ship.
We're not merging the LoRA weights into the base model when initially loading the LoRA checkpoint. That goes against our LoRA design. Users can always merge the LoRA params into the base model params after loading the LoRA params, but that is not the default behaviour.
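In code terms, a hedged sketch of that distinction (whether fuse_lora is available on the model depends on the installed diffusers version):

# Loading keeps the LoRA weights as a separate adapter; nothing is merged by default.
controlnet.load_lora_adapter("stabilityai/control-lora", weight_name="...")
# Merging into the base weights is an explicit, optional step afterwards:
# controlnet.fuse_lora()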
Good, that resolves my concern. I believe such a design is reasonable; it is simpler and more user-friendly.
Appreciate the understanding. LMK if you would like to take a crack at the suggestions I provided above.
I encountered a problem: after running the command
You can read the code. Does this method meet your expectations?
@sayakpaul Can you help me solve this problem?
Can you try to help me understand why
My design is as follows: the core code is located in
diffusers/src/diffusers/loaders/peft.py, line 141 (at commit 3e35f56)
I'm having trouble converting the prefix of control-lora into the diffusers format. The prefix of control-lora is in the SD format, while the loaded controlnet is in the diffusers format. I can't find a clean and efficient way to achieve the conversion. Could you provide some guidance? @sayakpaul
You could refer to the following function to get a sense of how we do it for other non-diffusers LoRAs:
Would this help?
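For illustration only (the real diffusers conversion covers many more key patterns; the two prefixes below are just examples), the prefix conversion boils down to renaming SD-style keys with a mapping:

# Hypothetical, trimmed-down mapping from SD-style ControlNet prefixes to diffusers-style ones.
SD_TO_DIFFUSERS = {
    "input_blocks.0.0.": "conv_in.",
    "middle_block_out.0.": "controlnet_mid_block.",
}

def rename_sd_keys(state_dict, mapping=SD_TO_DIFFUSERS):
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for sd_prefix, diffusers_prefix in mapping.items():
            if new_key.startswith(sd_prefix):
                new_key = diffusers_prefix + new_key[len(sd_prefix):]
                break
        renamed[new_key] = value
    return renamed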
I tried to load Control-LoRA in the
I think the easiest might be to have a class for Control LoRA overridden from
Is there any example?
There is none, but here is how it might look in terms of pseudo-code:

class ControlLoRAMixin(PeftAdapterMixin):
    def load_lora_adapter(...):
        state_dict = ...  # convert the state dict from SD format to peft format.
        ...
        # proceed with the rest of the logic.
Okay, I will give it a try.
from diffusers import (
StableDiffusionXLControlNetPipeline,
StableDiffusionControlNetPipeline,
ControlNetModel,
UNet2DConditionModel,
)
import torch
# pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
# lora_id = "stabilityai/control-lora"
# lora_filename = "control-LoRAs-rank128/control-lora-sketch-rank128-metadata.safetensors"
pipe_id = "stable-diffusion-v1-5/stable-diffusion-v1-5"
lora_id = "comfyanonymous/ControlNet-v1-1_fp16_safetensors"
lora_filename = "control_lora_rank128_v11p_sd15_openpose_fp16.safetensors"
unet = UNet2DConditionModel.from_pretrained(pipe_id, subfolder="unet", torch_dtype=torch.bfloat16).to("cuda")
controlnet = ControlNetModel.from_unet(unet).to(device="cuda", dtype=torch.bfloat16)
controlnet.load_lora_adapter(lora_id, weight_name=lora_filename, controlnet_config=controlnet.config)
from diffusers import AutoencoderKL
from diffusers.utils import load_image, make_image_grid
from PIL import Image
import numpy as np
import cv2
prompt = "chef in the kitchen"
negative_prompt = "low quality, bad quality, sketches"
image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_openpose/resolve/main/images/control.png")
controlnet_conditioning_scale = 1.0 # recommended for good generalization
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.bfloat16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
pipe_id,
unet=unet,
controlnet=controlnet,
vae=vae,
torch_dtype=torch.bfloat16,
safety_checker=None,
).to("cuda")
# image = image.convert("L")
image = np.array(image)
# image = cv2.Canny(image, 100, 200)
# image = image[:, :, None]
# image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)
images = pipe(
prompt, negative_prompt=negative_prompt, image=image,
controlnet_conditioning_scale=controlnet_conditioning_scale,
num_images_per_prompt=4
).images
final_image = [image] + images
grid = make_image_grid(final_image, 1, 5)
grid.save(f"sketch.png") This code is also effective for the SD1.5 model. |
Nice, thanks for your hard work on this! Btw, did you mean to push any pending changes to this PR? Are we getting good results with Depth Control, as well?
No, in the previous text I just wanted to clarify two points:
1. Currently, the depth and sketch models in stabilityai/control-lora behave differently from the canny and recolor models, and I have not been able to find a proper example demonstrating that these two models (depth and sketch) work effectively with my code (whereas the canny and recolor models do).
2. My code also works for the SD1.5 model and for control-lora models compatible with SD1.5 (including the depth and sketch models), without requiring any code changes.
In summary, I believe the code in this PR successfully implements running control-lora within diffusers, but a good example is still needed to fully exercise the depth and sketch models from stabilityai/control-lora.
In stabilityai/control-lora, no; however, for the control-lora corresponding to the SD1.5 model (comfyanonymous/ControlNet-v1-1_fp16_safetensors), it is possible.
Thanks for your hard work. I will try to dedicate some time to button up the PR and open a PR to your branch. Thanks a lot again!
You're welcome. I'm also glad to see this work eventually integrated into the diffusers library, making it more accessible for others to use.
Do we have any guide on how to train a controlnet in the form of a LoRA? I am specifically interested in Flux-based control LoRA. Any information is appreciated.
I once wrote a repository for training control-lora, lavinal712/control-lora-v3, in which I referenced other excellent repositories. I think you can find suitable methods there. However, the structure of Flux control-lora is different from that of the control-lora discussed here, so you should carefully distinguish between them.
Thanks @lavinal712. I will take a look.
@lavinal712 I am sorry about the delay on my end. Expect updates soon.
"controlnet_cond_embedding.conv_in.bias": "controlnet_cond_embedding.conv_in.modules_to_save.bias", | ||
"controlnet_cond_embedding.conv_out.bias": "controlnet_cond_embedding.conv_out.modules_to_save.bias", | ||
**{f"controlnet_cond_embedding.blocks.{i}.bias": f"controlnet_cond_embedding.blocks.{i}.modules_to_save.bias" for i in range(6)}, | ||
**{f"controlnet_down_blocks.{i}.bias": f"controlnet_down_blocks.{i}.modules_to_save.bias" for i in range(9)}, | ||
"controlnet_mid_block.bias": "controlnet_mid_block.modules_to_save.bias", | ||
".norm.bias": ".norm.modules_to_save.bias", | ||
**{f".norm{i}.bias": f".norm{i}.modules_to_save.bias" for i in range(1, 4)}, |
Why do we have to include modules_to_save here in the conversion? @lavinal712
It is due to an auto-inference issue in one of the functions. I don't recall which function it was, but in my testing, PeftAdapterMixin failed to convert control-lora.
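For background, a sketch of the standard peft behaviour being relied on here (not code from this PR): layers listed in modules_to_save are loaded as full replacement weights rather than low-rank lora_A/lora_B factors, which is why the converted bias keys gain the modules_to_save infix.

from peft import LoraConfig

# Layers in `target_modules` get lora_A/lora_B factors; layers in `modules_to_save`
# are wrapped and stored as full weights (hence keys like "...modules_to_save.bias").
config = LoraConfig(
    r=128,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    modules_to_save=["controlnet_mid_block", "controlnet_cond_embedding.conv_in"],
)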
Hello @lavinal712. I applied the following changes to your PR:

diff --git a/src/diffusers/loaders/__init__.py b/src/diffusers/loaders/__init__.py
index b0cd85dff..e81233716 100644
--- a/src/diffusers/loaders/__init__.py
+++ b/src/diffusers/loaders/__init__.py
@@ -84,7 +84,7 @@ if is_torch_available():
"SD3IPAdapterMixin",
]
-_import_structure["peft"] = ["PeftAdapterMixin", "ControlLoRAMixin"]
+_import_structure["peft"] = ["PeftAdapterMixin"]
if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
@@ -94,7 +94,7 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
from .transformer_sd3 import SD3Transformer2DLoadersMixin
from .unet import UNet2DConditionLoadersMixin
from .utils import AttnProcsLayers
- from .peft import ControlLoRAMixin
+ # from .peft import ControlLoRAMixin
if is_transformers_available():
from .ip_adapter import (
diff --git a/src/diffusers/loaders/peft.py b/src/diffusers/loaders/peft.py
index 91e931f44..c5cf6a160 100644
--- a/src/diffusers/loaders/peft.py
+++ b/src/diffusers/loaders/peft.py
@@ -245,6 +245,12 @@ class PeftAdapterMixin:
f"Adapter name {adapter_name} already in use in the model - please select a new adapter name."
)
+ # Control LoRA from SAI is different from BFL Control LoRA
+ # https://huggingface.co/stabilityai/control-lora/
+ if "lora_controlnet" in state_dict:
+ del state_dict["lora_controlnet"]
+ state_dict = convert_control_lora_state_dict_to_peft(state_dict)
+
# check with first key if is not in peft format
first_key = next(iter(state_dict.keys()))
if "lora_A" not in first_key:
@@ -262,9 +268,12 @@ class PeftAdapterMixin:
alpha_keys = [k for k in network_alphas.keys() if k.startswith(f"{prefix}.")]
network_alphas = {k.replace(f"{prefix}.", ""): v for k, v in network_alphas.items() if k in alpha_keys}
+ import json
lora_config_kwargs = get_peft_kwargs(rank, network_alpha_dict=network_alphas, peft_state_dict=state_dict)
+ print(f"before adjustement: {json.dumps(lora_config_kwargs, indent=2)}")
# TODO: revisit this after https://github.com/huggingface/peft/pull/2382 is merged.
lora_config_kwargs = _maybe_adjust_config(lora_config_kwargs)
+ print(f"after adjustement: {json.dumps(lora_config_kwargs, indent=2)}")
if "use_dora" in lora_config_kwargs:
if lora_config_kwargs["use_dora"]:
@@ -769,182 +778,182 @@ class PeftAdapterMixin:
self.peft_config.pop(adapter_name, None)
-class ControlLoRAMixin(PeftAdapterMixin):
- TARGET_MODULES = ["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2", "proj_in", "proj_out",
- "conv", "conv1", "conv2", "conv_in", "conv_shortcut", "linear_1", "linear_2", "time_emb_proj"]
- SAVE_MODULES = ["controlnet_cond_embedding.conv_in", "controlnet_cond_embedding.blocks.0",
- "controlnet_cond_embedding.blocks.1", "controlnet_cond_embedding.blocks.2",
- "controlnet_cond_embedding.blocks.3", "controlnet_cond_embedding.blocks.4",
- "controlnet_cond_embedding.blocks.5", "controlnet_cond_embedding.conv_out",
- "controlnet_down_blocks.0", "controlnet_down_blocks.1", "controlnet_down_blocks.2",
- "controlnet_down_blocks.3", "controlnet_down_blocks.4", "controlnet_down_blocks.5",
- "controlnet_down_blocks.6", "controlnet_down_blocks.7", "controlnet_down_blocks.8",
- "controlnet_mid_block", "norm", "norm1", "norm2", "norm3"]
-
- def load_lora_adapter(self, pretrained_model_name_or_path_or_dict, prefix="transformer", **kwargs):
- from peft import LoraConfig, inject_adapter_in_model, set_peft_model_state_dict
- from peft.tuners.tuners_utils import BaseTunerLayer
-
- cache_dir = kwargs.pop("cache_dir", None)
- force_download = kwargs.pop("force_download", False)
- proxies = kwargs.pop("proxies", None)
- local_files_only = kwargs.pop("local_files_only", None)
- token = kwargs.pop("token", None)
- revision = kwargs.pop("revision", None)
- subfolder = kwargs.pop("subfolder", None)
- weight_name = kwargs.pop("weight_name", None)
- use_safetensors = kwargs.pop("use_safetensors", None)
- adapter_name = kwargs.pop("adapter_name", None)
- network_alphas = kwargs.pop("network_alphas", None)
- _pipeline = kwargs.pop("_pipeline", None)
- low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", False)
- allow_pickle = False
-
- if low_cpu_mem_usage and is_peft_version("<=", "0.13.0"):
- raise ValueError(
- "`low_cpu_mem_usage=True` is not compatible with this `peft` version. Please update it with `pip install -U peft`."
- )
-
- user_agent = {
- "file_type": "attn_procs_weights",
- "framework": "pytorch",
- }
-
- state_dict = _fetch_state_dict(
- pretrained_model_name_or_path_or_dict=pretrained_model_name_or_path_or_dict,
- weight_name=weight_name,
- use_safetensors=use_safetensors,
- local_files_only=local_files_only,
- cache_dir=cache_dir,
- force_download=force_download,
- proxies=proxies,
- token=token,
- revision=revision,
- subfolder=subfolder,
- user_agent=user_agent,
- allow_pickle=allow_pickle,
- )
- if network_alphas is not None and prefix is None:
- raise ValueError("`network_alphas` cannot be None when `prefix` is None.")
-
- if prefix is not None:
- keys = list(state_dict.keys())
- model_keys = [k for k in keys if k.startswith(f"{prefix}.")]
- if len(model_keys) > 0:
- state_dict = {k.replace(f"{prefix}.", ""): v for k, v in state_dict.items() if k in model_keys}
-
- if len(state_dict) > 0:
- if adapter_name in getattr(self, "peft_config", {}):
- raise ValueError(
- f"Adapter name {adapter_name} already in use in the model - please select a new adapter name."
- )
-
- # check with first key if is not in peft format
- if "lora_controlnet" in state_dict:
- del state_dict["lora_controlnet"]
- state_dict = convert_control_lora_state_dict_to_peft(state_dict)
-
- rank = {}
- for key, val in state_dict.items():
- # Cannot figure out rank from lora layers that don't have atleast 2 dimensions.
- # Bias layers in LoRA only have a single dimension
- if "lora_B" in key and val.ndim > 1:
- rank[key] = val.shape[1]
-
- if network_alphas is not None and len(network_alphas) >= 1:
- alpha_keys = [k for k in network_alphas.keys() if k.startswith(f"{prefix}.")]
- network_alphas = {k.replace(f"{prefix}.", ""): v for k, v in network_alphas.items() if k in alpha_keys}
-
- lora_config_kwargs = get_peft_kwargs(rank, network_alpha_dict=network_alphas, peft_state_dict=state_dict)
- lora_config_kwargs = _maybe_adjust_config(lora_config_kwargs)
-
- if "use_dora" in lora_config_kwargs:
- if lora_config_kwargs["use_dora"]:
- if is_peft_version("<", "0.9.0"):
- raise ValueError(
- "You need `peft` 0.9.0 at least to use DoRA-enabled LoRAs. Please upgrade your installation of `peft`."
- )
- else:
- if is_peft_version("<", "0.9.0"):
- lora_config_kwargs.pop("use_dora")
-
- if "lora_bias" in lora_config_kwargs:
- if lora_config_kwargs["lora_bias"]:
- if is_peft_version("<=", "0.13.2"):
- raise ValueError(
- "You need `peft` 0.14.0 at least to use `lora_bias` in LoRAs. Please upgrade your installation of `peft`."
- )
- else:
- if is_peft_version("<=", "0.13.2"):
- lora_config_kwargs.pop("lora_bias")
-
- lora_config_kwargs["bias"] = "all"
- lora_config_kwargs["target_modules"] = self.TARGET_MODULES
- lora_config_kwargs["modules_to_save"] = self.SAVE_MODULES
- lora_config = LoraConfig(**lora_config_kwargs)
- # adapter_name
- if adapter_name is None:
- adapter_name = "default"
-
- # <Unsafe code
- # We can be sure that the following works as it just sets attention processors, lora layers and puts all in the same dtype
- # Now we remove any existing hooks to `_pipeline`.
-
- # In case the pipeline has been already offloaded to CPU - temporarily remove the hooks
- # otherwise loading LoRA weights will lead to an error
- is_model_cpu_offload, is_sequential_cpu_offload = self._optionally_disable_offloading(_pipeline)
-
- peft_kwargs = {}
- if is_peft_version(">=", "0.13.1"):
- peft_kwargs["low_cpu_mem_usage"] = low_cpu_mem_usage
-
- # To handle scenarios where we cannot successfully set state dict. If it's unsucessful,
- # we should also delete the `peft_config` associated to the `adapter_name`.
- try:
- inject_adapter_in_model(lora_config, self, adapter_name=adapter_name, **peft_kwargs)
- incompatible_keys = set_peft_model_state_dict(self, state_dict, adapter_name, **peft_kwargs)
- except Exception as e:
- # In case `inject_adapter_in_model()` was unsuccessful even before injecting the `peft_config`.
- if hasattr(self, "peft_config"):
- for module in self.modules():
- if isinstance(module, BaseTunerLayer):
- active_adapters = module.active_adapters
- for active_adapter in active_adapters:
- if adapter_name in active_adapter:
- module.delete_adapter(adapter_name)
-
- self.peft_config.pop(adapter_name)
- logger.error(f"Loading {adapter_name} was unsucessful with the following error: \n{e}")
- raise
-
- warn_msg = ""
- if incompatible_keys is not None:
- # Check only for unexpected keys.
- unexpected_keys = getattr(incompatible_keys, "unexpected_keys", None)
- if unexpected_keys:
- lora_unexpected_keys = [k for k in unexpected_keys if "lora_" in k and adapter_name in k]
- if lora_unexpected_keys:
- warn_msg = (
- f"Loading adapter weights from state_dict led to unexpected keys found in the model:"
- f" {', '.join(lora_unexpected_keys)}. "
- )
-
- # Filter missing keys specific to the current adapter.
- missing_keys = getattr(incompatible_keys, "missing_keys", None)
- if missing_keys:
- lora_missing_keys = [k for k in missing_keys if "lora_" in k and adapter_name in k]
- if lora_missing_keys:
- warn_msg += (
- f"Loading adapter weights from state_dict led to missing keys in the model:"
- f" {', '.join(lora_missing_keys)}."
- )
-
- if warn_msg:
- logger.warning(warn_msg)
-
- # Offload back.
- if is_model_cpu_offload:
- _pipeline.enable_model_cpu_offload()
- elif is_sequential_cpu_offload:
- _pipeline.enable_sequential_cpu_offload()
- # Unsafe code />
+# class ControlLoRAMixin(PeftAdapterMixin):
+# TARGET_MODULES = ["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2", "proj_in", "proj_out",
+# "conv", "conv1", "conv2", "conv_in", "conv_shortcut", "linear_1", "linear_2", "time_emb_proj"]
+# SAVE_MODULES = ["controlnet_cond_embedding.conv_in", "controlnet_cond_embedding.blocks.0",
+# "controlnet_cond_embedding.blocks.1", "controlnet_cond_embedding.blocks.2",
+# "controlnet_cond_embedding.blocks.3", "controlnet_cond_embedding.blocks.4",
+# "controlnet_cond_embedding.blocks.5", "controlnet_cond_embedding.conv_out",
+# "controlnet_down_blocks.0", "controlnet_down_blocks.1", "controlnet_down_blocks.2",
+# "controlnet_down_blocks.3", "controlnet_down_blocks.4", "controlnet_down_blocks.5",
+# "controlnet_down_blocks.6", "controlnet_down_blocks.7", "controlnet_down_blocks.8",
+# "controlnet_mid_block", "norm", "norm1", "norm2", "norm3"]
+
+# def load_lora_adapter(self, pretrained_model_name_or_path_or_dict, prefix="transformer", **kwargs):
+# from peft import LoraConfig, inject_adapter_in_model, set_peft_model_state_dict
+# from peft.tuners.tuners_utils import BaseTunerLayer
+
+# cache_dir = kwargs.pop("cache_dir", None)
+# force_download = kwargs.pop("force_download", False)
+# proxies = kwargs.pop("proxies", None)
+# local_files_only = kwargs.pop("local_files_only", None)
+# token = kwargs.pop("token", None)
+# revision = kwargs.pop("revision", None)
+# subfolder = kwargs.pop("subfolder", None)
+# weight_name = kwargs.pop("weight_name", None)
+# use_safetensors = kwargs.pop("use_safetensors", None)
+# adapter_name = kwargs.pop("adapter_name", None)
+# network_alphas = kwargs.pop("network_alphas", None)
+# _pipeline = kwargs.pop("_pipeline", None)
+# low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", False)
+# allow_pickle = False
+
+# if low_cpu_mem_usage and is_peft_version("<=", "0.13.0"):
+# raise ValueError(
+# "`low_cpu_mem_usage=True` is not compatible with this `peft` version. Please update it with `pip install -U peft`."
+# )
+
+# user_agent = {
+# "file_type": "attn_procs_weights",
+# "framework": "pytorch",
+# }
+
+# state_dict = _fetch_state_dict(
+# pretrained_model_name_or_path_or_dict=pretrained_model_name_or_path_or_dict,
+# weight_name=weight_name,
+# use_safetensors=use_safetensors,
+# local_files_only=local_files_only,
+# cache_dir=cache_dir,
+# force_download=force_download,
+# proxies=proxies,
+# token=token,
+# revision=revision,
+# subfolder=subfolder,
+# user_agent=user_agent,
+# allow_pickle=allow_pickle,
+# )
+# if network_alphas is not None and prefix is None:
+# raise ValueError("`network_alphas` cannot be None when `prefix` is None.")
+
+# if prefix is not None:
+# keys = list(state_dict.keys())
+# model_keys = [k for k in keys if k.startswith(f"{prefix}.")]
+# if len(model_keys) > 0:
+# state_dict = {k.replace(f"{prefix}.", ""): v for k, v in state_dict.items() if k in model_keys}
+
+# if len(state_dict) > 0:
+# if adapter_name in getattr(self, "peft_config", {}):
+# raise ValueError(
+# f"Adapter name {adapter_name} already in use in the model - please select a new adapter name."
+# )
+
+# # check with first key if is not in peft format
+# if "lora_controlnet" in state_dict:
+# del state_dict["lora_controlnet"]
+# state_dict = convert_control_lora_state_dict_to_peft(state_dict)
+
+# rank = {}
+# for key, val in state_dict.items():
+# # Cannot figure out rank from lora layers that don't have atleast 2 dimensions.
+# # Bias layers in LoRA only have a single dimension
+# if "lora_B" in key and val.ndim > 1:
+# rank[key] = val.shape[1]
+
+# if network_alphas is not None and len(network_alphas) >= 1:
+# alpha_keys = [k for k in network_alphas.keys() if k.startswith(f"{prefix}.")]
+# network_alphas = {k.replace(f"{prefix}.", ""): v for k, v in network_alphas.items() if k in alpha_keys}
+
+# lora_config_kwargs = get_peft_kwargs(rank, network_alpha_dict=network_alphas, peft_state_dict=state_dict)
+# lora_config_kwargs = _maybe_adjust_config(lora_config_kwargs)
+
+# if "use_dora" in lora_config_kwargs:
+# if lora_config_kwargs["use_dora"]:
+# if is_peft_version("<", "0.9.0"):
+# raise ValueError(
+# "You need `peft` 0.9.0 at least to use DoRA-enabled LoRAs. Please upgrade your installation of `peft`."
+# )
+# else:
+# if is_peft_version("<", "0.9.0"):
+# lora_config_kwargs.pop("use_dora")
+
+# if "lora_bias" in lora_config_kwargs:
+# if lora_config_kwargs["lora_bias"]:
+# if is_peft_version("<=", "0.13.2"):
+# raise ValueError(
+# "You need `peft` 0.14.0 at least to use `lora_bias` in LoRAs. Please upgrade your installation of `peft`."
+# )
+# else:
+# if is_peft_version("<=", "0.13.2"):
+# lora_config_kwargs.pop("lora_bias")
+
+# lora_config_kwargs["bias"] = "all"
+# lora_config_kwargs["target_modules"] = self.TARGET_MODULES
+# lora_config_kwargs["modules_to_save"] = self.SAVE_MODULES
+# lora_config = LoraConfig(**lora_config_kwargs)
+# # adapter_name
+# if adapter_name is None:
+# adapter_name = "default"
+
+# # <Unsafe code
+# # We can be sure that the following works as it just sets attention processors, lora layers and puts all in the same dtype
+# # Now we remove any existing hooks to `_pipeline`.
+
+# # In case the pipeline has been already offloaded to CPU - temporarily remove the hooks
+# # otherwise loading LoRA weights will lead to an error
+# is_model_cpu_offload, is_sequential_cpu_offload = self._optionally_disable_offloading(_pipeline)
+
+# peft_kwargs = {}
+# if is_peft_version(">=", "0.13.1"):
+# peft_kwargs["low_cpu_mem_usage"] = low_cpu_mem_usage
+
+# # To handle scenarios where we cannot successfully set state dict. If it's unsucessful,
+# # we should also delete the `peft_config` associated to the `adapter_name`.
+# try:
+# inject_adapter_in_model(lora_config, self, adapter_name=adapter_name, **peft_kwargs)
+# incompatible_keys = set_peft_model_state_dict(self, state_dict, adapter_name, **peft_kwargs)
+# except Exception as e:
+# # In case `inject_adapter_in_model()` was unsuccessful even before injecting the `peft_config`.
+# if hasattr(self, "peft_config"):
+# for module in self.modules():
+# if isinstance(module, BaseTunerLayer):
+# active_adapters = module.active_adapters
+# for active_adapter in active_adapters:
+# if adapter_name in active_adapter:
+# module.delete_adapter(adapter_name)
+
+# self.peft_config.pop(adapter_name)
+# logger.error(f"Loading {adapter_name} was unsucessful with the following error: \n{e}")
+# raise
+
+# warn_msg = ""
+# if incompatible_keys is not None:
+# # Check only for unexpected keys.
+# unexpected_keys = getattr(incompatible_keys, "unexpected_keys", None)
+# if unexpected_keys:
+# lora_unexpected_keys = [k for k in unexpected_keys if "lora_" in k and adapter_name in k]
+# if lora_unexpected_keys:
+# warn_msg = (
+# f"Loading adapter weights from state_dict led to unexpected keys found in the model:"
+# f" {', '.join(lora_unexpected_keys)}. "
+# )
+
+# # Filter missing keys specific to the current adapter.
+# missing_keys = getattr(incompatible_keys, "missing_keys", None)
+# if missing_keys:
+# lora_missing_keys = [k for k in missing_keys if "lora_" in k and adapter_name in k]
+# if lora_missing_keys:
+# warn_msg += (
+# f"Loading adapter weights from state_dict led to missing keys in the model:"
+# f" {', '.join(lora_missing_keys)}."
+# )
+
+# if warn_msg:
+# logger.warning(warn_msg)
+
+# # Offload back.
+# if is_model_cpu_offload:
+# _pipeline.enable_model_cpu_offload()
+# elif is_sequential_cpu_offload:
+# _pipeline.enable_sequential_cpu_offload()
+# # Unsafe code />
diff --git a/src/diffusers/models/controlnets/controlnet.py b/src/diffusers/models/controlnets/controlnet.py
index c1404c48c..e49556c03 100644
--- a/src/diffusers/models/controlnets/controlnet.py
+++ b/src/diffusers/models/controlnets/controlnet.py
@@ -19,7 +19,7 @@ from torch import nn
from torch.nn import functional as F
from ...configuration_utils import ConfigMixin, register_to_config
-from ...loaders import PeftAdapterMixin, ControlLoRAMixin
+from ...loaders import PeftAdapterMixin
from ...loaders.single_file_model import FromOriginalModelMixin
from ...utils import BaseOutput, logging
from ..attention_processor import (
@@ -107,7 +107,7 @@ class ControlNetConditioningEmbedding(nn.Module):
return embedding
-class ControlNetModel(ModelMixin, ConfigMixin, FromOriginalModelMixin, ControlLoRAMixin):
+class ControlNetModel(ModelMixin, ConfigMixin, FromOriginalModelMixin, PeftAdapterMixin):
"""
A ControlNet model.
diff --git a/src/diffusers/utils/state_dict_utils.py b/src/diffusers/utils/state_dict_utils.py
index 322b118a6..2ff535c77 100644
--- a/src/diffusers/utils/state_dict_utils.py
+++ b/src/diffusers/utils/state_dict_utils.py
@@ -449,8 +449,7 @@ def convert_control_lora_state_dict_to_peft(state_dict):
return converted_state_dict
state_dict = _convert_controlnet_to_diffusers(state_dict)
- mapping = CONTROL_LORA_TO_DIFFUSERS
- return convert_state_dict(state_dict, mapping)
+ return convert_state_dict(state_dict, CONTROL_LORA_TO_DIFFUSERS)
def convert_all_state_dict_to_peft(state_dict):
When I executed this code:

from diffusers import (
StableDiffusionXLControlNetPipeline,
ControlNetModel,
UNet2DConditionModel,
)
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image, make_image_grid
from PIL import Image
import numpy as np
import cv2
pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
lora_id = "stabilityai/control-lora"
lora_filename = "control-LoRAs-rank128/control-lora-canny-rank128.safetensors"
unet = UNet2DConditionModel.from_pretrained(pipe_id, subfolder="unet", torch_dtype=torch.bfloat16).to("cuda")
controlnet = ControlNetModel.from_unet(unet).to(device="cuda", dtype=torch.bfloat16)
# controlnet.load_lora_adapter(lora_id, weight_name=lora_filename, controlnet_config=controlnet.config)
controlnet.load_lora_adapter(lora_id, weight_name=lora_filename, prefix=None)
prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
controlnet_conditioning_scale = 1.0 # recommended for good generalization
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae", torch_dtype=torch.bfloat16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
pipe_id,
unet=unet,
controlnet=controlnet,
vae=vae,
torch_dtype=torch.bfloat16,
).to("cuda")
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)
images = pipe(
prompt, negative_prompt=negative_prompt, image=image,
controlnet_conditioning_scale=controlnet_conditioning_scale,
num_images_per_prompt=4
).images
final_image = [image] + images
grid = make_image_grid(final_image, 1, 5)
grid.save(f"hf-logo.png") I got the following: Loading adapter weights from state_dict led to missing keys in the model: controlnet_cond_embedding.conv_in.lora_A.default_0.weight, controlnet_cond_embedding.conv_in.lora_B.default_0.weight. Some further comments:
The changes reflect how we would like them to be. Could you look into this further?
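To narrow down the missing conv_in keys, a small diagnostic could help (hypothetical helper, not part of the PR): it lists what the converted state dict actually contains for that layer, making it clear whether the checkpoint supplies LoRA factors or a full replacement weight there.

def inspect_layer_keys(converted_state_dict, needle="controlnet_cond_embedding.conv_in"):
    # Print every converted key that targets the layer, along with its tensor shape.
    for key, tensor in converted_state_dict.items():
        if needle in key:
            print(key, tuple(tensor.shape))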
If there's a problem with
Okay. How are we injecting this fully fine-tuned layer into the base
I believe a reasonable approach is to fix the issues with
Yeah exactly what I am suggesting. I would encourage you to take a crack at that (as you have already done quite a bit of work) and we can take it from there.
@sayakpaul The
Thanks for your hard work. Let's maybe wait a bit for #10985?
This PR is a continuation of the discussions in #4679 and #4899, and it addresses the following issues:
This code is only an initial version and contains many makeshift solutions as well as several issues. Currently, these are the observations I have made: