comfyui gguf model load for diffusers #200

Open
zhangn77 opened this issue Jan 13, 2025 · 1 comment
@zhangn77

For some reason, I need to run the diffusers Flux pipeline inside ComfyUI, which means I need to create nodes for the Flux VAE/transformer/text encoder... and a sampler node to generate the image.
I can get this pipeline working with a normal Flux model. However, when I switch the transformer to a GGUF version, I get this error:

`RuntimeError: mat1 and mat2 must have the same dtype, but got BFloat16 and Float`

It looks like the GGUF model is not being processed properly. How can I use it in the pipeline? Here is the code:
```python
import torch
from diffusers import FluxPipeline


def load_flux_gguf(file_path, transformer_config, dtype, device):
    from accelerate import init_empty_weights
    from diffusers.models.transformers.transformer_flux import FluxTransformer2DModel_Omini_sep_4_cat

    # Build the transformer with empty (meta) weights, then fill in the GGUF tensors.
    with init_empty_weights():
        config = FluxTransformer2DModel_Omini_sep_4_cat.load_config(transformer_config)
        transformer = FluxTransformer2DModel_Omini_sep_4_cat.from_config(config).to(dtype)

    expected_state_dict_keys = list(transformer.state_dict().keys())
    # `ggml` is a helper module (not shown here) that reads the .gguf file into a state dict.
    state_dict, stats = ggml.load_gguf_state_dict(file_path, dtype)

    applied, skipped = 0, 0
    for param_name, param in state_dict.items():
        if param_name not in expected_state_dict_keys:
            skipped += 1
            continue
        applied += 1
        hijack_set_module_tensor_simple(transformer, tensor_name=param_name, value=param, device=device)
        state_dict[param_name] = None
    return transformer, None


def hijack_set_module_tensor_simple(module, tensor_name, device, value):
    # Walk down to the submodule that actually owns the tensor.
    if "." in tensor_name:
        splits = tensor_name.split(".")
        for split in splits[:-1]:
            module = getattr(module, split)
        tensor_name = splits[-1]
    with torch.no_grad():
        if tensor_name in module._buffers:
            module._buffers[tensor_name] = value.to(device, non_blocking=True)
        elif value is not None:
            param_cls = type(module._parameters[tensor_name])
            module._parameters[tensor_name] = param_cls(value, requires_grad=False).to(device, non_blocking=True)


unet_path = 'flux1-q4_0.gguf'
transformer_config = 'flux/transformer'
dtype = torch.bfloat16
device = 'cuda'

gguf_transformer, _ = load_flux_gguf(unet_path, transformer_config, dtype, device)

pipeline = FluxPipeline.from_pretrained(
    flux_dir,  # local directory holding the Flux diffusers checkpoint
    transformer=gguf_transformer,
    text_encoder=None,
    text_encoder_2=None,
    tokenizer=None,
    tokenizer_2=None,
    torch_dtype=torch.bfloat16,
)
```
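As a sanity check (just a minimal sketch against the `gguf_transformer` built above), I can list which dtypes the loaded parameters actually ended up with, since the error suggests some weights stayed in float32 while the inputs are bfloat16:

```python
from collections import Counter

# Count parameters per dtype after loading the GGUF weights.
dtype_counts = Counter(p.dtype for p in gguf_transformer.parameters())
print(dtype_counts)  # e.g. Counter({torch.bfloat16: ..., torch.float32: ...})

# A blunt workaround would be to cast everything to one compute dtype,
# at the cost of holding fully dequantized weights in memory:
# gguf_transformer.to(torch.bfloat16)
```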

@al-swaiti

al-swaiti commented Feb 11, 2025

diffusers started supporting llama.cpp-style GGUF quantization for the Flux model a while ago. To load GGUF models you have to use this code:

```python
import torch

from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

ckpt_path = (
    "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q2_K.gguf"
)
transformer = FluxTransformer2DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, generator=torch.manual_seed(0)).images[0]
image.save("flux-gguf.png")
```
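Note that this needs a fairly recent diffusers (if I remember right, GGUF support landed around v0.32.0) plus the `gguf` Python package installed (`pip install -U diffusers gguf`), and `ckpt_path` can also point to a local `.gguf` file instead of the Hugging Face URL.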
