Fix Kohya UNet LoRA key conversion for conv_in/conv_out/time_embedding #14006

Merged
sayakpaul merged 4 commits into huggingface:main from dxqb:fix-unet-lora-conv-time-embedding on Jun 28, 2026
@dxqb dxqb commented Jun 20, 2026

Contributor

Fixes #14005.

It is related in principle to #14080, since both address layers that are rarely trained but can be.

What does this PR do?

Kohya-format UNet LoRA keys for several top-level UNet submodules weren't being
converted to diffusers names, so they didn't match any parameter and were reported as
unexpected keys instead of being applied. This covers both UNet dialects kohya-ss
trains on: the diffusers UNet (SD 1.x) and the sgm/LDM UNet (SDXL).

_convert_unet_lora_key — add the missing name patches:

  • conv.in → conv_in, conv.out → conv_out
  • sgm input.blocks.0.0 → conv_in, out.2 → conv_out (mapped before the block renames
    so input_blocks.0.0 isn't mistaken for a down-block)
  • time.embed.0/.2 → time_embedding.linear_1/2 (sgm) and
    time.embedding.linear.1/2 → time_embedding.linear_1/2 (diffusers)
  • label.emb.0.0/0.2 → add_embedding.linear_1/2 (SDXL added-conditioning MLP)
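
As a rough illustration of the name patches listed above, the lookup below maps the dotted intermediate names to diffusers module names. It is a simplified sketch, not the diffusers implementation: `TOP_LEVEL_PATCHES` and `convert_top_level_key` are hypothetical names, and the LoRA suffix (`lora_down.weight`, `alpha`, ...) is passed through untouched, which the real converter may handle differently.

```python
from typing import Optional

# Hypothetical lookup built from the bullets in this PR description.
# Keys are the dotted names obtained after replacing "_" with ".".
TOP_LEVEL_PATCHES = {
    "conv.in": "conv_in",                                  # diffusers UNet (SD 1.x)
    "conv.out": "conv_out",
    "input.blocks.0.0": "conv_in",                         # sgm UNet (SDXL)
    "out.2": "conv_out",
    "time.embed.0": "time_embedding.linear_1",             # sgm
    "time.embed.2": "time_embedding.linear_2",
    "time.embedding.linear.1": "time_embedding.linear_1",  # diffusers
    "time.embedding.linear.2": "time_embedding.linear_2",
    "label.emb.0.0": "add_embedding.linear_1",             # SDXL added conditioning
    "label.emb.0.2": "add_embedding.linear_2",
}


def convert_top_level_key(kohya_key: str) -> Optional[str]:
    """Return a diffusers-style name for a top-level UNet key, else None."""
    # The module name is everything before the first dot.
    module, _, suffix = kohya_key.partition(".")
    prefix = "lora_unet_"
    if not module.startswith(prefix):
        return None
    # Kohya flattens module paths with underscores; restore dots first.
    dotted = module[len(prefix):].replace("_", ".")
    target = TOP_LEVEL_PATCHES.get(dotted)
    return None if target is None else f"{target}.{suffix}"
```

With this sketch, `convert_top_level_key("lora_unet_time_embed_0.lora_down.weight")` yields `time_embedding.linear_1.lora_down.weight`, while ordinary block keys return `None` and would be left to the existing block renames.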

_maybe_map_sgm_blocks_to_diffusers — pass top-level sgm modules
(time_embed, label_emb, out, input_blocks.0.0) through unchanged so the key
converter handles them, instead of block-remapping or hitting the "layer not supported"
raise.
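
The pass-through described above can be pictured as a partition of the state-dict keys. This is a minimal sketch under assumptions: `SGM_TOP_LEVEL`, `is_top_level_sgm`, and `split_keys` are hypothetical names, keys are assumed to carry the `lora_unet_` prefix, and text-encoder keys (which the real function treats separately) are not handled here.

```python
from typing import Iterable, List, Tuple

PREFIX = "lora_unet_"
# sgm module names listed in the PR description, in Kohya's underscore spelling.
SGM_TOP_LEVEL = ("time_embed", "label_emb", "out", "input_blocks_0_0")


def is_top_level_sgm(key: str) -> bool:
    """True if the key belongs to a module outside the down/mid/up block structure."""
    module = key.partition(".")[0]
    if module.startswith(PREFIX):
        module = module[len(PREFIX):]
    # Match the whole name or a name followed by "_", so that
    # "output_blocks_..." is not confused with the top-level "out".
    return any(
        module == name or module.startswith(name + "_") for name in SGM_TOP_LEVEL
    )


def split_keys(keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate keys to pass through unchanged from keys to block-remap."""
    passthrough: List[str] = []
    remaining: List[str] = []
    for key in keys:
        (passthrough if is_top_level_sgm(key) else remaining).append(key)
    return passthrough, remaining
```

Only the `remaining` list would go through block remapping; the `passthrough` keys reach the key converter with their sgm names intact.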

Who can review?

PEFT: @sayakpaul @BenjaminBossan

_convert_unet_lora_key() had no mapping for these three top-level UNet
submodules, so Kohya-format keys touching them (e.g. lora_unet_conv_in,
lora_unet_time_embed_0/2) came out as conv.in/conv.out/time.embed.0/2
instead of conv_in/conv_out/time_embedding.linear_1/2, and were
reported as unexpected keys instead of being applied.
github-actions bot added the lora, size/S (PR with diff < 50 LOC), and fixes-issue labels on Jun 20, 2026

dxqb commented Jun 20, 2026

Contributor Author

Needs a test. I'll mark the PR as ready as soon as that passes.

dxqb and others added 2 commits June 26, 2026 15:10
The initial fix mapped conv_in/conv_out in the diffusers spelling (conv.in/
conv.out) and time_embedding in the sgm spelling (time_embed.0/.2), so neither
SD1.x nor SDXL was fully covered. Add the missing spellings:

- sgm conv_in/conv_out: input_blocks.0.0 / out.2 (kohya-ss SDXL sgm UNet),
  mapped before the block renames so input_blocks.0.0 does not become
  down_blocks.0.0.
- diffusers time_embedding: time_embedding.linear_1/2 (kohya-ss trains SD1.x on
  the diffusers UNet).

Verified against kohya-ss source (sdxl_original_unet.py, networks/lora.py) and
the diffusers UNet module names; regression set unchanged.
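
The ordering constraint in this commit message (map sgm conv_in before the block renames) can be shown with a toy two-step renamer. Everything here is hypothetical and reduced to a single block rule; it only illustrates why the order matters, not how diffusers implements either step.

```python
def rename(dotted: str, top_level_first: bool) -> str:
    """Apply a top-level patch and a block rename in the requested order."""

    def top_level(name: str) -> str:
        # sgm conv_in lives at input_blocks.0.0 (dotted form: input.blocks.0.0).
        return "conv_in" if name == "input.blocks.0.0" else name

    def blocks(name: str) -> str:
        # Simplified stand-in for the sgm -> diffusers down-block rename.
        return name.replace("input.blocks", "down_blocks")

    steps = (top_level, blocks) if top_level_first else (blocks, top_level)
    for step in steps:
        dotted = step(dotted)
    return dotted


# Patching first gives the intended module name.
correct = rename("input.blocks.0.0", top_level_first=True)
# Renaming blocks first hides the key from the top-level patch.
shadowed = rename("input.blocks.0.0", top_level_first=False)
```

Here `correct` is `conv_in`, while `shadowed` ends up as `down_blocks.0.0`, which matches no parameter in the diffusers UNet.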

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The conv_in/conv_out/time_embedding fix only reached _convert_unet_lora_key;
for the SDXL sgm UNet those keys never got there, because
_maybe_map_sgm_blocks_to_diffusers treats every non-text key as a down/mid/up
block. The top-level modules that live outside that block structure
(time_embed, label_emb, out = conv_out, and input_blocks.0.0 = conv_in) hit the
"layer not supported" raise, or crashed the inner block-index int() parse.

- Pass those top-level modules through unchanged so _convert_unet_lora_key maps
  them, instead of block-remapping or raising.
- Map the sgm label_emb (SDXL added-conditioning MLP) to diffusers add_embedding:
  label_emb.0.0/0.2 -> add_embedding.linear_1/2, before the SDXL index-strip
  heuristic that would otherwise collapse the layer index.
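
The int() failure mentioned in this commit message can be reproduced in isolation. The sketch assumes the block index is read as the fifth underscore-separated token of the key; `block_index` and `try_block_index` are hypothetical stand-ins, not the functions in diffusers.

```python
from typing import Optional


def block_index(key: str, delimiter: str = "_", block_slice_pos: int = 5) -> int:
    """Parse the block index, assuming it is the fifth delimiter-separated token."""
    return int(key.split(delimiter)[:block_slice_pos][-1])


def try_block_index(key: str) -> Optional[int]:
    """Return the parsed index, or None when int() rejects the token."""
    try:
        return block_index(key)
    except ValueError:
        return None


# A regular block key: the tokens are lora, unet, input, blocks, 4, ...
regular = try_block_index("lora_unet_input_blocks_4_1_proj_in.lora_down.weight")
# Top-level keys: the fifth token is "0.lora" or "down.weight", not an integer.
time_embed = try_block_index("lora_unet_time_embed_0.lora_down.weight")
conv_out = try_block_index("lora_unet_out_2.lora_down.weight")
```

Under this assumption `regular` is `4`, and both top-level keys fail to parse. A `label_emb` key would parse to an index but match none of the block patterns, which is consistent with the "layer not supported" path described above.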

All additions follow the kohya/sgm naming pattern and are no-ops on real
kohya-ss files (which contain none of these top-level UNet LoRA keys); verified
end-to-end loading a full SDXL sgm UNet LoRA into the diffusers pipeline with no
unexpected/missing adapter keys.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dxqb dxqb marked this pull request as ready for review June 27, 2026 10:36

dxqb commented Jun 27, 2026

Contributor Author

this PR is ready for review now

@sayakpaul sayakpaul left a comment

Member

Thanks for the massive contribution!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul sayakpaul merged commit ea80295 into huggingface:main Jun 28, 2026
13 of 15 checks passed
Development

Successfully merging this pull request may close these issues.

_convert_unet_lora_key misses conv_in/conv_out/time_embedding mapping for Kohya UNet LoRAs

3 participants