New model: Anima by dxqb · Pull Request #1487 · Nerogar/OneTrainer

dxqb · 2026-05-30T08:54:00Z

Test in preview branch: https://github.com/Nerogar/OneTrainer/tree/preview

Includes:

- Upgrade torch #1392: needed because Anima uses the Qwen VAE; see [Bug]: Qwen Image sampling VRAM regression #1371
- Upgrade transformers to 5.x and other dependencies #1285: now needed by diffusers

- Bump requirements: transformers 4.57.6 → 5.9, huggingface-hub 0.34.4 → 1.16.1
- Remove HF_HUB_DISABLE_XET workaround from startup scripts; Xet is stable in hub 1.16
- Remove _prepare_sub_modules / snapshot_download prefetching; hub 1.16 fetches lazily on demand
- Delete thread_safety.py and apply_thread_safe_forward calls; workaround for transformers#42673 was fixed upstream in v5
- Replace _remove_added_embeddings_from_tokenizer (relied on internal Trie, removed in v5) with orig_tokenizer deep-copies stored at load time; model savers pass use_original_tokenizers=True to create_pipeline() so saved checkpoints use the unmodified tokenizer
- Switch ErnieModelLoader to AutoTokenizer; eliminates the tokenization-logger suppress workaround
- Suppress httpx INFO logs; hub 1.16 uses httpx internally and logs every HTTP request

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
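
For readers unfamiliar with the last item, this is roughly what suppressing httpx request logs looks like with the standard library. It is a minimal sketch, not the code from the PR: the choice of WARNING as the threshold is an assumption, and the PR may place the call elsewhere or use a different level.

```python
import logging

# httpx emits one INFO record per HTTP request on the "httpx" logger.
# Raising that logger's threshold hides those records while still letting
# warnings and errors through. WARNING is an assumed level, not taken from the PR.
httpx_logger = logging.getLogger("httpx")
httpx_logger.setLevel(logging.WARNING)

# INFO records are now filtered out, WARNING and above still pass.
info_enabled = httpx_logger.isEnabledFor(logging.INFO)
warning_enabled = httpx_logger.isEnabledFor(logging.WARNING)
```

This only touches the named logger, so log output from other libraries is unaffected.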

AmaelG · 2026-06-01T12:34:34Z

Unsure if this is in scope for this PR, but I think it would be useful to expose an optional toggle to train Anima's llm_adapter.
From my testing with multi-concept training, training the llm adapter seems to improve concept adherence, converge faster, and reach lower loss/val.
I know tdrussel recommends to avoid training it, but in my experiments I have not seen obvious degradation of general knowledge, while the trained concepts became more reliable.

dxqb · 2026-06-02T19:53:33Z

> Unsure if this is in scope for this PR, but I think it would be useful to expose an optional toggle to train Anima's llm_adapter. From my testing with multi-concept training, training the llm adapter seems to improve concept adherence, converge faster, and reach lower loss/val. I know tdrussel recommends to avoid training it, but in my experiments I have not seen obvious degradation of general knowledge, while the trained concepts became more reliable.

Training text components should be a thing of the past. It has always been a crutch for diffusion models that weren't very capable yet. So I'm hesitant to reintroduce it, with all the problems that come with it (such as multiple learning rates and many more failure modes).
If there is strong community support showing that this is needed, maybe, but if even the model's creator advises against it...

torch._dynamo.config overrides are thread-local. The existing call in checkpointing_util runs in the main thread and is invisible to the training thread spawned by the UI. This caused compiled optimizers (e.g. AdamW_adv with compiled_optimizer=True) to hit the default recompile_limit of 8 and abort with FailOnRecompileLimitHit when training models with more than 8 distinct parameter shapes.

Fix: call init_compile() from GenericTrainer.__init__, which runs in whichever thread/process owns training (UI thread, CLI main thread, or torch.multiprocessing.spawn subprocess for multi-GPU).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
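
The failure mode described in this commit message can be reproduced without torch. The sketch below uses a hypothetical `ThreadLocalConfig` class built on `threading.local` as a stand-in; it is not how torch implements its config, and the value 64 set by the stand-in `init_compile()` is an assumption chosen only for illustration.

```python
import threading


class ThreadLocalConfig(threading.local):
    # Class attribute: the default every thread sees until it sets its own value.
    recompile_limit = 8


# Hypothetical stand-in for a thread-local config object (NOT torch._dynamo.config).
config = ThreadLocalConfig()


def init_compile() -> None:
    # Stand-in for the real init_compile(); 64 is an assumed value.
    config.recompile_limit = 64


def run_in_thread(fn):
    """Run fn in a new thread and return its result."""
    result = []
    thread = threading.Thread(target=lambda: result.append(fn()))
    thread.start()
    thread.join()
    return result[0]


# Old behaviour: the override happens once, in the main thread only.
init_compile()
main_value = config.recompile_limit
worker_without_init = run_in_thread(lambda: config.recompile_limit)


def train():
    # New behaviour: the thread that owns training applies the override itself.
    init_compile()
    return config.recompile_limit


worker_with_init = run_in_thread(train)

print(main_value, worker_without_init, worker_with_init)
```

The worker thread only sees the raised limit when it calls `init_compile()` itself, which is the reason for moving the call into the trainer's constructor.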

dxqb · 2026-06-04T18:35:02Z

torch._dynamo.exc.FailOnRecompileLimitHit: Hard failure due to fullgraph=True

Fixed by #1495, which has been merged into this PR.

FuouM · 2026-06-05T08:59:28Z

Thank you for your great work!

I've been training Anima LoRAs in OneTrainer and using the same dataset/settings in Kohya sd-scripts. Training itself works fine, but LoRAs saved from OneTrainer (preview branch) don't load correctly in ComfyUI when paired with the standard single-checkpoint Anima model (anima-base-v1.0.safetensors). I believe this is due to the LoRA key naming on export.

Example key:

sd-scripts: lora_unet_blocks_0_self_attn_q_proj.lora_down.weight
OneTrainer: transformer.transformer_blocks.0.attn1.to_q.lora_down.weight

Other model types in OneTrainer already handle this via convert_*_lora.py key sets (e.g. Flux, HiDream, SD3). Anima's AnimaLoRASaver and AnimaLoRALoader both return None from _get_convert_key_sets(), so no conversion runs on save or load.

Diffusers has the inverse mapping in _convert_non_diffusers_anima_lora_to_diffusers() (lora_conversion_utils.py), which lines up with the rename table already documented in AnimaModel.py (diffusers_to_original()).

I prototyped the conversion script below. It may be missing some mappings, as I haven't tested it exhaustively yet:

# convert_anima_lora.py
from modules.util.convert.lora.convert_lora_util import LoraConversionKeySet


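def __map_anima_blocks(parent: LoraConversionKeySet) -> list[LoraConversionKeySet]:
    # One key set per transformer block. range(100) looks like a generous upper
    # bound on block indices rather than the exact block count.
    return [LoraConversionKeySet(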
        omi_prefix=f"blocks.{i}",
        diffusers_prefix=f"transformer_blocks.{i}",
        legacy_diffusers_prefix=f"blocks_{i}",
        parent=parent,
        next_omi_prefix=f"blocks.{i + 1}",
        next_diffusers_prefix=f"transformer_blocks.{i + 1}",
    ) for i in range(100)]


def __map_transformer_block(key_prefix: LoraConversionKeySet) -> list[LoraConversionKeySet]:
    # Each tuple is (original/OMI name, diffusers name, legacy flattened name).
    mappings = [
        ("self_attn.q_proj", "attn1.to_q", "self_attn_q_proj"),
        ("self_attn.k_proj", "attn1.to_k", "self_attn_k_proj"),
        ("self_attn.v_proj", "attn1.to_v", "self_attn_v_proj"),
        ("self_attn.output_proj", "attn1.to_out.0", "self_attn_output_proj"),
        ("cross_attn.q_proj", "attn2.to_q", "cross_attn_q_proj"),
        ("cross_attn.k_proj", "attn2.to_k", "cross_attn_k_proj"),
        ("cross_attn.v_proj", "attn2.to_v", "cross_attn_v_proj"),
        ("cross_attn.output_proj", "attn2.to_out.0", "cross_attn_output_proj"),
        ("mlp.layer1", "ff.net.0.proj", "mlp_layer1"),
        ("mlp.layer2", "ff.net.2", "mlp_layer2"),
    ]

    return [
        LoraConversionKeySet(omi, diffusers, legacy_diffusers_prefix=legacy, parent=key_prefix)
        for omi, diffusers, legacy in mappings
    ]


def convert_anima_lora_key_sets() -> list[LoraConversionKeySet]:
    keys = []

    transformer = LoraConversionKeySet(
        "lora_unet",
        "transformer",
        legacy_diffusers_prefix="lora_unet",
    )

    for block_prefix in __map_anima_blocks(transformer):
        keys += __map_transformer_block(block_prefix)

    return keys

After converting, the output in ComfyUI seems to be affected by the LoRA as expected.
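
To make the rename concrete without OneTrainer's `LoraConversionKeySet` machinery, here is a standalone sketch that maps a OneTrainer-style key to the sd-scripts-style key using the module table from the prototype above. The function name `to_sd_scripts_key` is hypothetical, and the `lora_up.weight` and `alpha` suffixes are assumptions (only `lora_down.weight` appears in the example keys); it is not the conversion code used by this PR.

```python
import re

# Diffusers-style module path -> flattened sd-scripts module name,
# copied from the mapping table in the prototype above.
MODULE_NAMES = {
    "attn1.to_q": "self_attn_q_proj",
    "attn1.to_k": "self_attn_k_proj",
    "attn1.to_v": "self_attn_v_proj",
    "attn1.to_out.0": "self_attn_output_proj",
    "attn2.to_q": "cross_attn_q_proj",
    "attn2.to_k": "cross_attn_k_proj",
    "attn2.to_v": "cross_attn_v_proj",
    "attn2.to_out.0": "cross_attn_output_proj",
    "ff.net.0.proj": "mlp_layer1",
    "ff.net.2": "mlp_layer2",
}

# Assumption: every LoRA tensor name ends in one of these three suffixes.
KEY_PATTERN = re.compile(
    r"^transformer\.transformer_blocks\.(\d+)\.(.+?)\."
    r"(lora_down\.weight|lora_up\.weight|alpha)$"
)


def to_sd_scripts_key(key: str) -> str:
    """Rename one key; keys that do not match the table are returned unchanged."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return key
    block, module, suffix = match.groups()
    flat_name = MODULE_NAMES.get(module)
    if flat_name is None:
        return key
    return f"lora_unet_blocks_{block}_{flat_name}.{suffix}"


converted = to_sd_scripts_key(
    "transformer.transformer_blocks.0.attn1.to_q.lora_down.weight"
)
print(converted)
```

For the OneTrainer example key shown earlier, this produces the sd-scripts example key. Unknown keys pass through untouched, so any layers missing from the table would stay in Diffusers naming rather than being dropped.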

dxqb · 2026-06-05T17:19:38Z

> I've been training Anima LoRAs in OneTrainer and using the same dataset/settings in Kohya sd-scripts. Training itself works fine, but LoRAs saved from OneTrainer (preview branch) don't load correctly in ComfyUI when paired with the standard single-checkpoint Anima model (anima-base-v1.0.safetensors). I believe this is due to the LoRA key naming on export.

Comfy-Org/ComfyUI#14182

Silvicultor · 2026-06-05T19:14:39Z

> I've been training Anima LoRAs in OneTrainer and using the same dataset/settings in Kohya sd-scripts. Training itself works fine, but LoRAs saved from OneTrainer (preview branch) don't load correctly in ComfyUI when paired with the standard single-checkpoint Anima model (anima-base-v1.0.safetensors). I believe this is due to the LoRA key naming on export.

> Comfy-Org/ComfyUI#14182

It doesn't look like Comfyanon wants to merge this one. Also keep in mind that other inference tools (at least the ones that aren't built on Diffusers) would have to make the same change to their code. So I'd say OneTrainer should include the conversion logic proposed above in its code and settle for the de facto standard that is already established. I know OT wants to use Diffusers keys whenever possible for consistency, and that's perfectly fine for models like Flux or Qwen, whose original repos are in Diffusers format, but this is a special case. The initial Anima release wasn't in Diffusers format, so it's hard to argue for the Diffusers keys.

dxqb · 2026-06-05T19:20:00Z

> I've been training Anima LoRAs in OneTrainer and using the same dataset/settings in Kohya sd-scripts. Training itself works fine, but LoRAs saved from OneTrainer (preview branch) don't load correctly in ComfyUI when paired with the standard single-checkpoint Anima model (anima-base-v1.0.safetensors). I believe this is due to the LoRA key naming on export.

> Comfy-Org/ComfyUI#14182

> Doesn't look like Comfyanon wants to merge this one

If that is the case, they should close the PR. As for the other points, we already had this discussion on Discord.

dxqb · 2026-06-05T19:23:20Z

By the way, this PR already includes conversion code: https://github.com/dxqb/OneTrainer/blob/03b7156c49bac5c8f5a8f13de259357f94047d75/modules/model/AnimaModel.py#L31

It's just not used for LoRAs currently (only for full finetunes), because this was the consistent and accepted way to do things for all other models. If inference tools want to change that now, they should make that clear (by closing the PR, for example).

Audit fixes applied during merge:
- ModelType.py: register ANIMA in _MODEL_PARTS and supported_training_methods()
- AnimaModel.py: add missing release() abstract method
- BaseAnimaSetup.py: per-component checkpointing, 3/4-arg autocast helpers, release()-based prepare_text_caching
- Anima{FineTune,LoRA}Setup.py: latent_caching -> image_caching/text_caching
- BaseModelTabView/BaseTrainingTabView/TopBarController/BaseConvertModelUIView: wire up Anima UI
- test/run_lora_presets.sh: add Anima LoRA preset

Silvicultor · 2026-06-21T11:01:22Z

I did a lot of testing with the current version of the PR over the last 2 weeks. Overall, Anima runs very well and is stable in OT.
What I tested:
- Normal LoRA training works
- Anima LoKr works
- Masked training works fine, except that normalizing the mask area loss can sometimes cause NaNs
- Torch compile (transformer blocks + optimizer) is fully functional after the workaround code was added
- Output quality of the LoRAs is similar to what other training tools produce (e.g. sd-scripts)

So hoping to see this in the master branch soon! The only thing from my perspective that is left to overcome is the LoRA key issue, but I know it’s being discussed right now. Until then I convert manually to Comfy format.

dxqb and others added 9 commits March 25, 2026 00:39

torch 2.11

0c869b8

Merge branch 'master' into torch_2_11

5a8745c

Upgrade torch to 2.12.0 (CUDA 13.0 / ROCm 7.2)

6913fc6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Switch nccl pin to cu13 (matches torch 2.12+cu130 dependency)

6e4c2e5

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Update triton-windows to 3.7.0.post26 (matches triton 3.7.0)

b3c6462

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge branch 'transformers5' into anima_base

8c10d1c

Merge branch 'upstream' into anima_base

477f940

anima: Anima model support (LoRA + Fine-Tune)

ea59c93

Co-Authored-By: dxqb <183307934+dxqb@users.noreply.github.com>

dxqb force-pushed the anima branch from 9b933bb to ea59c93 May 30, 2026 08:55

dxqb added the "preview" label (merged in the preview branch) May 30, 2026

dxqb added a commit to TheForgotten69/OneTrainer that referenced this pull request Jun 3, 2026

Merge PR Nerogar#1487 (Anima) into preview

3fc99bd

dxqb and others added 2 commits June 4, 2026 20:28

Merge branch 'fix/compiled-optimizer-thread-local' into anima

67169e8

dxqb mentioned this pull request Jun 4, 2026

[Feat]: Anima support #1278

Open

dxqb linked an issue Jun 4, 2026 that may be closed by this pull request

[Feat]: Anima support #1278

Open

Merge branch 'master' into anima

03b7156

dxqb added a commit that referenced this pull request Jun 4, 2026

Merge PR #1487 (Anima) into preview

41024fb

dxqb changed the title ~~Anima~~ New model: Anima Jun 6, 2026

dxqb added a commit that referenced this pull request Jun 14, 2026

Merge PR #1487 (New model: Anima) into preview

af0f4b5

Merge remote-tracking branch 'Nerogar/master' into anima_base

2cf5e71

# Conflicts: # requirements-rocm.txt

dxqb added 7 commits June 18, 2026 01:27

Merge branch 'anima_base' into anima

7f3af9e

# Conflicts: # modules/util/compile_util.py

Use decorator form for factory.register() in Anima model files

f51551d

Mirrors upstream commit 75a44d2, which converted the rest of the codebase from the trailing factory.register() call to the @factory.register decorator form.

Merge branch 'anima' of origin into anima

2a133b4

# Conflicts: # modules/modelLoader/mixin/HFModelLoaderMixin.py

Merge branch 'master' into anima

2ad36a8

Merge remote-tracking branch 'Nerogar/master' into anima_base

8dc10b2

# Conflicts: # modules/modelLoader/mixin/HFModelLoaderMixin.py # requirements-global.txt

Merge branch 'anima_base' into anima

4fe6158

Merge remote-tracking branch 'origin/anima' into anima

3e84d01

New model: Anima #1487: dxqb wants to merge 20 commits into Nerogar:master from dxqb:anima

4 participants
