
New Model: Ideogram 4 #1522

Draft
dxqb wants to merge 20 commits into Nerogar:master from dxqb:ideogram


@dxqb dxqb commented Jun 14, 2026

Collaborator

Test in preview branch: https://github.com/Nerogar/OneTrainer/tree/preview

Summary


  • Adds support for Ideogram 4, including model loading/saving, data loading, sampling, and training setups for both the LoRA and Fine Tune methods.
  • The unconditional transformer can optionally be loaded for sampling.
  • Dynamic timestep shifting follows the Ideogram shifting schedule, which differs from the one used for Flux.
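
For readers unfamiliar with timestep shifting, here is a minimal sketch of the generic resolution-dependent form used by flow-matching samplers. It is not the Ideogram schedule implemented in this PR, which is not reproduced here. The function names and all default constants are illustrative assumptions.

```python
import math


def shift_timestep(t: float, shift: float) -> float:
    """Warp a timestep t in [0, 1]; shift > 1 pushes t toward 1 (more noise)."""
    return shift * t / (1.0 + (shift - 1.0) * t)


def dynamic_shift(
    seq_len: int,
    base_seq_len: int = 256,   # illustrative value, not taken from this PR
    max_seq_len: int = 4096,   # illustrative value, not taken from this PR
    base_shift: float = 0.5,   # illustrative value, not taken from this PR
    max_shift: float = 1.15,   # illustrative value, not taken from this PR
) -> float:
    """Interpolate a log-shift linearly in the image token count, return the shift."""
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    mu = base_shift + slope * (seq_len - base_seq_len)
    return math.exp(mu)


# Larger images (more tokens) get a larger shift, so sampling spends
# more of its steps at high noise levels.
shifted = shift_timestep(0.5, dynamic_shift(seq_len=4096))
```

A model-specific schedule would replace `dynamic_shift` (and possibly the warp itself) with its own mapping from resolution to shift.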

Test plan

  • pre-commit run --all-files passes
  • Launched the affected UI or script and exercised the change
  • Tested with at least one real preset / config when relevant (note which: Ideogram both)

AI assistance

  • AI-assisted — I have read every line in this diff and can defend each change

dxqb and others added 12 commits May 25, 2026 18:11
…model composition in ModelType

- Gradient checkpointing and layer offloading are now configured per component
  (text encoder, transformer, VAE) rather than globally
- ModelType centralizes model composition and training method associations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Several model savers (Ernie, Flux2, Z-Image, ...) duplicate the same
deepcopy + tokenizer __deepcopy__ workaround to produce a dtype-converted
copy of a diffusers pipeline for saving. Extract it into a shared helper
so new savers can reuse it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds a tokenizer_attrs parameter (default ("tokenizer",)) so savers with
extra/different tokenizer attributes (Flux's tokenizer_2, SD3's
tokenizer_3, HiDream's tokenizer_3/tokenizer_4) can use the same helper.
Replaces the duplicated deepcopy + tokenizer __deepcopy__ workaround in
Chroma, Ernie, Flux, Flux2, HiDream, HunyuanVideo, PixArtAlpha, Qwen,
Sana, StableDiffusion3 and Z-Image with calls to the shared helper.
No behavior change.
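
A minimal sketch of what such a shared helper might look like, assuming the workaround consists of a temporary instance-level `__deepcopy__` that makes `copy.deepcopy` reuse the tokenizer instead of copying it. The helper name `copy_pipeline_for_saving` and the stand-in classes are hypothetical, and the dtype conversion step is omitted so that the sketch runs without torch or diffusers.

```python
import copy


def copy_pipeline_for_saving(pipeline, tokenizer_attrs=("tokenizer",)):
    """Deep-copy a pipeline, sharing the listed tokenizers instead of copying them.

    Hypothetical helper; the real one in the PR also converts the copy's dtype.
    """
    patched = []
    for name in tokenizer_attrs:
        tokenizer = getattr(pipeline, name, None)
        if tokenizer is None or any(tokenizer is p for p in patched):
            continue
        # copy.deepcopy looks up __deepcopy__ on the instance, so this makes
        # the copy reference the very same tokenizer object.
        tokenizer.__deepcopy__ = lambda memo, tok=tokenizer: tok
        patched.append(tokenizer)
    try:
        return copy.deepcopy(pipeline)
    finally:
        # Always remove the temporary hook, even if the copy fails.
        for tokenizer in patched:
            del tokenizer.__deepcopy__


class FakeTokenizer:  # stand-in for a real tokenizer
    pass


class FakePipeline:  # stand-in for a diffusers pipeline
    def __init__(self):
        self.tokenizer = FakeTokenizer()
        self.tokenizer_2 = FakeTokenizer()
        self.weights = [1.0, 2.0]


original = FakePipeline()
saved = copy_pipeline_for_saving(original, tokenizer_attrs=("tokenizer", "tokenizer_2"))
```

Passing the attribute names as a tuple is what lets savers with one, two, or four tokenizers share the same code path.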

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Move the per-callsite checkpointing_or_offloading_enabled() guard into
enable_checkpointing() itself, so every Base*Setup can call
enable_checkpointing_for_* unconditionally. Also extend the central gate
to allow a compile-only path (no checkpointing/offloading, but still
per-layer torch.compile wrapping) when config.compile is set.
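
The central gate described above could be sketched roughly as follows. This is an illustration of the decision logic only; `ComponentSettings`, `select_wrapping`, and the returned mode strings are hypothetical names, not the identifiers used in the PR.

```python
from dataclasses import dataclass


@dataclass
class ComponentSettings:  # hypothetical per-component settings
    gradient_checkpointing: bool = False
    layer_offload_fraction: float = 0.0


def select_wrapping(settings: ComponentSettings, compile_enabled: bool) -> str:
    """Decide how a component's layers are wrapped; callers need no guard of their own."""
    if settings.gradient_checkpointing or settings.layer_offload_fraction > 0.0:
        return "checkpointing"   # checkpointing and/or offloading wrappers
    if compile_enabled:
        return "compile_only"    # per-layer compile wrapping, nothing else
    return "none"                # leave the component untouched
```

Because the gate lives in one place, every setup class can call it unconditionally and still get a no-op when nothing is enabled.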

Three direct diffusers enable_gradient_checkpointing() calls (SD/SDXL
unet, Wuerstchen v2 prior) keep their explicit guard since they bypass
this central mechanism.

Ports the Ideogram 4 image generation model into OneTrainer, including
model loading/saving, data loading, sampling, training setups for LoRA
and Fine Tune, and corresponding UI and preset additions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dxqb added the preview label ("merged in the preview branch") on Jun 14, 2026
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dxqb dxqb mentioned this pull request Jun 14, 2026
3 tasks
@BobJohnson24


I think enabling CFG sampling with the conditional model when the unconditional model is not loaded is a reasonable choice, and it does not need to be gated behind CFG 1.0. I'm not sure whether posting this here or on Discord is better; it's just something I encountered and had an opinion on.

@dxqb

dxqb commented Jun 14, 2026

Collaborator Author

The idea isn't to gate it arbitrarily: an unconditional prediction is needed for CFG > 1.0.
Are you proposing that using the conditional transformer to get the unconditional prediction works well? With an empty prompt?

@BobJohnson24

BobJohnson24 commented Jun 14, 2026


Yeah, sorry, I should have been clearer about this. Using the conditional model as the unconditional one with an empty prompt is indeed an approach that works "reasonably" well, i.e. good enough for fast sample previews, and that is what I would like to propose.
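
The proposal can be sketched as standard classifier-free guidance with a fallback. This is a toy illustration, not the PR's sampler: `cfg_predict` is a hypothetical name, and the call signatures of both models (in particular that the dedicated unconditional model takes only the latent) are assumptions.

```python
from typing import Callable, List, Optional


def cfg_predict(
    latent: List[float],
    prompt: str,
    cfg_scale: float,
    cond_model: Callable[[List[float], str], List[float]],
    uncond_model: Optional[Callable[[List[float]], List[float]]] = None,
) -> List[float]:
    """Classifier-free guidance with an empty-prompt fallback for the unconditional branch."""
    cond = cond_model(latent, prompt)
    if cfg_scale == 1.0:
        return cond  # no guidance, so no unconditional prediction is needed
    if uncond_model is not None:
        uncond = uncond_model(latent)      # dedicated unconditional transformer
    else:
        uncond = cond_model(latent, "")    # fallback: conditional model, empty prompt
    # Standard CFG combination: uncond + scale * (cond - uncond)
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
```

The fallback costs a second forward pass through the conditional transformer per step, but avoids keeping a second transformer in memory.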

@dxqb

dxqb commented Jun 14, 2026

Collaborator Author

OK, will add that. But also consider just loading the unconditional transformer if you have the RAM.
With the new PR that sets different offloading fractions for different components, you can train the conditional transformer efficiently with no offloading and then use maximum offloading on the unconditional transformer, which is only used during sampling.
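
As a rough illustration of that setup, assuming an offloading fraction between 0.0 and 1.0 per component: the dictionary keys, the layer count, and the `layers_to_offload` helper below are all hypothetical and do not reflect OneTrainer's actual config schema.

```python
# Hypothetical per-component offloading fractions (0.0 = none, 1.0 = maximum).
offload_fractions = {
    "transformer": 0.0,                # trained component: keep every layer on the GPU
    "unconditional_transformer": 1.0,  # sampling only: offload as much as possible
}


def layers_to_offload(num_layers: int, fraction: float) -> int:
    """Number of layers kept off the GPU for a given offloading fraction."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    return round(num_layers * fraction)


NUM_LAYERS = 40  # made-up layer count for illustration
plan = {name: layers_to_offload(NUM_LAYERS, f) for name, f in offload_fractions.items()}
```

With these values, training speed is unaffected by offloading, and the extra transformer only costs time while samples are generated.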

@BobJohnson24


Good point, I am not that short on RAM. Personally, I just didn't know whether the LoRA would be applied to the unconditional transformer or not, and was trying to play it safe with sampling.

dxqb added a commit that referenced this pull request Jun 14, 2026
@dxqb

dxqb commented Jun 14, 2026

Collaborator Author

The LoRA is currently not applied to the unconditional transformer for sampling. Maybe it should be: on Discord there was a good example showing that this is helpful (almost necessary, according to that sample).

@dxqb dxqb linked an issue Jun 15, 2026 that may be closed by this pull request
dxqb added 4 commits June 18, 2026 01:24
# Conflicts:
#	modules/modelSetup/BaseErnieSetup.py
#	modules/modelSetup/BaseWuerstchenSetup.py
#	modules/util/checkpointing_util.py
#	training_presets/#flux2 LoRA 8GB.json
# Conflicts:
#	modules/ui/TimestepDistributionWindow.py
#	modules/ui/TrainingTab.py
Mirrors upstream commit 75a44d2, which converted the rest of the
codebase from the trailing factory.register() call to the @factory.register
decorator form.
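
For illustration, the difference between the two registration styles can be sketched with a toy registry. The `Registry` class, its string keys, and the example classes are hypothetical; the actual signature of OneTrainer's factory.register is not shown in this PR.

```python
class Registry:
    """Toy class registry supporting both the trailing-call and decorator forms."""

    def __init__(self):
        self._classes = {}

    def register(self, key, cls=None):
        def decorator(target):
            self._classes[key] = target
            return target  # return the class unchanged so it stays usable

        if cls is not None:
            return decorator(cls)  # trailing-call form: registry.register("k", Cls)
        return decorator           # decorator form: @registry.register("k")

    def create(self, key, *args, **kwargs):
        return self._classes[key](*args, **kwargs)


registry = Registry()


class OldStyleSetup:
    pass


registry.register("old", OldStyleSetup)  # registration far from the class header


@registry.register("new")                # registration sits right on the class
class NewStyleSetup:
    pass
```

The decorator form keeps the registration next to the class definition, so it is harder to forget when a new class is added.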
dxqb added a commit that referenced this pull request Jun 19, 2026
dxqb added a commit that referenced this pull request Jun 19, 2026
…xt caching, autocast arg count)

Audit per WORKFLOW_PREVIEW.md checklist after merging #1522: IdeogramModel still
used the old to() API instead of release(), three files still referenced
config.latent_caching instead of image_caching/text_caching, and
BaseIdeogramSetup still called create_autocast_context/disable_fp16_autocast_context
with the old weight-list argument. Also adds the Ideogram LoRA preset to
run_lora_presets.sh, which was missing.

Labels

preview merged in the preview branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feat]: Support for Ideogram 4

2 participants