on-demand loading of text encoders by dxqb · Pull Request #1509 · Nerogar/OneTrainer

dxqb · 2026-06-06T11:45:34Z

Summary

Text encoders mostly sit in RAM and are moved to VRAM only for caching and sampling.
This PR introduces a mechanism that avoids keeping the text encoder loaded at all and instead loads it directly from disk onto the GPU whenever it is needed.
The Lens model needs this because it doesn't seem to be possible to move the quantized GPT-OSS encoder between CPU and GPU: microsoft/Lens#11

It might also be useful for other models (to save RAM), but this PR doesn't implement it for any other models.

includes #1476

Test plan

pre-commit run --all-files passes
Launched the affected UI or script and exercised the change
Tested with at least one real preset / config when relevant (note which: Lens)

AI assistance

Early AI prototype — opened for discussion, not ready for review

…model composition in ModelType
- Gradient checkpointing and layer offloading are now configured per component (text encoder, transformer, VAE) rather than globally
- ModelType centralizes model composition and training method associations
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Introduces OnDemandModule, a persistent delegating proxy for text encoders that must be loaded on demand and freed after use rather than parked on the CPU temp device. Adds load_on_demand per-component config and four text_encoder_N_on_demand() resolvers in TrainConfig.
BaseModel.to(device) is removed as an abstract method; release() is now the sole abstract method for parking a model. Each concrete model reads self.train_config.temp_device directly. Call sites in modelSetup, dataLoader, trainer, and SampleWindow are updated to model.release().
Co-Authored-By: dxqb <183307934+dxqb@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
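To illustrate the delegating-proxy idea described in this commit, here is a minimal sketch in plain Python. It is not the PR's OnDemandModule: the class name OnDemandProxy, the loader callable, and FakeEncoder are all hypothetical stand-ins, and a real implementation would load weights from disk straight onto the GPU inside the loader.

```python
from typing import Any, Callable, List, Optional


class OnDemandProxy:
    """Hypothetical proxy: builds the wrapped object on first use, drops it on release()."""

    def __init__(self, loader: Callable[[], Any]):
        # `loader` is assumed to construct the real object directly on its target device.
        self._loader = loader
        self._module: Optional[Any] = None

    @property
    def is_loaded(self) -> bool:
        return self._module is not None

    def load(self) -> Any:
        # Load lazily and reuse the instance until release() is called.
        if self._module is None:
            self._module = self._loader()
        return self._module

    def release(self) -> None:
        # Drop the only reference so the memory can be reclaimed.
        self._module = None

    def __getattr__(self, name: str) -> Any:
        # Reached only when normal lookup fails, i.e. for attributes of the wrapped object.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self.load(), name)

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.load()(*args, **kwargs)


class FakeEncoder:
    """Stand-in for a text encoder; returns one number per hidden unit."""

    hidden_size = 8

    def __call__(self, text: str) -> List[int]:
        return [len(text)] * self.hidden_size


load_calls: List[str] = []


def load_encoder() -> FakeEncoder:
    # Record each load so callers can see when the encoder is rebuilt.
    load_calls.append("load")
    return FakeEncoder()


proxy = OnDemandProxy(load_encoder)
```

Callers hold on to `proxy` for the whole run; calling it or reading an attribute triggers a load, and `proxy.release()` frees the wrapped object until the next use.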

Several models' release() forwarded self.train_config.temp_device (a str) directly to *_to() methods typed as device: torch.device. This crashes inside LayerOffloadConductor.to() when layer/block-swap offloading is enabled, since it accesses device.type. nn.Module.to() tolerates str, so the bug was latent for runs without offloading enabled.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
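The failure mode described in this commit can be reproduced in a few lines. The sketch below is an assumption about the shape of the fix, not the PR's code: the helper name park is hypothetical, and it simply normalizes the argument with torch.device() before anything reads device.type.

```python
from typing import Union

import torch


def park(module: torch.nn.Module, device: Union[str, torch.device]) -> torch.device:
    """Hypothetical helper: normalize `device`, then move `module` to it."""
    # torch.device() accepts either a str or an existing torch.device.
    device = torch.device(device)
    # Code that inspects device.type is safe once the value is a torch.device.
    _ = device.type
    module.to(device)
    return device


# A plain str has no `.type` attribute, which is what breaks code expecting torch.device.
str_has_type = hasattr("cpu", "type")
parked_on = park(torch.nn.Linear(2, 2), "cpu")
```

Normalizing once at the boundary keeps the downstream methods free to rely on their `device: torch.device` annotation.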

Resolves conflict in the Flux2 LoRA 8GB preset: keeps this branch's per-component offload_fraction scheme and drops the superseded top-level gradient_checkpointing/layer_offload_fraction fields, while picking up master's dynamic_timestep_shifting addition.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…) rename
The rename to release() in this PR accidentally dropped the eval() call that used to follow to(temp_device) before caching and before sampling. Without it, the model stays in train() mode during in-training sampling, which breaks models whose forward pass branches on self.training (e.g. HiDream's unpatchify).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
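As a small illustration of why the missing eval() call matters, the sketch below uses a toy module whose forward pass branches on self.training. The Branching class is hypothetical and far simpler than HiDream's unpatchify; it only shows that the same input produces different outputs in the two modes.

```python
import torch


class Branching(torch.nn.Module):
    """Toy module (hypothetical) whose output depends on train/eval mode."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Training path doubles the input; inference path returns it unchanged.
        return x * 2 if self.training else x


model = Branching()
x = torch.ones(3)

# Modules start in train() mode, so the training branch runs here.
train_out = model(x)

# Switching to eval() before sampling selects the inference branch.
model.eval()
with torch.no_grad():
    eval_out = model(x)
```

If sampling runs without the `model.eval()` step, the training branch is used, which is the kind of mismatch the commit message describes.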

# Conflicts:
# modules/modelSetup/BaseErnieSetup.py
# modules/modelSetup/BaseWuerstchenSetup.py
# modules/util/checkpointing_util.py

dxqb and others added 4 commits May 25, 2026 18:11

Merge branch 'master' into split-offload

5a41835

Merge branch 'upstream' into split-offload

0b4ddc4

dxqb mentioned this pull request Jun 6, 2026

New model: Microsoft Lens #1510

Draft

5 tasks

dxqb added the "preview" label (merged in the preview branch) Jun 13, 2026

dxqb added a commit that referenced this pull request Jun 14, 2026

Merge PR #1509 (Refactor for ondemand loading) into preview

33386ae

dxqb mentioned this pull request Jun 17, 2026

Upgrade transformers to 5.5.4 and huggingface-hub to 1.16 #1524

Merged

3 tasks

dxqb and others added 2 commits June 17, 2026 20:48

dxqb mentioned this pull request Jun 17, 2026

HiDream LoRA training crashes during sampling: tensor shape mismatch #1541

Closed

dxqb and others added 2 commits June 18, 2026 00:24

Merge remote-tracking branch 'Nerogar/master' into ondemand-base

5564662

# Conflicts:
# modules/modelSetup/BaseErnieSetup.py
# modules/modelSetup/BaseWuerstchenSetup.py
# modules/util/checkpointing_util.py

dxqb added a commit that referenced this pull request Jun 19, 2026

Merge PR #1509 (on-demand loading of text encoders) into preview

3b7f241

dxqb wants to merge 8 commits into Nerogar:master from dxqb:ondemand-base
