Add RAM Pressure cache mode #10454
base: master
Conversation
Currently the UI cache runs parallel to the output cache and is expected to be a content superset of it. At the same time, the two caches are maintained completely separately, which makes it awkward to free output cache content without changing the behaviour of the UI cache.

There are two actual users (getters) of the UI cache. The first is a direct content hit on the output cache when executing a node; this case is handled very naturally by merging the UI and output caches. The second is the history JSON generation at the end of the prompt, which currently works by asking the cache for all_node_ids and then pulling the cache contents for those nodes, where all_node_ids is the set of nodes in the dynamic prompt.

So fold the UI cache into the output cache. The current UI cache setter now writes to a prompt-scope dict. When the output cache is set, the value is fetched from that dict and tupled up with the outputs. When generating the history, simply iterate the prompt-scope dict.

This prepares support for more complex caching strategies (like RAM pressure caching) where less than one full workflow will be cached and it is desirable to keep the UI cache and output cache in sync.
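A minimal sketch of the shape of this change, assuming illustrative names (`PromptExecution`, `set_ui`, `set_output`, `ui_outputs`) rather than the actual ComfyUI identifiers:

```python
# Sketch only: illustrative names, not the actual ComfyUI implementation.

class PromptExecution:
    def __init__(self, output_cache):
        self.output_cache = output_cache
        # Prompt-scope dict: node_id -> UI payload produced while this prompt runs.
        self.ui_outputs = {}

    def set_ui(self, node_id, ui_value):
        # The old UI-cache setter now just records into the prompt-scope dict.
        self.ui_outputs[node_id] = ui_value

    def set_output(self, node_id, outputs):
        # When the output cache is written, pick up the UI value (if any)
        # and store it alongside the outputs as one tuple.
        ui_value = self.ui_outputs.get(node_id)
        self.output_cache.set(node_id, (outputs, ui_value))

    def build_history(self):
        # History JSON no longer walks all_node_ids against a separate UI cache;
        # it just iterates the prompt-scope dict.
        return dict(self.ui_outputs)
```

Because the UI value travels with the outputs in a single cache entry, evicting an output cache entry can no longer leave a stale UI cache entry behind.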
Implement a cache sensitive to RAM pressure. When RAM headroom drops below a certain threshold, evict RAM-expensive nodes from the cache. Models and tensors are measured directly for RAM usage, and an OOM score is computed from the RAM usage of each node. Note that due to indirection through shared objects (like a model patcher), multiple nodes can account the same RAM as their individual usage. The intent is that this frees chains of nodes, particularly model loaders and their associated LoRAs, since they all score similarly and therefore sort close to each other. The result is a bias towards unloading model nodes mid-flow while still being able to keep results like text encodings and the VAE.
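A rough sketch of the eviction idea, assuming hypothetical helper names and using psutil to measure headroom (the PR's actual measurement and scoring code will differ):

```python
# Sketch only: hypothetical helpers, not the PR's actual implementation.
import psutil
import torch

def node_ram_bytes(outputs):
    """Roughly sum the CPU RAM held by tensors reachable from a node's cached outputs."""
    seen, total = set(), 0

    def walk(obj):
        nonlocal total
        if isinstance(obj, torch.Tensor):
            if obj.device.type == "cpu":
                storage = obj.untyped_storage()
                if storage.data_ptr() not in seen:
                    seen.add(storage.data_ptr())
                    total += storage.nbytes()
        elif isinstance(obj, (list, tuple)):
            for item in obj:
                walk(item)
        elif isinstance(obj, dict):
            for item in obj.values():
                walk(item)

    walk(outputs)
    return total

def evict_under_pressure(cache, headroom_bytes):
    """When available RAM drops below the headroom target, evict the most
    RAM-expensive cached nodes first until headroom is restored."""
    if psutil.virtual_memory().available >= headroom_bytes:
        return
    # Nodes sharing objects (e.g. a model patcher) account the same RAM, so a
    # loader and its LoRAs score similarly and end up adjacent in this ordering.
    scored = sorted(cache.items(), key=lambda kv: node_ram_bytes(kv[1]), reverse=True)
    for node_id, outputs in list(scored):
        del cache[node_id]
        if psutil.virtual_memory().available >= headroom_bytes:
            break
```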
@guill hey, would you be able to take a look and see whether some of the changes (like the UI cache being removed) seem all good?
I tried this and it seems to have some problems with either graph expansion or subgraphs. After adding some debug logging and a dumb workaround to force the execution to continue, it logs this: Node 322:166 is my PCLazyLoRALoader node that dynamically expands into a LoRALoader. With the workaround it fails later with a NoneType error because the output of the node becomes None. I'll try to see if I can get a simpler workflow to fail.
No subgraphs involved; it fails even with this simple workflow ( Just using LazyLoRALoader with a Preview Any node for the model output doesn't seem to be enough to trigger it, so I'm not sure what exactly the problem is.
Implement RAM Pressure cache
Example:
Linux, 64GB RAM system, RTX3060
WAN I2V template with FP16 Models
python main.py --novram --cache-ram 32.0
NOTE: You want to set the headroom significantly greater than your largest model (a sketch of how this headroom check works is shown below the example).
At this point in time, it is running the low-noise model of WAN I2V after having evicted the high-noise one.
After running another trivial workflow and then returning to the WAN workflow, it recommences at the model loading and is able to use the cached TE and VAE results. The first nodes run are the UNETLoader (for high noise) -> Lora -> ModelSampling -> KSampler.
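For context on the `--cache-ram 32.0` flag used above, a small sketch of how a GiB headroom value might translate into the pressure check (function names are illustrative, not the PR's actual code):

```python
# Sketch only: converting a GiB-valued headroom flag into the pressure check.
import psutil

def headroom_bytes_from_gb(cache_ram_gb: float) -> int:
    """Translate e.g. --cache-ram 32.0 into a byte threshold."""
    return int(cache_ram_gb * (1024 ** 3))

def under_ram_pressure(headroom: int) -> bool:
    """True when available system RAM has fallen below the headroom target,
    i.e. when the cache should start evicting RAM-expensive nodes."""
    return psutil.virtual_memory().available < headroom

threshold = headroom_bytes_from_gb(32.0)
print("evict now?", under_ram_pressure(threshold))
```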