[docs] Caching methods #11625

Merged
merged 2 commits into huggingface:main Jun 2, 2025

Conversation

stevhliu
Member

Make the caching docs more visible and give a little more context behind the methods.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@stevhliu stevhliu requested review from a-r-r-o-w and sayakpaul May 28, 2025 20:30
Member

@a-r-r-o-w a-r-r-o-w left a comment

Thanks for looking into it @stevhliu! Just some comments regarding the correctness of the explanations, plus some more technical details (which are fine to skip if you think they put too much burden on the user).


## FasterCache

[FasterCache](https://huggingface.co/papers/2410.19355) computes and caches attention features at every other timestep instead of directly reusing cached features because it can cause flickering or blurry details in the generated video. The features from the skipped step are calculated from the difference between the adjacent cached features.
Member

Suggested change
[FasterCache](https://huggingface.co/papers/2410.19355) computes and caches attention features at every other timestep instead of directly reusing cached features because it can cause flickering or blurry details in the generated video. The features from the skipped step are calculated from the difference between the adjacent cached features.
[FasterCache](https://huggingface.co/papers/2410.19355) caches and reuses attention features in a similar manner to PAB, as output differences in successive timesteps of the generation process are small. Additionally, when using classifier-free guidance for sampling (commonly used in most base models), FasterCache may choose to skip the unconditional branch prediction entirely and estimate it from the conditional branch prediction if there is significant redundancy in the predicted latent outputs between successive timesteps.
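
For context, a minimal sketch of how FasterCache can be enabled through the diffusers caching API this doc describes (the pipeline, `FasterCacheConfig` fields, and values below are illustrative, not a canonical recommendation):

```python
import torch
from diffusers import CogVideoXPipeline, FasterCacheConfig

# Load a video pipeline whose transformer supports the caching hooks
pipeline = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipeline.to("cuda")

# Reuse attention outputs across nearby timesteps and, under classifier-free
# guidance, skip the unconditional branch on steps where latents are redundant.
# The skip ranges and weights here are example values, not tuned settings.
config = FasterCacheConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(-1, 681),
    unconditional_batch_skip_range=5,
    unconditional_batch_timestep_skip_range=(-1, 781),
    attention_weight_callback=lambda _: 0.3,
    current_timestep_callback=lambda: pipeline.current_timestep,
    tensor_format="BFCHW",
)
pipeline.transformer.enable_cache(config)

video = pipeline("A cat playing piano", num_frames=49).frames[0]
```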

Member

Cc: @sunovivid you might be interested.

Member

@sayakpaul sayakpaul left a comment

Looks good to me. Thanks to @a-r-r-o-w for the suggestions as well. I would be in favor of keeping the technical details.

Two things:

  • Include a table to report timing and memory numbers so that users can know the trade-offs (can happen in a follow-up PR).
  • If we have info on whether the caching methods are generally model-agnostic, having an explicit note about it would be useful.

@stevhliu
Member Author

stevhliu commented Jun 2, 2025

Thanks for the reviews!

Happy to include a table in follow-up if someone can provide me with the timing and memory numbers (or the code to generate that)!
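
(If it helps, here is a rough sketch of the kind of script that could produce those numbers, using plain PyTorch timing and peak-memory counters; the pipeline, prompt, and config below are placeholders rather than the exact benchmark setup:)

```python
import time
import torch
from diffusers import CogVideoXPipeline, FasterCacheConfig

def benchmark(pipeline, prompt):
    """Run one generation and return wall-clock seconds and peak GPU memory in GB."""
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    start = time.perf_counter()
    pipeline(prompt, num_frames=49, num_inference_steps=50)
    torch.cuda.synchronize()
    return time.perf_counter() - start, torch.cuda.max_memory_allocated() / 1024**3

pipeline = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
prompt = "A cat playing piano"

# Baseline without any caching
base_time, base_mem = benchmark(pipeline, prompt)

# With FasterCache enabled (illustrative config, not tuned)
pipeline.transformer.enable_cache(
    FasterCacheConfig(
        spatial_attention_block_skip_range=2,
        current_timestep_callback=lambda: pipeline.current_timestep,
    )
)
cache_time, cache_mem = benchmark(pipeline, prompt)

print("| method      | time (s) | peak memory (GB) |")
print(f"| baseline    | {base_time:8.1f} | {base_mem:16.2f} |")
print(f"| FasterCache | {cache_time:8.1f} | {cache_mem:16.2f} |")
```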

Member

@sayakpaul sayakpaul left a comment

The comments I brought up can definitely be included in the follow-up.

@stevhliu stevhliu merged commit 9f48394 into huggingface:main Jun 2, 2025
1 check passed
@stevhliu stevhliu deleted the cache branch June 2, 2025 17:58
@DN6 DN6 added the roadmap Add to current release roadmap label Jun 5, 2025
@DN6 DN6 moved this from In Progress to Done in Diffusers Roadmap 0.34 Jun 5, 2025