[Refactor] How attention is set in 3D UNet blocks #6893


Open · wants to merge 3 commits into main

Conversation

@DN6 DN6 (Collaborator) commented Feb 7, 2024

What does this PR do?

There's some confusion around how num_attention_heads is used in the 3D UNet and its blocks. This PR attempts to fix the issue by correcting the values passed as num_attention_heads.

Related:
#6872
#6873


Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

# If `num_attention_heads` is not defined (which is the case for most models)
# it will default to `attention_head_dim`. This looks weird upon first reading it and it is.
# The reason for this behavior is to correct for incorrectly named variables that were introduced
# when this library was created. The incorrect naming was only discovered much later in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131
# Changing `attention_head_dim` to `num_attention_heads` for 40,000+ configurations is too backwards breaking
# which is why we correct for the naming here.
num_attention_heads = num_attention_heads or attention_head_dim
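To make the mix-up concrete, here is a minimal sketch (with made-up channel widths and head dimension, not values from any shipped config) of what the fallback does today versus what the PR moves towards: the legacy path reuses the head dimension as the head count, while the corrected path derives a per-block head count from the output channels.

```python
# Hypothetical values for illustration only.
block_out_channels = (320, 640, 1280, 1280)
attention_head_dim = 64
num_attention_heads = None  # most existing configs never set this

# Legacy fallback: the head *dimension* silently becomes the head *count*.
legacy_num_heads = num_attention_heads or attention_head_dim
print(legacy_num_heads)  # 64, regardless of how wide each block is

# Corrected behaviour sketched in this PR: derive a per-block head count so
# each head keeps a dimension of `attention_head_dim`.
corrected_num_heads = [ch // attention_head_dim for ch in block_out_channels]
print(corrected_num_heads)  # [5, 10, 20, 20]
```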
Member

Also remove or update the comment above, assuming num_attention_heads is replicated in the hub configs. Do we know how many models like https://huggingface.co/ali-vilab/i2vgen-xl/blob/6c4e9e70bdcd36eb59d98d2b583adea0813ea8de/unet/config.json#L21 need to be updated?

Will we need to live forever with duplicated property names in the hub?

@yiyixuxu yiyixuxu (Collaborator) Feb 7, 2024

Just this one model, and it came out only a few days ago.
IMO it's an edge case we don't mind breaking; it will only affect people who want to use the local copy, no?

@yiyixuxu yiyixuxu (Collaborator) left a comment

I like this idea! But I think it's not backward-compatible for CrossAttnDownBlock3D, CrossAttnUpBlock3D, UNetMidBlock3DCrossAttn, get_up_block and get_down_block.

Do you have a good idea to get around that?

num_attention_heads,
out_channels // num_attention_heads,
Collaborator

CrossAttnDownBlock3D is part of our public API and this is a breaking change, no?

@DN6 DN6 (Collaborator, Author) Feb 7, 2024

Sorry, why isn't it backwards compatible? None of the args in the class init are being changed, right?

The 3D blocks and the 3D UNet are only used with the Text to Video Synth and I2VGenXL model in the library.

Collaborator Author

Same with get_up_block and get_down_block. We're only changing the number being passed to the num_attention_heads argument.

Collaborator

Changing the meaning of an argument is breaking, no?

for example,

CrossAttnDownBlock3D(... num_attention_heads = 64) is currently expected to create attentions with head_dim=64; with this code change, it will create attentions with 64 heads instead
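Roughly, the concern can be illustrated like this (purely illustrative numbers, not actual block defaults):

```python
out_channels = 1280
value = 64  # what a downstream caller passes as `num_attention_heads`

# Before this PR: the 3D blocks effectively treat the value as the head dimension.
heads_before = out_channels // value   # 20 heads of dim 64
# After this PR: the value is honoured as the head count.
heads_after = value                    # 64 heads of dim 20
print(heads_before, heads_after)
```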

@yiyixuxu yiyixuxu (Collaborator) Feb 7, 2024

The 3D blocks and the 3D UNet are only used with the Text to Video Synth and I2VGenXL model in the library.

Yes, but it is our public API and we have to assume it's been used outside of the library.

@DN6 DN6 (Collaborator, Author) Feb 7, 2024

I think this might be a relatively safe change. Searching GitHub public repos for an import of these blocks from the public API doesn't return any results. I actually don't think you can import them directly from diffusers:

https://github.com/search?q=%22from+diffusers.models+import+CrossAttnDownBlock3D%22+language:Python+&type=code
https://github.com/search?q=%22from+diffusers+import+CrossAttnUpBlock3D%22+language:Python+&type=code
https://github.com/search?q=%22from+diffusers+import+UNetMidBlock3DCrossAttn%22+language:Python+&type=code

It looks like more often than not people redefine the blocks themselves:
https://github.com/search?q=%22CrossAttnDownBlock3D%22+language:Python+&type=code&p=5

@yiyixuxu yiyixuxu (Collaborator) Feb 7, 2024

cc @pcuenca and @patrickvonplaten here.
I would like to hear your thoughts about when we can make breaking changes (other than v1.0.0).

Personally, I think we should only make breaking changes when we don't have another choice, or we know super confidently it is an edge case (e.g., if we just added these blocks yesterday, I would think it's ok to break here).
I think in this case,
(1) we do not have to make these changes: we want to make these changes to make our code more readable and easier for contributors to contribute, but it is not a must and this is not the only way to go
(2) we don't really have a way to find out about its usage outside GitHub

Also, I think a breaking change is somewhat more acceptable if we are able to throw an error. In this case, it would just break silently, so IMO it is worse.

But I'm curious about your thoughts on this and I'm cool with it if you all feel strongly about making this change here :)

@patrickvonplaten patrickvonplaten (Contributor) Feb 9, 2024

python -c "from diffusers import CrossAttnDownBlock3D"

doesn't work, so strictly speaking CrossAttnDownBlock3D is not considered part of the public API. Also, I don't think it's used that much, so IMO it's ok to change it here (while keeping in mind that this might lead to breaking changes depending on how CrossAttnDownBlock3D is imported).

@@ -132,13 +132,19 @@ def __init__(
"At the moment it is not possible to define the number of attention heads via `num_attention_heads` because of a naming issue as described in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131. Passing `num_attention_heads` will only be supported in diffusers v0.19."
)

if isinstance(attention_head_dim, int):
num_attention_heads = [out_channels // attention_head_dim for out_channels in block_out_channels]
@patrickvonplaten patrickvonplaten (Contributor) Feb 9, 2024

That's a good idea.

Are we sure though that this is always correct? Does out_channels always represent the hidden_dim of the attention layer?

Collaborator Author

For the 3D UNets this is safe. There are a limited number of blocks used with this model (`CrossAttnDownBlock3D`, `CrossAttnUpBlock3D` and `UNetMidBlock3DCrossAttn`) and they all configure `num_attention_heads` based on the `out_channels`.
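As a sanity check of that claim, the relationship being relied on is just `out_channels == num_attention_heads * attention_head_dim`; a quick sketch with assumed widths:

```python
# Assumed channel widths and head dimension, for illustration only.
attention_head_dim = 64
for out_channels in (320, 640, 1280, 1280):
    num_attention_heads = out_channels // attention_head_dim
    # Holds as long as out_channels is divisible by attention_head_dim,
    # i.e. as long as out_channels really is the attention hidden dim.
    assert num_attention_heads * attention_head_dim == out_channels
```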

@@ -132,13 +132,19 @@ def __init__(
"At the moment it is not possible to define the number of attention heads via `num_attention_heads` because of a naming issue as described in https://github.com/huggingface/diffusers/issues/2011#issuecomment-1547958131. Passing `num_attention_heads` will only be supported in diffusers v0.19."
Contributor

I would prefer to try to remove this statement.

Comment on lines -497 to +498
out_channels // num_attention_heads,
num_attention_heads,
out_channels // num_attention_heads,
Contributor

Let's use keyword arguments here when correcting it.
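A toy sketch of why keywords help at this call site (the stand-in function and parameter names below are assumed for illustration; the real block factory takes many more arguments):

```python
# Stand-in for the block factory, just to show the readability difference.
def make_block(num_attention_heads, attention_head_dim):
    return f"{num_attention_heads} heads x {attention_head_dim} dims"

out_channels, num_attention_heads = 1280, 20

# Positional call: swapping the two expressions is an easy, silent mistake.
print(make_block(num_attention_heads, out_channels // num_attention_heads))

# Keyword call: the intent of each value is explicit at the call site.
print(make_block(
    num_attention_heads=num_attention_heads,
    attention_head_dim=out_channels // num_attention_heads,
))
```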

github-actions bot commented Mar 9, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Mar 9, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Mar 9, 2024
@yiyixuxu yiyixuxu (Collaborator) commented Mar 9, 2024

@DN6 let's finish this up soon :)

github-actions bot commented Apr 3, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Apr 3, 2024
@DN6 DN6 removed the stale Issues that haven't received updates label Apr 8, 2024
@DN6 DN6 marked this pull request as ready for review April 8, 2024 05:35
@DN6 DN6 (Collaborator, Author) commented Apr 10, 2024

@yiyixuxu Anything specific you wanted to address in this current PR?

@DN6 DN6 (Collaborator, Author) commented Apr 10, 2024

Oh wait. Nvm I haven't addressed those initial comments. Will wrap this up.

github-actions bot commented May 5, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label May 5, 2024
@DN6 DN6 removed the stale Issues that haven't received updates label May 6, 2024
github-actions bot commented Sep 14, 2024
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024
@yiyixuxu yiyixuxu added refactor and removed stale Issues that haven't received updates labels Sep 17, 2024
github-actions bot commented Oct 12, 2024
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Oct 12, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Oct 15, 2024
github-actions bot commented Nov 8, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Nov 8, 2024
Labels: refactor, stale (Issues that haven't received updates)
5 participants