Stable Diffusion XL for Gaudi #619
Conversation
Hi @dsocek, can you start to check the comments? Thanks!
Signed-off-by: Daniel Socek <[email protected]>
Updated schedulers to support cases like image-to-image generation with different initial timesteps.
Signed-off-by: Daniel Socek <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
Also fixes a bug when the number of generated images is not divisible by the batch size.
Signed-off-by: Daniel Socek <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@regisss could you also please provide your insights on this PR?
Signed-off-by: Daniel Socek <[email protected]>
Very clean PR! I left a few minor comments.
Have you benchmarked it a bit and got some throughput numbers?
model_cpu_offload_seq = "text_encoder->text_encoder_2->unet->vae"
_optional_components = ["tokenizer", "tokenizer_2", "text_encoder", "text_encoder_2"]
_callback_tensor_inputs = [
    "latents_batch",
    "text_embeddings_batch",
    "add_text_embeddings_batch",
    "add_time_ids_batch",
]
We can remove these lines as the class already inherits them from StableDiffusionXLPipeline
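(For illustration only, a minimal sketch of the inheritance behavior in question; the class bodies are simplified placeholders, not the real definitions:)

class StableDiffusionXLPipeline:
    # Simplified stand-in for the diffusers base class: class attributes
    # defined here are visible on any subclass that does not override them.
    model_cpu_offload_seq = "text_encoder->text_encoder_2->unet->vae"
    _optional_components = ["tokenizer", "tokenizer_2", "text_encoder", "text_encoder_2"]

class GaudiStableDiffusionXLPipeline(StableDiffusionXLPipeline):
    pass  # no redefinition needed; attribute lookup falls through to the parent

assert GaudiStableDiffusionXLPipeline.model_cpu_offload_seq == "text_encoder->text_encoder_2->unet->vae"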
Fixed. I also tried to handle the other callback inputs more correctly (maybe a similar fix is needed in the SD pipeline? There is some pairing that takes place which is not addressed in the original Gaudi SD pipe).
Hmm, not sure I got it: it doesn't recognize the tensors to callback properly? And so you modified the default value in __call__, right?
I am not 100% confident how this should be handled. Initially, what I tried was to define _callback_tensor_inputs directly on the current derived class:
_callback_tensor_inputs = [
"latents_batch",
"text_embeddings_batch",
"add_text_embeddings_batch",
"add_time_ids_batch",
]
This way, these inputs can be popped from the callback stack when a callback occurs.
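(For illustration, a minimal sketch of how a user callback would consume these inputs under the diffusers callback_on_step_end convention; the function name and the edit shown are hypothetical:)

def on_step_end(pipeline, step, timestep, callback_kwargs):
    # Every name listed in _callback_tensor_inputs is exposed through
    # callback_kwargs; the callback may inspect or modify the tensors and
    # return them to be used for the next denoising step.
    latents = callback_kwargs["latents_batch"]
    callback_kwargs["latents_batch"] = latents * 1.0  # hypothetical no-op edit
    return callback_kwargs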
Now, in the current PR version, instead of that approach I pass the callback input tensors from the base class and then adjust them by doing the pairing (via torch.cat) to align with the batched structures created by _split_inputs_into_batches inside the Gaudi class:
Callback input tensors from the base class:
[optimum/habana/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py]
Line 310:
callback_on_step_end_tensor_inputs: List[str] = [
"latents",
"prompt_embeds",
"negative_prompt_embeds",
"add_text_embeds",
"add_time_ids",
"negative_pooled_prompt_embeds",
"negative_add_time_ids",
],
and then pairing them manually when they are popped from the callback stack:
[optimum/habana/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py]
Lines 722-734
I am not sure what the best/correct implementation for this part would be.
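(In spirit, the pairing does something like this; a minimal sketch with a hypothetical helper name, assuming the usual [negative, positive] concatenation order used for classifier-free guidance:)

import torch

def pair_callback_tensors(callback_kwargs):
    # Recombine the separate tensors exposed by the base-class callback
    # convention into the concatenated layout that the batched structures
    # produced by _split_inputs_into_batches expect.
    neg = callback_kwargs.pop("negative_prompt_embeds")
    pos = callback_kwargs.pop("prompt_embeds")
    callback_kwargs["prompt_embeds"] = torch.cat((neg, pos))
    return callback_kwargs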
I am also confused about the original Stable Diffusion pipeline for Gaudi. If you look at how it's implemented, there are also inherited callback input tensors:
https://github.com/huggingface/optimum-habana/blob/main/optimum/habana/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L222
callback_on_step_end_tensor_inputs: List[str] = ["latents"],
However, in lines 474-476 it pops both latents and prompt_embeds. Here, prompt_embeds is popped into text_embeddings_batch, which expects a concatenated version (see https://github.com/huggingface/optimum-habana/blob/main/optimum/habana/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L193):
https://github.com/huggingface/optimum-habana/blob/main/optimum/habana/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L474
That looks kind of fishy, unless I am completely missing something obvious :)
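(A minimal sketch of the suspected mismatch, with illustrative Stable Diffusion-like shapes:)

import torch

# Under classifier-free guidance the pipeline works with
# torch.cat([uncond, cond]), which has twice the batch dimension of
# prompt_embeds alone; the shapes below are illustrative.
prompt_embeds = torch.zeros(2, 77, 768)   # conditional half only
uncond_embeds = torch.zeros(2, 77, 768)
text_embeddings = torch.cat([uncond_embeds, prompt_embeds])  # shape (4, 77, 768)
# Popping prompt_embeds (batch 2) into a slot expecting the concatenated
# form (batch 4) would silently lose the unconditional half.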
Yes, indeed, we benchmarked it on a single Gaudi2 HPU. Here is a snapshot:
[throughput benchmark snapshot]
Signed-off-by: Daniel Socek <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
Add negative_prompt_embeds and negative_pooled_prompt_embeds check for SDXL Turbo.
Co-authored-by: regisss <[email protected]>
LGTM!