Prompt weighting provides a way to emphasize or de-emphasize certain parts of a prompt, allowing for more control over the generated image. A prompt can include several concepts, which get turned into contextualized text embeddings. The embeddings are used by the model to condition its cross-attention layers to generate an image (read the Stable Diffusion [blog post](https://huggingface.co/blog/stable_diffusion) to learn more about how it works).

- Prompt weighting works by increasing or decreasing the scale of the text embedding vector that corresponds to its concept in the prompt because you may not necessarily want the model to focus on all concepts equally. The easiest way to prepare the prompt-weighted embeddings is to use [Compel](https://github.com/damian0815/compel), a text prompt-weighting and blending library. Once you have the prompt-weighted embeddings, you can pass them to any pipeline that has a [`prompt_embeds`](https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/text2img#diffusers.StableDiffusionPipeline.__call__.prompt_embeds) (and optionally [`negative_prompt_embeds`](https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/text2img#diffusers.StableDiffusionPipeline.__call__.negative_prompt_embeds)) parameter, such as [`StableDiffusionPipeline`], [`StableDiffusionControlNetPipeline`], and [`StableDiffusionXLPipeline`].
+ Prompt weighting works by increasing or decreasing the scale of the text embedding vector that corresponds to its concept in the prompt because you may not necessarily want the model to focus on all concepts equally. The easiest way to prepare the prompt embeddings is to use [Stable Diffusion Long Prompt Weighted Embedding](https://github.com/xhinker/sd_embed) (sd_embed). Once you have the prompt-weighted embeddings, you can pass them to any pipeline that has a [prompt_embeds](https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/text2img#diffusers.StableDiffusionPipeline.__call__.prompt_embeds) (and optionally [negative_prompt_embeds](https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/text2img#diffusers.StableDiffusionPipeline.__call__.negative_prompt_embeds)) parameter, such as [`StableDiffusionPipeline`], [`StableDiffusionControlNetPipeline`], and [`StableDiffusionXLPipeline`].
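
As a rough illustration of the scaling idea described above (not how sd_embed or Compel implement it), the sketch below multiplies each token's embedding vector by a per-token weight. The function name `apply_token_weights` and the toy two-dimensional vectors are hypothetical; real text encoders produce much larger tensors, and libraries may apply further steps such as renormalization.

```python
from typing import List


def apply_token_weights(
    embeddings: List[List[float]], weights: List[float]
) -> List[List[float]]:
    """Scale each token's embedding vector by that token's weight.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per token embedding")
    return [[w * x for x in vec] for vec, w in zip(embeddings, weights)]


# Toy embeddings for the tokens "a", "red", "cat", "ball" (made-up numbers).
embeddings = [[0.5, -1.0], [1.0, 2.0], [0.0, 4.0], [2.0, -2.0]]

# Upweight only the last token ("ball") by 1.5; leave the others unchanged.
weights = [1.0, 1.0, 1.0, 1.5]

weighted = apply_token_weights(embeddings, weights)
print(weighted[3])
```

Only the vector for the upweighted token changes; tokens with a weight of 1.0 pass through untouched.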

<Tip>

If your favorite pipeline doesn't have a `prompt_embeds` parameter, please open an [issue](https://github.com/huggingface/diffusers/issues/new/choose) so we can add it!

</Tip>

- This guide will show you how to weight and blend your prompts with Compel in 🤗 Diffusers.
+ This guide will show you how to weight your prompts with sd_embed.

- Before you begin, make sure you have the latest version of Compel installed:
+ Before you begin, make sure you have the latest version of sd_embed installed:
You'll notice there is no "ball" in the image! Let's use compel to upweight the concept of "ball" in the prompt. Create a [`Compel`](https://github.com/damian0815/compel/blob/main/doc/compel.md#compel-objects) object, and pass it a tokenizer and text encoder:
compel uses `+` or `-` to increase or decrease the weight of a word in the prompt. To increase the weight of "ball":
+ To upweight or downweight a concept, surround the text with parentheses. Adding more parentheses applies a heavier weight to the text. You can also append a numerical multiplier to the text to indicate how much to increase or decrease its weight.

- <Tip>
-
- `+` corresponds to the value `1.1`, `++` corresponds to `1.1^2`, and so on. Similarly, `-` corresponds to `0.9` and `--` corresponds to `0.9^2`. Feel free to experiment with adding more `+` or `-` in your prompt!
+ | format | multiplier |
+ |---|---|
+ |`(hippo)`| increase by 1.1x |
+ |`((hippo))`| increase by 1.21x |
+ |`(hippo:1.5)`| increase by 1.5x |
+ |`(hippo:0.5)`| decrease by 2x |
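
The multipliers in the table above follow simple arithmetic: each pair of parentheses multiplies the weight by 1.1, and an explicit number sets the weight directly. The sketch below is a hypothetical helper (`multiplier` is not part of sd_embed) that reproduces that arithmetic for a single fragment. Treating extra parentheses around an explicit number as additional 1.1 factors is an assumption, not something the table states.

```python
import re

BASE = 1.1  # weight applied per pair of parentheses, as listed in the table


def multiplier(fragment: str) -> float:
    """Return the weight multiplier for one fragment such as '((hippo))'.

    Hypothetical illustration: handles a single fragment only, not a full prompt.
    """
    text = fragment.strip()
    depth = 0
    # Peel off matching outer parentheses and count them.
    while text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
        depth += 1
    explicit = re.fullmatch(r"(.+):([0-9]*\.?[0-9]+)", text)
    if explicit and depth >= 1:
        # Assumption: the number replaces the innermost pair's 1.1 factor.
        return float(explicit.group(2)) * BASE ** (depth - 1)
    return BASE ** depth


for fragment in ["hippo", "(hippo)", "((hippo))", "(hippo:1.5)", "(hippo:0.5)"]:
    print(fragment, round(multiplier(fragment), 4))
```

A multiplier below 1.0, such as 0.5, downweights the concept, while an unparenthesized fragment keeps the neutral weight of 1.0.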

- </Tip>
+ Create a prompt and use a combination of parentheses and numerical multipliers to upweight various text.

```py
- prompt = "a red cat playing with a ball++"
+ from sd_embed.embedding_funcs import get_weighted_text_embeddings_sdxl
+
+ prompt = """A whimsical and creative image depicting a hybrid creature that is a mix of a waffle and a hippopotamus.
+ This imaginative creature features the distinctive, bulky body of a hippo,
+ but with a texture and appearance resembling a golden-brown, crispy waffle.
+ The creature might have elements like waffle squares across its skin and a syrup-like sheen.
+ It's set in a surreal environment that playfully combines a natural water habitat of a hippo with elements of a breakfast table setting,
+ possibly including oversized utensils or plates in the background.
+ The image should evoke a sense of playful absurdity and culinary fantasy.
```
Use the `get_weighted_text_embeddings_sdxl` function to generate the prompt embeddings and the negative prompt embeddings. It also generates the pooled and negative pooled prompt embeddings since you're using the SDXL model.
You can also create a weighted *blend* of prompts by adding `.blend()` to a list of prompts and passing it some weights. Your blend may not always produce the result you expect because it breaks some assumptions about how the text encoder functions, so just have fun and experiment with it!
+ > [!TIP]
+ > You can safely ignore the error message below about the token index length exceeding the model's maximum sequence length. All your tokens will be used in the embedding process.
+ >
+ > ```
+ > Token indices sequence length is longer than the specified maximum sequence length for this model
+ > ```

```py
- prompt_embeds = compel_proc('("a red cat playing with a ball", "jungle").blend(0.7, 0.8)')
```
A conjunction diffuses each prompt independently and concatenates their results by their weighted sum. Add `.and()` to the end of a list of prompts to create a conjunction:
-
- ```py
- prompt_embeds = compel_proc('["a red cat", "playing with a", "ball"].and()')
> Refer to the [sd_embed](https://github.com/xhinker/sd_embed) repository for additional details about long prompt weighting for FLUX.1, Stable Cascade, and Stable Diffusion 1.5.

### Textual inversion

@@ -363,35 +326,63 @@ Create a pipeline and use the [`~loaders.TextualInversionLoaderMixin.load_textua
```py
import torch
from diffusers import StableDiffusionPipeline
- from compel import Compel, DiffusersTextualInversionManager
```
Compel provides a `DiffusersTextualInversionManager` class to simplify prompt weighting with textual inversion. Instantiate `DiffusersTextualInversionManager` and pass it to the `Compel` class:
+ Add the `<midjourney-style>` text to the prompt to trigger the textual inversion.
Create a `Compel` class with a tokenizer and text encoder, and pass your prompt to it. Depending on the model you use, you'll need to incorporate the model's unique identifier into your prompt. For example, the `dndcoverart-v1` model uses the identifier `dndcoverart`:
Stable Diffusion XL (SDXL) has two tokenizers and text encoders, so its usage is a bit different. To address this, you should pass both tokenizers and encoders to the `Compel` class:
+ Depending on the model you use, you'll need to incorporate the model's unique identifier into your prompt. For example, the `dndcoverart-v1` model uses the identifier `dndcoverart`:
This time, let's upweight "ball" by a factor of 1.5 for the first prompt, and downweight "ball" by 0.6 for the second prompt. The [`StableDiffusionXLPipeline`] also requires [`pooled_prompt_embeds`](https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/stable_diffusion_xl#diffusers.StableDiffusionXLInpaintPipeline.__call__.pooled_prompt_embeds) (and optionally [`negative_pooled_prompt_embeds`](https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/stable_diffusion_xl#diffusers.StableDiffusionXLInpaintPipeline.__call__.negative_pooled_prompt_embeds)) so you should pass those to the pipeline along with the conditioning tensors:
-
- ```py
- # apply weights
- prompt = ["a red cat playing with a (ball)1.5", "a red cat playing with a (ball)0.6"]
- conditioning, pooled = compel(prompt)
-
- # generate image
- generator = [torch.Generator().manual_seed(33) for _ in range(len(prompt))]