Possible artificing happening for long prompt and combining prompts #6930

Scorpinaus · 2024-02-10T02:34:50Z

Hello diffusers team,

I saw this issue by hako-mikan: AUTOMATIC1111/stable-diffusion-webui#14874, and their civitai report: Analyzing and Addressing Artifacts in Web-UI with XL-models.

In summary, does diffusers follow a generation process similar to the A1111 webui, where a normalization step is applied after the prompt emphasis is calculated? This also seems to be exclusive to SDXL (at least in the A1111 webui), but I'm unsure what effect it has in diffusers, since the method there may be different.

Credits to @hako-mikan for the interesting find!

asomoza (Maintainer) · 2024-02-10T05:25:17Z

Hi, diffusers doesn't do either of them. By default, diffusers doesn't apply prompt weighting or support long prompts; it's up to the user to enable them.

One popular library for prompt weighting is compel, so you can ask there.

Another good solution is the community pipeline lpw_stable_diffusion_xl.

That pipeline doesn't use torch.mean or the same operation as automatic1111, so it shouldn't be affected. Also, the equivalent of the prompt the article uses for testing doesn't produce a corrupt image with animagine.
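As a rough illustration of what long-prompt support involves, here is a minimal sketch of splitting a token sequence so that each piece fits CLIP's 77-token window (75 usable tokens plus start and end markers). The function name `chunk_token_ids` is hypothetical, padding with the end token is an assumption, and the marker ids are assumed to be those of the CLIP tokenizer. It is not taken from compel or from the lpw_stable_diffusion_xl pipeline, which may handle this differently.

```python
def chunk_token_ids(token_ids, bos_id, eos_id, chunk_size=75):
    """Split token ids into windows of `chunk_size` tokens.

    Each window is wrapped in a start and an end marker, and short
    windows are padded with the end marker (an assumption; real
    pipelines may pad differently).
    """
    chunks = []
    for start in range(0, max(len(token_ids), 1), chunk_size):
        body = token_ids[start:start + chunk_size]
        padding = [eos_id] * (chunk_size - len(body))
        chunks.append([bos_id] + body + padding + [eos_id])
    return chunks


# 180 stand-in token ids; a real prompt would come from a tokenizer.
ids = list(range(1, 181))
chunks = chunk_token_ids(ids, bos_id=49406, eos_id=49407)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk would then typically be encoded on its own and the resulting embeddings concatenated, which is why some strategy is needed for emphasis weights that span chunk boundaries.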

spezialspezial · 2024-02-13T19:38:58Z

The StableDiffusionLongPromptWeightingPipeline does something similar, just not at the fragment level but for the whole embedding:
"Also, to regularize of the embedding, the weighted embedding would be scaled to preserve the original mean." See line 356.

asomoza (Maintainer) · 2024-02-14T02:29:56Z

This is what I see at line 356 (edit: I found you were referring to the SD 1.5 pipeline):

diffusers/examples/community/lpw_stable_diffusion_xl.py, line 356 in 0ca7b68:

    weight_tensor = torch.tensor(prompt_weight_groups[i], dtype=torch.float16, device=device)

Still, it will always have to do something similar, since you need some kind of strategy to fit a longer prompt into 75 tokens. Going by the article, though, the problem seems to be this operation done in automatic1111:

z = z * (original_mean / new_mean)

This is not done in the SDXL version of the pipeline, since it seems to be a problem only with SDXL and not with SD 1.5.

I must admit I haven't gone too deep into this, since I don't understand why anyone would play the lottery with SDXL and prompt with 300 tags to generate something when SDXL understands you if you just write what you want.
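To make the rescale above concrete, here is a minimal sketch on toy numbers in plain Python. The helper names (`mean_of`, `apply_weights`, `rescale_to_original_mean`) are hypothetical and the values are not real SDXL embeddings. The second case illustrates one way such a rescale could misbehave: when the weighted mean lands close to zero or changes sign, the ratio becomes large or negative. Whether that is the exact cause of the artifacts in the article is an assumption here, not something this thread establishes.

```python
def mean_of(z):
    """Mean over every value in a list of token vectors."""
    values = [x for token in z for x in token]
    return sum(values) / len(values)


def apply_weights(z, weights):
    """Scale each token vector by its emphasis weight."""
    return [[w * x for x in token] for token, w in zip(z, weights)]


def rescale_to_original_mean(z_weighted, z_original):
    """Apply z = z * (original_mean / new_mean) and return the scale too."""
    scale = mean_of(z_original) / mean_of(z_weighted)
    return [[x * scale for x in token] for token in z_weighted], scale


# Case 1: mean well away from zero. The scale is a mild shrink.
z1 = [[1.0, 2.0], [3.0, 4.0]]
out1, scale1 = rescale_to_original_mean(apply_weights(z1, [1.0, 1.5]), z1)

# Case 2: mean close to zero. Weighting flips the sign of the mean,
# so the scale is negative and the whole embedding is inverted.
z2 = [[1.0, -0.5], [-1.0, 0.6]]
out2, scale2 = rescale_to_original_mean(apply_weights(z2, [1.0, 1.4]), z2)

print(scale1, scale2)
```

In both cases the overall mean is restored, but in the second case that comes at the cost of flipping the sign of every value, which is the kind of side effect a mean-preserving rescale cannot rule out on its own.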

spezialspezial · 2024-02-14T09:35:22Z

Ah, it seems you are right. I didn't expect this part to change between the SD1 and XL pipelines, as it alters outcomes a lot. I just remembered it following auto1111 when I was looking at it.

Whether one should be 75-token-frugal with prompts is a philosophical question. Great start for a discussion but I'll see myself out on that. :)
