Denoising SDXL iteration images for coherent image previews that a user could understand #7001

Open
@MonkeeMan1

Hello,

I'm currently trying to create image previews with SDXL. This works, but the output images are very noisy. A long time ago I found a solution to this for SD 1.5, but unfortunately it has been lost to time.

How would I go about denoising these images so they are more coherent to a human viewer? I know the first couple of iterations will always be very noisy, but at some point it should be possible to turn the noisy output into a blurry image that a person could understand.

import os
import time

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"

# Make sure the output directory for the per-step previews exists.
os.makedirs("./imgs", exist_ok=True)

def callback(pipe, step_index, timestep, callback_kwargs):
    # Latents after the current denoising step; these still contain noise.
    latents = callback_kwargs["latents"]

    start_time = time.time()
    with torch.no_grad():
        # Upcast the VAE, then match the latents to the dtype of its input layer.
        pipe.upcast_vae()
        latents = latents.to(
            next(iter(pipe.vae.post_quant_conv.parameters())).dtype)
        # Undo the latent scaling and decode to image space.
        images = pipe.vae.decode(
            latents / pipe.vae.config.scaling_factor, return_dict=False)[0]
        images = pipe.image_processor.postprocess(images, output_type='pil')

        images[0].save(f"./imgs/{step_index}.png")

    end_time = time.time()

    print(f"Time taken to generate image: {end_time - start_time} seconds")

    return callback_kwargs

pipe(prompt=prompt, callback_on_step_end=callback)
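
For context on why the decoded previews look noisy: the callback above decodes the latents as they are at that step, noise included. One common idea, sketched here only as an assumption and not as something the pipeline exposes through `callback_kwargs`, is to decode an estimate of the clean sample instead. If the noisy latents follow x_t = x_0 + sigma * eps (a sigma-parameterized, epsilon-prediction setup), the estimate is x_t - sigma * eps_hat. The helper name `predict_denoised` and the toy values are hypothetical. How to get the noise prediction and sigma inside the callback is not shown.

```python
import random


def predict_denoised(noisy_latents, noise_pred, sigma):
    """Estimate the clean sample from a noisy one.

    Assumes x_t = x_0 + sigma * eps (sigma-parameterized noise with
    epsilon prediction). Uses only subtraction and multiplication, so it
    accepts plain floats as well as array or tensor types.
    """
    return noisy_latents - sigma * noise_pred


# Toy check with scalars standing in for latent values (hypothetical data).
random.seed(0)
sigma = 7.5  # a large noise level, as in early steps
clean = [random.uniform(-1.0, 1.0) for _ in range(4)]
noise = [random.gauss(0.0, 1.0) for _ in range(4)]
noisy = [c + sigma * n for c, n in zip(clean, noise)]

# With a perfect noise prediction the clean values come back up to
# floating-point error; a real model's prediction is only approximate,
# which is why early previews would still look blurry.
recovered = [predict_denoised(x, n, sigma) for x, n in zip(noisy, noise)]
```

In a real run the estimate would be decoded with the VAE in place of the raw latents, which should give a blurry but readable preview rather than a noisy one.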
