
Generated image in blur while training Autoencodertiny Decoder #19

Open

phyllispeng123 opened this issue Jun 17, 2024 · 1 comment

@phyllispeng123

Hi, this is Phyllis. I really appreciate your marvelous work on SD-VAE compression and acceleration! I am currently training an AutoencoderTiny decoder from scratch using the LDM training setup, but I find that the generated images are not very clear. The training is identical to LDM training apart from your extra loss distance(disc(real).mean(), disc(fake).mean()) added to my decoder/generator objective (the extra loss did indeed help with stability and FID, many thanks!). I train the decoder on SD1.5 encoder outputs for 190k steps with batch_size=4 and lr=1e-4, but the generated images are still blurry. Would you mind giving some hints (losses used, number of training steps, any finetuning stage?) on how to match your TAESD results?

This is the TAESD result:
[image: 20240617-152624]

This is my result:
[image: tiny_tiny_generated_images_199000_59d1bee81c7d1232debf]

@phyllispeng123 phyllispeng123 changed the title Generated image in blur while training Autoencodertiny Generated image in blur while training Autoencodertiny Decoder Jun 17, 2024
@madebyollin (Owner)

Since the output is a mix of blurry and sharp portions, it looks like the scale of your adversarial loss / gradients is probably too low compared to the reconstruction (MSE/MAE) loss / gradients.

You could try just scaling up the discriminator loss (10 * distance(disc(real).mean(), disc(fake).mean()) or something), or using the automatic scaling from https://github.com/CompVis/latent-diffusion/blob/main/ldm/modules/losses/contperceptual.py#L32.
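That automatic scaling balances the two losses by the ratio of their gradient norms at the decoder's last layer. Here is a minimal PyTorch sketch of the idea; the helper name and the toy decoder are illustrative only (this is not the TAESD architecture or the exact LDM code), and the adversarial term is a stand-in for whatever discriminator loss you are using:

```python
import torch
import torch.nn as nn

def adaptive_adv_weight(rec_loss, adv_loss, last_layer, eps=1e-4, max_w=1e4):
    """Weight the adversarial loss by the ratio of gradient norms
    (reconstruction vs. adversarial) at the decoder's last layer,
    in the spirit of LDM's contperceptual.py adaptive weight."""
    rec_grad = torch.autograd.grad(rec_loss, last_layer, retain_graph=True)[0]
    adv_grad = torch.autograd.grad(adv_loss, last_layer, retain_graph=True)[0]
    w = rec_grad.norm() / (adv_grad.norm() + eps)
    return torch.clamp(w, 0.0, max_w).detach()

# Toy stand-ins (illustrative shapes, not a real VAE decoder):
decoder = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
fake = decoder(torch.randn(2, 4))      # "generated" pixels
real = torch.randn(2, 3)               # "ground truth" pixels

rec_loss = (fake - real).abs().mean()  # MAE reconstruction term
adv_loss = -fake.mean()                # stand-in for the discriminator term

w = adaptive_adv_weight(rec_loss, adv_loss, decoder[-1].weight)
total_loss = rec_loss + w * adv_loss   # backprop this combined loss
```

Because `w` grows when the reconstruction gradients dominate, the adversarial term keeps enough influence to sharpen the output without hand-tuning a fixed multiplier.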

I also posted a simplified example training notebook using only adversarial loss here https://github.com/madebyollin/seraena/blob/main/TAESDXL_Training_Example.ipynb (which also does some automatic gradient scaling); you could try taking the adversarial loss from that and combining it with your reconstruction losses.
