Hi, this is Phyllis. I really appreciate your marvelous work on SD-VAE compression and acceleration! I am currently training an AutoencoderTiny decoder from scratch using the LDM training setup, but I find that the generated images are not very clear. The training is identical to LDM training apart from your extra loss `distance(disc(real).mean(), disc(fake).mean())` in my decoder/generator (the extra loss indeed helped with stability and FID, many thanks!). I trained the decoder on SD1.5 encoder outputs for 190k steps with batch_size=4 and lr=1e-4, but the generated images are still blurry. Would you mind giving some hints (losses used, training steps, any finetuning stage?) on how to match your TAESD results?
This is the TAESD result:
This is my result:
phyllispeng123 changed the title from "Generated image in blur while training Autoencodertiny" to "Generated image in blur while training Autoencodertiny Decoder" on Jun 17, 2024
Since the output is a mix of blurry and sharp portions, it looks like the scale of your adversarial loss / gradients is probably too low compared to the reconstruction (MSE/MAE) loss / gradients.
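For illustration, here's a minimal sketch of what weighting the two losses against each other might look like. Everything here (`decoder`, `disc`, `latents`, `real_images`, `adv_weight`) is a hypothetical placeholder for your own training objects, not TAESD's actual code:

```python
import torch
import torch.nn.functional as F

# Hypothetical placeholders: decoder, disc, latents, real_images come from
# your own training loop; adv_weight is a knob you tune by eye / FID.
fake_images = decoder(latents)

rec_loss = F.mse_loss(fake_images, real_images)
# Generator-side adversarial term: raise the discriminator's score on fakes.
adv_loss = -disc(fake_images).mean()

# If adv_weight is too small, the MSE term dominates and outputs stay blurry;
# try increasing it until textures sharpen without introducing artifacts.
adv_weight = 0.5  # illustrative starting point, not a tuned value
loss = rec_loss + adv_weight * adv_loss
loss.backward()
```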
I also posted a simplified example training notebook that uses only adversarial loss (and does some automatic gradient scaling) here: https://github.com/madebyollin/seraena/blob/main/TAESDXL_Training_Example.ipynb. You could try taking the adversarial loss from that notebook and combining it with your reconstruction losses.
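The notebook's exact scaling logic is best read from the source, but one common recipe for "automatic gradient scaling" (used e.g. in VQGAN's adaptive discriminator weight) is to rescale the adversarial loss so that its gradient norm matches the reconstruction loss's gradient norm at some reference parameters. A minimal sketch of that idea, where `params` and `decoder.final_layer` are hypothetical stand-ins for your own decoder's parameters:

```python
import torch

def balance_adv_loss(rec_loss, adv_loss, params, eps=1e-8):
    """Rescale adv_loss so its gradient norm (w.r.t. `params`) roughly matches
    the reconstruction gradient norm. A sketch of one common balancing recipe,
    not the notebook's exact code."""
    rec_grads = torch.autograd.grad(rec_loss, params, retain_graph=True)
    adv_grads = torch.autograd.grad(adv_loss, params, retain_graph=True)
    rec_norm = torch.cat([g.flatten() for g in rec_grads]).norm()
    adv_norm = torch.cat([g.flatten() for g in adv_grads]).norm()
    scale = (rec_norm / (adv_norm + eps)).detach()
    return scale * adv_loss

# Usage inside the training step (placeholders as before):
# loss = rec_loss + balance_adv_loss(rec_loss, adv_loss,
#                                    [decoder.final_layer.weight])
# loss.backward()
```

Using only the last layer's weight as the reference keeps the extra backward passes cheap while still giving a usable estimate of the two losses' relative gradient scales.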