Hello everyone,
I want to train an LDM on remote sensing data. Because this data differs from the sources the pretrained VAE was trained on, I first fine-tuned the VAE on the remote sensing data. Following the standard LDM training procedure, I encode the training data with the VAE and then train the diffusion model on the resulting latents. However, after 10k training steps the loss still shows no decreasing trend, and I am quite confused by this.
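Roughly, the encoding step looks like this (a simplified sketch assuming a diffusers-style `AutoencoderKL`; the checkpoint path and the training-loop call are placeholders, not my exact code):

```python
import torch
from diffusers import AutoencoderKL

# VAE fine-tuned on the remote sensing data (path is a placeholder).
vae = AutoencoderKL.from_pretrained("path/to/finetuned-vae").eval().cuda()

@torch.no_grad()
def encode_batch(images, scale=1.2):
    """Map a batch of images in [-1, 1] to normalized latents."""
    posterior = vae.encode(images.cuda()).latent_dist
    z = posterior.sample()          # e.g. (B, 4, H/8, W/8)
    return z / scale                # normalize with the measured latent std

# The MDT/DiT model is then trained on these latents instead of pixels,
# e.g. with the usual guided-diffusion style loop:
#   latents = encode_batch(images)
#   t = torch.randint(0, diffusion.num_timesteps, (latents.shape[0],), device=latents.device)
#   loss = diffusion.training_losses(model, latents, t)["loss"].mean()
```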
Specifically, I am using MDT (a variant of DiT) and training its smallest configuration (12 DiT blocks). One thing I noticed: when I train in pixel space (a separate, independent experiment), the loss drops quickly and the model converges fast; when I train in latent space, the loss hardly changes at all. I suspected a problem with my own VAE fine-tuning, so I inspected the VAE-encoded vectors and projected them to 2D with t-SNE (20k and 100k vectors, plots at the bottom of this post), but nothing looked obviously wrong.
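For reference, the t-SNE plots below were produced roughly like this (a sketch using scikit-learn; the random `latents` array is only a stand-in for my real encoded vectors):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Placeholder: replace with the real VAE-encoded latents, shape (N, C, H, W).
latents = np.random.randn(2000, 4, 32, 32).astype(np.float32)

# Flatten each latent map into a single vector before projecting to 2D.
X = latents.reshape(latents.shape[0], -1)
emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(X)

plt.scatter(emb[:, 0], emb[:, 1], s=1)
plt.title("t-SNE of VAE latents")
plt.show()
```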
When training the LDM I normalize the latents with scale = 1.2, where 1.2 was obtained by computing the standard deviation of the encoded vectors. Does anyone know what might be going wrong? Thanks!
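The scale was estimated by encoding a sample of the training set and measuring the overall standard deviation of the latent elements, roughly as below (a sketch; `vae` is a diffusers-style `AutoencoderKL` as in the snippet above, `loader` is an ordinary DataLoader, and I am assuming "scale" means dividing the latents by this std, analogous to Stable Diffusion's 0.18215 factor):

```python
import torch

@torch.no_grad()
def estimate_latent_std(vae, loader, num_batches=100):
    """Encode a sample of batches and return the std of all latent elements."""
    samples = []
    for i, images in enumerate(loader):
        if i >= num_batches:
            break
        z = vae.encode(images.cuda()).latent_dist.sample()
        samples.append(z.flatten().cpu())
    return torch.cat(samples).std().item()

# For my data this came out around 1.2; the latents are then divided by it
# so the diffusion model sees roughly unit-variance inputs.
```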
![t-SNE-20k](https://private-user-images.githubusercontent.com/86882618/317728546-e06b001b-66cc-4dda-9a1a-74898d03e02d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1ODU3MjksIm5iZiI6MTczOTU4NTQyOSwicGF0aCI6Ii84Njg4MjYxOC8zMTc3Mjg1NDYtZTA2YjAwMWItNjZjYy00ZGRhLTlhMWEtNzQ4OThkMDNlMDJkLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE1VDAyMTAyOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTczYzliNjM1OTIwODkwY2RkOGQ4ZDQ5NjYzNDZiZWEwODYwZWViZDhmYzYzZTQ1NTU4M2YwMThiYzY2MzI0ZDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.4DcrbKwp0pSGl-pJWUAALVkvgMsV3WZ6t_zxeKOmvug)
![t-SNE-100k](https://private-user-images.githubusercontent.com/86882618/317728562-42e148ac-b3d3-41e4-8654-a2b55a9715c1.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1ODU3MjksIm5iZiI6MTczOTU4NTQyOSwicGF0aCI6Ii84Njg4MjYxOC8zMTc3Mjg1NjItNDJlMTQ4YWMtYjNkMy00MWU0LTg2NTQtYTJiNTVhOTcxNWMxLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE1VDAyMTAyOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTE5NzgxOWQ4MjRmY2UyMzcwNjg0NzEwOGY2YWU2ZTU2MTViN2YzZjBjOGFmNGNhNTU4MmMxNzI4YTI1YTA2OWMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.NCEHy6hCCl_t8j20dJpIHWq2eQNkUUFSTbWv3nUJPJw)