This is an interesting discussion and will help us as well. Hoping the team can respond to this.
Thanks for opening this discussion. Does increasing the learning rate help? Additionally, this script might be helpful for large-scale training. It is what we used for https://huggingface.co/collections/diffusers/sdxl-controlnets-64f9c35846f3f06f5abe351f
Thanks for the quick reply, Sayak.
Hello, I am trying to train a ControlNet model conditioned on semantic segmentation maps using the provided script. I am looking for a rough gauge of how many steps it takes to reach sudden convergence with an SDXL ControlNet, especially from the diffusers team, who have already trained SDXL ControlNets.
For the dataset, I am using ADE20K (20k image pairs). For captions, I am using BLIP.
I have completed around 15k steps with a learning rate of 1e-5, a constant learning rate scheduler, gradient accumulation of 1, and a batch size of 6, but I still have not noticed any sudden convergence.
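To put those numbers in perspective, here is a rough sketch of how much data the run has seen so far. It assumes a single GPU (the device count is not stated above), and the helper name `training_exposure` is purely illustrative, not part of the training script.

```python
def training_exposure(steps, per_device_batch, grad_accum, dataset_size, num_devices=1):
    """Estimate how much data a run has consumed.

    num_devices=1 is an assumption; the post does not say how many GPUs are used.
    """
    # Samples that contribute to each optimizer update.
    effective_batch = per_device_batch * grad_accum * num_devices
    # Total samples processed over the whole run.
    samples_seen = steps * effective_batch
    # Number of passes over the dataset.
    epochs = samples_seen / dataset_size
    return effective_batch, samples_seen, epochs


# Values from the post: 15k steps, batch size 6, gradient accumulation 1, 20k image pairs.
effective_batch, samples_seen, epochs = training_exposure(
    steps=15_000, per_device_batch=6, grad_accum=1, dataset_size=20_000
)
print(f"effective batch size: {effective_batch}")  # 6
print(f"samples seen: {samples_seen}")             # 90000
print(f"epochs: {epochs:.1f}")                     # 4.5
```

Under that assumption, the run has covered about 90k samples, or roughly 4.5 passes over the dataset, with an effective batch size of 6.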
If anyone from the diffusers team reads this, did you notice any patterns in the step range in which convergence happens when training the diffusers SDXL ControlNets? Any advice on the hyperparameters, mixed-precision config, etc.?
Any advice would be appreciated!