Does diffusers have any automated testing to ensure the same inputs give the same outputs between versions? #11057
-
Hi, yes, we have tests that ensure that. You didn't say which model you were training, but you can read the tests yourself; for SDXL, for example, there is this test among others. All the pipelines have similar tests. Also, there was a change in the expected slices because transformers added FA2 (I think), which affected the text encoders. That sounds similar to what you're describing, but if I understand correctly, you're saying that recent diffusers versions give worse results?
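As a rough illustration of how those pipeline "slice" tests work: the pipeline is run with a fixed seed, a small slice of the output array is taken, and it is compared against hardcoded values recorded from a known-good version. This is a minimal sketch of the comparison step only; `check_output_slice` is a hypothetical helper, not diffusers' actual test code.

```python
import numpy as np

def check_output_slice(image: np.ndarray, expected_slice: np.ndarray,
                       atol: float = 1e-3) -> bool:
    """Compare the last few values of a generated image array against a
    stored reference slice, within an absolute tolerance."""
    actual_slice = image.flatten()[-len(expected_slice):]
    return bool(np.allclose(actual_slice, expected_slice, atol=atol))

# Usage: `expected` would be recorded from a known-good library version,
# then re-checked after every upgrade with the same seed and inputs.
expected = np.array([0.1, 0.2, 0.3])
image = np.zeros((2, 2, 3))
image.flat[-3:] = [0.1001, 0.2, 0.2999]
print(check_output_slice(image, expected))  # True: within tolerance
```

If a dependency change alters the numerics (e.g. a different attention backend), this check fails even when the images still look plausible to the eye.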
-
Heya, thanks, that's good to confirm, and in retrospect it seems pretty likely; it should help me debug this issue without rewriting those same tests. The training was for SD1.5, using diffusers @ a9d3f6 and transformers==4.46.3. I've reverted all my dependencies to much older versions, with requirements initially copied from OneTrainer (github.com/kashif/diffusers.git@a3dc213 and transformers==4.36.2), and have started training again. The issue finally seems to be resolved, but any number of dependencies could have been the cause, assuming it isn't just blind luck. It might also have been an incorrect use of the VAE on my part, since I added this step for SD3 compatibility, and it perhaps should never have been called for SD1.5, which has no shift factor: `if vae.config.shift_factor is not None:` Some other changes were: (old) (new) And many other dependency changes which might have caused it.
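For reference, the latent-scaling convention really does differ between the two model families, so a guard like the one quoted matters. Below is a hedged sketch of the distinction; the `SimpleNamespace` configs are stand-ins for the VAE config, and `encode_latents` is a hypothetical helper (the field names `scaling_factor` and `shift_factor` mirror diffusers' `AutoencoderKL` config).

```python
from types import SimpleNamespace

def encode_latents(raw_latents: float, vae_config) -> float:
    """Apply the latent scaling convention appropriate to the model family."""
    shift = getattr(vae_config, "shift_factor", None)
    if shift is not None:
        # SD3-style VAE: subtract the shift before scaling.
        return (raw_latents - shift) * vae_config.scaling_factor
    # SD1.5-style VAE: no shift factor exists; only scale.
    return raw_latents * vae_config.scaling_factor

# Stand-in configs; the numeric values are the commonly published ones
# for each family, but treat them as illustrative.
sd15_cfg = SimpleNamespace(scaling_factor=0.18215, shift_factor=None)
sd3_cfg = SimpleNamespace(scaling_factor=1.5305, shift_factor=0.0609)

print(encode_latents(1.0, sd15_cfg))  # 0.18215
print(encode_latents(1.0, sd3_cfg))
```

The key point is that applying the shift branch to an SD1.5 VAE (where `shift_factor` is `None` or absent) would silently distort every training latent, which could plausibly cause the kind of degradation described.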
-
I am trying to debug a problem which has appeared in my Stable Diffusion training script: people now frequently generate images with extra limbs or corrupted faces in trained models, and these issues seem perhaps even present in the pre-training previews generated to confirm the model is working, though it is very hard to say. It doesn't seem explainable by any changes in the dataset, and even after filtering to train on only my highest-quality data and excluding any complex poses, the problem has persisted for over a month.
While looking for possible explanations, I noticed that I updated my requirements.txt versions for most packages on the date this problem seems to have started, so I have reverted to a much older set and am trying training again; so far the previews look much higher quality. If this is the cause, there are a dozen possible explanations among torch version changes, optimizer upgrades, etc., though part of me wonders if it might somehow be related to Diffusers or Transformers. (I mostly suspect the issue lies somewhere in the text encoding, as when I tried using an older finetuned text encoder the problem seemed greatly reduced and more in line with previous results.)
If my results do improve, I'm going to have to try upgrading requirements bit by bit and re-training, which is costly in terms of time. So I'm wondering if the core packages such as diffusers and transformers already have tests confirming that their outputs stay consistent between versions, in which case it's probably not worth stressing about those too much.
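One cheap way to narrow down the upgrade-and-retrain loop, sketched below under the assumption that you save reference outputs (e.g. text-encoder embeddings for a few fixed prompts) from the known-good environment: after each dependency bump, regenerate the same embeddings and compare before committing to a full training run. `compare_embeddings` is a hypothetical helper for illustration.

```python
import numpy as np

def compare_embeddings(reference: np.ndarray, candidate: np.ndarray,
                       atol: float = 1e-4):
    """Return (max absolute difference, whether it is within tolerance)."""
    diff = float(np.max(np.abs(reference - candidate)))
    return diff, diff <= atol

# Usage sketch: in the old environment, np.save("ref_embeds.npy", embeds);
# after each upgrade, reload and compare before starting a long run.
ref = np.array([0.5, -1.2, 0.33])
cand = np.array([0.5, -1.2, 0.33005])
diff, ok = compare_embeddings(ref, cand)
print(ok)  # True: drift is below tolerance
```

A failing comparison at a particular upgrade step would point directly at the responsible package without a full retrain, and a fixed-seed image slice check (as the pipelines' own tests do) extends the same idea to the full pipeline.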