Add Option to use Target Model in LCM-LoRA Scripts #6537
base: main
Conversation
@@ -1388,6 +1433,8 @@ def compute_embeddings(prompt_batch, proportion_empty_prompts, text_encoder, tok

    # Checks if the accelerator has performed an optimization step behind the scenes
    if accelerator.sync_gradients:
        # 12. If using a target model, update its parameters via EMA.
        update_ema(target_unet.parameters(), unet.parameters(), args.ema_decay)
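For context on the discussion below, a minimal pure-Python sketch of the EMA rule that an update_ema helper like the one in the diff applies. This is an assumption-laden illustration: the real script operates on torch parameter tensors, so plain lists of floats stand in for LoRA weights here.

```python
# Pure-Python sketch of the EMA update from the diff above.
# Assumption: the real update_ema iterates torch parameter tensors;
# plain lists of floats stand in for them here.
def update_ema(target_params, source_params, decay=0.95):
    # Element-wise: target = decay * target + (1 - decay) * source
    for tgt, src in zip(target_params, source_params):
        for i in range(len(tgt)):
            tgt[i] = decay * tgt[i] + (1 - decay) * src[i]

# Toy weights standing in for target_unet / unet LoRA parameters:
target = [[0.0, 0.0]]
source = [[1.0, 1.0]]
update_ema(target, source, decay=0.75)
print(target)  # [[0.25, 0.25]]
```

The key point for the LoRA question in this thread: the update only zips two parameter iterables element-wise, so it works as long as both iterables yield matching (adapter) parameters in the same order.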
Are you sure this works? I had errors with the LoRA parameters.
I have a workaround in the issue.
Both are LoRA weights, so they should work?
I haven't been able to test this fully yet; it's possible that this runs into the errors mentioned in #6505 (comment).
I believe the current implementation is correctly updating the LoRA parameters.
@@ -657,6 +671,24 @@ def parse_args():
        help="Proportion of image prompts to be replaced with empty strings. Defaults to 0 (no prompt replacement).",
    )
    # ----Latent Consistency Distillation (LCD) Specific Arguments----
    parser.add_argument(
        "--use_target_model",
        action="store_true",
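A quick self-contained check of the opt-in behavior: argparse's `action="store_true"` gives a flag a default of False, so invocations that omit --use_target_model behave exactly as before.

```python
import argparse

# Sketch of the new opt-in flag; action="store_true" defaults to False.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--use_target_model",
    action="store_true",
    help="Whether to maintain an EMA target model during distillation.",
)

args = parser.parse_args([])  # existing invocations: flag absent
assert args.use_target_model is False

args = parser.parse_args(["--use_target_model"])  # new opt-in path
assert args.use_target_model is True
```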
Should this default to false so existing users are not surprised?
The target model will be used only if the --use_target_model flag is specified (so existing script calls should work as before).
Thanks. My bad.
The code looks good to me. Could you explain the reasoning behind this? Do you have any experiments that demonstrate the use of this?
#6505 (comment) reports seeing a lot of training instability when training using the current LCM-LoRA script. @jon-chuang, would you be willing to share more details about the training instability?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
I don’t have conclusive evidence that this fix will mitigate the instability, but I observed some divergent results on LCM-LoRA training runs. Anyway, I think it’s a zero-cost opt-in feature that may produce better results for some users. I will definitely try the EMA once it is merged and can report further results.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@patil-suraj a gentle ping.
What does this PR do?
This PR enables a target model to be optionally used in the LCM-LoRA distillation scripts via the
--use_target_model
argument. Follow-up to #6505.
Before submitting
See the documentation guidelines; here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten
@patil-suraj
@sayakpaul
@jon-chuang
@shuminghu