train_dreambooth_lora_sdxl.py produces zoomed/cropped images #7327
-
These are just some ideas that come to mind. I haven't trained with black backgrounds, but speaking in tensor math the black pixels are 0s, so the model is probably learning that the black background is something it doesn't need. I could be wrong about this, but maybe you can add a very small amount of brightness so that it thinks there is data there. As for the zoom and cropping, you can also teach the model those concepts: add some zoomed and cropped images and caption them with that information, e.g. "zoomed 1.5x", "zoomed 2x", "zoomed 3x", "without zoom", and the same for cropping. That way, when you generate, you can put those terms in the positive or negative prompt to get more control over the output.
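For the brightness idea, something along these lines could lift the pure-black background slightly before training (just a sketch; the file names and the offset value are placeholders to tune):

```python
# Sketch: add a tiny constant offset so the background is not exactly 0.
import numpy as np
from PIL import Image

img = np.asarray(Image.open("mammogram.png").convert("L"), dtype=np.float32)
img = np.clip(img + 2.0, 0, 255)  # small brightness lift; adjust to taste
Image.fromarray(img.astype(np.uint8)).save("mammogram_lifted.png")
```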
-
Hi,
I am using train_dreambooth_lora_sdxl.py (version from 28.02, commit 7db935a) to generate mammography images. This is the command I run:
```bash
accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path='stabilityai/stable-diffusion-xl-base-1.0' \
  --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
  --cache_dir='.../Project/cache_dir' \
  --dataset_name='.../Project/DATASET' \
  --image_column="image" \
  --caption_column="text" \
  --repeats=1 \
  --instance_prompt="In the style of MaGHY" \
  --validation_prompt="In the style of MaGHY, a MLO mammogram." \
  --num_validation_images=4 \
  --validation_epochs=1 \
  --output_dir='.../Project/OUTPUT/03_RUN' \
  --seed=42 \
  --resolution=1024 \
  --train_text_encoder \
  --train_batch_size=1 \
  --sample_batch_size=1 \
  --max_train_steps=2000 \
  --checkpointing_steps=100 \
  --checkpoints_total_limit=100 \
  --gradient_accumulation_steps=5 \
  --gradient_checkpointing \
  --learning_rate=2e-04 \
  --text_encoder_lr=5e-6 \
  --lr_scheduler="constant" \
  --snr_gamma=5.0 \
  --lr_warmup_steps=500 \
  --lr_num_cycles=1 \
  --lr_power=1.0 \
  --dataloader_num_workers=0 \
  --optimizer="AdamW" \
  --adam_beta1=0.9 \
  --adam_beta2=0.999 \
  --adam_weight_decay=1e-04 \
  --adam_weight_decay_text_encoder=1e-03 \
  --adam_epsilon=1e-08 \
  --max_grad_norm=1.0 \
  --report_to=wandb \
  --mixed_precision="fp16" \
  --prior_generation_precision="fp16" \
  --local_rank=-1 \
  --use_8bit_adam \
  --rank=4
```
All images in the training set:
The issue is that the generated images are cropped/zoomed:
![image](https://private-user-images.githubusercontent.com/54419373/312883136-d1276fc5-1f64-471c-91ed-e37ea6d044cc.png)
I tried the following things:
- The scheduler's `clip_sample` option (bool, defaults to True): "Clip the predicted sample for numerical stability."
- Passing the crop conditioning explicitly at inference: `output = pipeline(prompt=prompt, crops_coords_top_left=(0,0)).images[0]`
- Describing the layout in the prompt: "In the style of MaGHY, a MLO mammogram with black background on the right side."

All of these consistently resulted in zoomed/cropped images.
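For reference, a fuller version of that generation call with the SDXL size-conditioning arguments spelled out would look roughly like this (a sketch; the extra `original_size`/`target_size` arguments are just the pipeline defaults made explicit, and the LoRA path is abbreviated as above):

```python
# Sketch of the inference call (paths abbreviated as in the training command).
import torch
from diffusers import StableDiffusionXLPipeline

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(".../Project/OUTPUT/03_RUN")

prompt = "In the style of MaGHY, a MLO mammogram."
output = pipeline(
    prompt=prompt,
    crops_coords_top_left=(0, 0),  # attempt to disable the crop conditioning
    original_size=(1024, 1024),    # SDXL micro-conditioning, default values
    target_size=(1024, 1024),
).images[0]
output.save("sample.png")
```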
I would be grateful for any ideas about what else I could try.