diffusers/examples/text_to_image/train_text_to_image.py training SD1.4 question #10138
Replies: 1 comment
-
How did your final training results turn out?
-
I am using the following script to train the SD1.4 model:
export MODEL_NAME="/home/bingxing2/home/scx7kzs/CODE/MODEL/stable-diffusion-v1-4"
# export TRAIN_DIR="/home/bingxing2/home/scx7kzs/CODE/DATA/allava_vflan/images"   # overridden by the line below
export TRAIN_DIR="/home/bingxing2/home/scx7kzs/CODE/diffusers/ft_data/cc_sbu/train"
export OUTPUT_DIR="/home/bingxing2/home/scx7kzs/CODE/diffusers/ckpts"
accelerate launch /home/bingxing2/home/scx7kzs/CODE/diffusers/examples/text_to_image/train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir=${OUTPUT_DIR}
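Since the example script loads --train_data_dir with the Hugging Face imagefolder builder, one thing worth ruling out is the data pipeline itself: on the first pass the dataset is decoded and cached, which can stall a first step for a long time. A minimal sanity check, assuming the cc_sbu/train directory follows the imagefolder layout with a metadata.jsonl that carries the captions:
python -c "
from datasets import load_dataset
# Hypothetical check, not part of the training script: load the same
# directory the way the example does and inspect it directly.
ds = load_dataset('imagefolder', data_dir='$TRAIN_DIR', split='train')
print(ds)       # should report 3439 rows, matching 'Num examples' in the log below
print(ds[0])    # one example: the image plus its caption field
"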
However, I've been stuck here for more than 20 minutes:
12/06/2024 11:25:07 - INFO - __main__ - ***** Running training *****
12/06/2024 11:25:07 - INFO - __main__ - Num examples = 3439
12/06/2024 11:25:07 - INFO - __main__ - Num Epochs = 18
12/06/2024 11:25:07 - INFO - __main__ - Instantaneous batch size per device = 1
12/06/2024 11:25:07 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 4
12/06/2024 11:25:07 - INFO - __main__ - Gradient Accumulation steps = 4
12/06/2024 11:25:07 - INFO - __main__ - Total optimization steps = 15000
Steps: 0%| | 0/15000 [00:00<?, ?it/s]
How can I solve it?
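If it stays at step 0, one way to see what the process is actually doing (first-epoch dataset decoding, a distributed barrier, slow disk I/O on the cluster filesystem) is to dump its Python stack. A minimal sketch, assuming py-spy is installed and <PID> is the id of the hung training process (the placeholder is hypothetical):
pip install py-spy          # one-time install
py-spy dump --pid <PID>     # prints the current Python stack of the hung process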