adding trust_remote_code argument for loading dataset in controlnet training example #10727

YanivDorGalron · 2025-02-05T14:59:46Z

Changes made to both the README and the train_controlnet.py example.

needed for the controlnet example with fusing/fill50k

trust_remote_code=True for fusing/fill50k

asomoza · 2025-02-10T18:54:28Z

Hi, thanks for your contribution. Can you help me understand why this is needed? Do you have a use case for enabling remote code inside a dataset?

Also, it should be False by default, not True, so that only people who need it and know what they're doing can enable it.

YanivDorGalron · 2025-02-10T19:05:46Z

Hi,

Thank you for reviewing my pull request.

I followed the instructions in the README for training ControlNet with the fill50k circles dataset on my multi-GPU setup. During this process, I encountered a prompt asking whether to trust remote code (trust_remote_code); it actually appeared as multiple identical prompts, one for each GPU. Because of the multiprocessing, I couldn't answer 'yes' to the prompt, which was triggered by remote code on the dataset side. Since this example is specific to the fill50k dataset, I assumed setting trust_remote_code=True by default was appropriate.

asomoza · 2025-02-10T20:07:29Z

Oh I see. I haven't had the opportunity to train a ControlNet yet, so I didn't know that the demo and that dataset need the remote code. It's weird that nobody opened an issue about this before. Cc'ing @sayakpaul since I don't have enough experience with this yet.

sayakpaul · 2025-02-11T02:37:54Z

@YanivDorGalron thanks for your PR. Could you show us for which ControlNet checkpoint this is useful?

YanivDorGalron · 2025-02-11T12:54:09Z

Hi @sayakpaul, I ran accelerate config default and, right after that, this command:

export MODEL_DIR="stable-diffusion-v1-5/stable-diffusion-v1-5"
export OUTPUT_DIR="."

accelerate launch train_controlnet.py \
    --pretrained_model_name_or_path=$MODEL_DIR \
    --output_dir=$OUTPUT_DIR \
    --dataset_name=fusing/fill50k \
    --resolution=512 \
    --learning_rate=1e-5 \
    --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
    --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
    --train_batch_size=4

sayakpaul · 2025-02-11T13:48:39Z

That checkpoint should be well supported. I don't understand why we need trust_remote_code.

YanivDorGalron · 2025-02-11T13:58:59Z

trust_remote_code is needed for the fill50k dataset, the one that the example uses.

sayakpaul · 2025-02-11T14:02:12Z

examples/controlnet/train_controlnet.py

        )
    else:
        if args.train_data_dir is not None:
            dataset = load_dataset(
                args.train_data_dir,
                cache_dir=args.cache_dir,
+                trust_remote_code=args.trust_remote_code


I guess we could condition this on whether we're using the fill50k dataset; otherwise it looks a bit harmful to security.
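One way to read this suggestion is a small helper that forwards the flag only for the dataset the example was written for. The sketch below is an assumption and not code from this PR: the helper name `resolve_trust_remote_code` and the allow-list `TRUSTED_DATASETS` are hypothetical, and requiring both an explicit opt-in and an allow-listed dataset is one possible policy among several.

```python
# Hypothetical allow-list; this thread only discusses fusing/fill50k.
TRUSTED_DATASETS = {"fusing/fill50k"}


def resolve_trust_remote_code(dataset_name, requested):
    """Return True only when the user opted in AND the dataset is allow-listed."""
    return bool(requested) and dataset_name in TRUSTED_DATASETS


# Opt-in plus allow-listed dataset: remote code is permitted.
print(resolve_trust_remote_code("fusing/fill50k", True))
# Opt-in for any other dataset: still refused.
print(resolve_trust_remote_code("some-user/other-dataset", True))
```

The returned value could then be passed as the `trust_remote_code=` keyword shown in the diff above, so that the flag never enables remote code for an arbitrary dataset.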

We could set the default to False but write the correct command in the README (which is specific to fill50k). Does that sound appropriate?

Sounds perfect to me! Thanks so much!

I have made the suggested modification.
Now, in every command where fill50k is used, a new flag argument has been added:

--dataset_name=fusing/fill50k \
--trust_remote_code \

In addition, I made the needed modifications to the relevant .py files to include the argument.
It's a store_true argument, so it is False by default.
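As a minimal sketch of the store_true behaviour described here, the parser below defines only the two dataset-related flags. The function name `build_parser` and the help text are illustrative assumptions, not copied from train_controlnet.py.

```python
import argparse


def build_parser():
    # Only the flags relevant to this discussion; the real script defines many more.
    parser = argparse.ArgumentParser(description="Sketch of the dataset-related flags.")
    parser.add_argument("--dataset_name", type=str, default=None)
    # store_true: the flag takes no value; absent means False, present means True.
    parser.add_argument(
        "--trust_remote_code",
        action="store_true",
        help="Allow the dataset repository to run its own loading script (assumed wording).",
    )
    return parser


# Without the flag, the default is False.
args_default = build_parser().parse_args(["--dataset_name=fusing/fill50k"])
# With the flag, the user has explicitly opted in.
args_opt_in = build_parser().parse_args(
    ["--dataset_name=fusing/fill50k", "--trust_remote_code"]
)
print(args_default.trust_remote_code, args_opt_in.trust_remote_code)  # False True
```

Because the default stays False, anyone running the script against another dataset gets no remote code execution unless they add the flag themselves.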

sayakpaul

Thank you so much!

sayakpaul · 2025-02-12T03:14:43Z

@YanivDorGalron can you run make style && make quality?

HuggingFaceDocBuilderDev · 2025-02-12T03:17:06Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

YanivDorGalron · 2025-02-12T08:42:13Z

@YanivDorGalron can you run make style && make quality?

These commands modify files unrelated to this PR. Should I proceed with pushing those changes as well?

        modified:   examples/controlnet/train_controlnet_flux.py
        modified:   examples/research_projects/geodiff/geodiff_molecule_conformation.ipynb
        modified:   examples/research_projects/gligen/demo.ipynb

sayakpaul · 2025-02-12T09:16:38Z

You can run only git add examples/controlnet/train_controlnet_flux.py and commit the result.

YanivDorGalron · 2025-02-12T10:15:47Z

done!

YanivDorGalron · 2025-02-17T11:13:06Z

@sayakpaul I noticed that the PR failed some tests. I'm not sure if the failures are related to my changes— is there anything I can do on my end?

github-actions · 2025-03-13T15:03:39Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

YanivDorGalron added 5 commits February 5, 2025 16:55

adding trust_remote_code variable for loading dataset

ced8f7b

needed for the controlnet example with fusing/fill50k

Update README.md

fbc3e70

trust_remote_code=True for fusing/fill50k

Merge branch 'main' into patch-4

12312f9

Merge branch 'main' into patch-4

a8c0d6d

Merge branch 'main' into patch-4

6b6c9cd

sayakpaul reviewed Feb 11, 2025

YanivDorGalron and others added 6 commits February 11, 2025 18:57

adding trust_remote_code as a store_true variable

839e9c8

changing all examples that use fill50k. now contain --trust_remote_co…

d5bea15

…de in the command. store_true argument therefore default is false.

Merge branch 'huggingface:main' into patch-4

aaaa341

adding the argument to load_dataset function call

d19e1d3

Merge remote-tracking branch 'origin/patch-4' into patch-4

b5dd061

Merge branch 'main' into patch-4

44f4b43

sayakpaul approved these changes Feb 12, 2025

YanivDorGalron and others added 2 commits February 12, 2025 11:18

running make style and quality

ed202ad

Merge branch 'main' into patch-4

3352782

sayakpaul and others added 2 commits February 12, 2025 16:07

updates

e8c3f07

Merge branch 'main' into patch-4

5e87f2d

YanivDorGalron requested a review from sayakpaul February 16, 2025 11:20

sayakpaul and others added 2 commits February 16, 2025 16:56

Merge branch 'main' into patch-4

4ead085

removing space after backslash. can lead to problems

6450b8e

Merge branch 'main' into patch-4

1050b2b

github-actions bot added the stale Issues that haven't received updates label Mar 13, 2025
