Replies: 2 comments 3 replies
-
I face the same problem. I am concerned about the reproducibility of these large models, and would therefore like to use the official implementations for research.
-
Do note that most of the models available in timm were not trained using this repository (though many rely on this repository to some extent; see, e.g., DeiT). As such, training configurations that reproduce a given model's results with this repository do not, in general, exist. However, if you browse the source code for the models in this repository, you will find references to the papers and the original code implementations. For example, if you want the paper and original code for ViT, check models/vision_transformer.py. This should give you all the information you need to reproduce the results of the respective papers. In addition, you can find a number of configurations for this repository that reproduce results across many models in rwightman's gists.
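As a minimal sketch of where to look programmatically (assuming a recent timm release; older versions expose `model.default_cfg` instead of `model.pretrained_cfg`), you can list the registered variants, inspect a model's pretrained configuration, and read the module docstring that cites the paper and original implementation:

```python
# Sketch: inspecting timm models to find configs and paper/code references.
# Assumes a recent timm release; older versions use model.default_cfg.
import timm
import timm.models.vision_transformer as vit

# List a few of the ViT variants registered in timm.
print(timm.list_models("vit_*")[:5])

# Create a model and inspect its pretrained config
# (input size, crop pct, interpolation, mean/std, weight URL).
model = timm.create_model("vit_base_patch16_224", pretrained=False)
print(model.pretrained_cfg)

# The module docstring points to the paper and the original implementation.
print(vit.__doc__[:500])
```

Note that `pretrained_cfg` describes how the published weights were evaluated (preprocessing, weight source), not the full training recipe; for actual training hyperparameters you still need the gists or the original papers mentioned above.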
-
Hi, I would like to know where I can find the default training recipes for different Transformer models such as ViT, DeiT, ConViT, SwinViT, and XCiT, including the training epochs, learning rate, learning-rate scheduler, and other parameters. :)