Fine-tuning Flow #873

Answered by rwightman
dalistarh asked this question in Q&A · Sep 17, 2021 · 1 comment · 2 replies
  1. There is research suggesting a relationship between (pre)training hyper-params, augmentations, etc. and how well the weights transfer to various tasks, although the fine-tuning aug + reg are usually held constant in these analyses. One of the most extensive sets of experiments here was for the ViT architectures in the How to train your ViT? paper that I was involved with; there is a big spreadsheet covering ~50k transfer weights (https://console.cloud.google.com/storage/browser/_details/vit_models/augreg/index.csv) with the hparams listed for each pre-training weight, in1k vs in21k for pre-training, and a LR sweep for each set of transfer weights (but aug was kept low and fixed for transfer). I'd say it's a fairly…
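
That spreadsheet can be mined directly to look up a good transfer LR for a given pre-trained checkpoint before running your own sweep. Below is a minimal sketch with pandas, assuming the bucket path `gs://vit_models/augreg/index.csv` is publicly readable at the standard storage.googleapis.com URL, and that the index uses columns along the lines of `filename`, `adapt_ds`, `adapt_lr`, and `adapt_final_test`; those column names are assumptions, so check the CSV header:

```python
# Sketch: find the best transfer LR per (pre-trained checkpoint, downstream
# dataset) pair in the augreg results index. Column names are assumptions
# based on the published index -- verify against the actual CSV header.
import pandas as pd

# Assumed direct object URL for the file linked above
# (gs://vit_models/augreg/index.csv via the public GCS HTTP endpoint).
INDEX_URL = "https://storage.googleapis.com/vit_models/augreg/index.csv"

df = pd.read_csv(INDEX_URL)

# Each row is one (pre-training weight, transfer config) pair; the LR sweep
# means a given pre-trained checkpoint appears once per transfer LR tried.
# Sort by downstream test accuracy, then keep the top row per checkpoint
# and downstream dataset.
best = (
    df.sort_values("adapt_final_test", ascending=False)
      .drop_duplicates(subset=["filename", "adapt_ds"])
)

print(best[["filename", "adapt_ds", "adapt_lr", "adapt_final_test"]].head())
```

From there, a matching model can be fine-tuned in the usual timm way, e.g. `timm.create_model('vit_base_patch16_224', pretrained=True, num_classes=num_classes)`, which attaches a fresh classifier head for the new task; in recent timm versions the default pretrained ViT weights are from this augreg set.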

Category: Q&A · Labels: enhancement (New feature or request) · 2 participants
This discussion was converted from issue #872 on September 17, 2021 15:46.