
Question about ViT models and their organization #524

Answered by rwightman
sayakpaul asked this question in Q&A

@sayakpaul

Models that...

  • start with vit_ and end with _in21k were trained on ImageNet-21k and not fine-tuned. Their classification heads were zeroed by the Google researchers, so they don't work for 21k classification out of the box, but they can be fine-tuned for other tasks (they have pre-logits weights that the other models don't). They are always 224x224.
  • start with vit_ and have weights whose filenames begin with jx_ were also trained by Google; these are the ones pretrained on ImageNet-21k and then fine-tuned on ImageNet-1k. They are 224x224 or 384x384.
  • start with deit_ are the Facebook-trained models, trained on ImageNet-1k with and without distillation (indicated by the model name); see the sketch after this list for querying these naming groups.
  • there is one model (my small variant) that was …
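
A minimal sketch of querying these naming conventions through timm's model registry. It assumes a timm version from around the time of this discussion in which names such as vit_base_patch16_224_in21k are registered; the num_classes value is an illustrative placeholder, not part of the original answer.

```python
import timm

# ImageNet-21k pretrained ViTs (zeroed heads, always 224x224) end in _in21k.
print(timm.list_models('vit_*in21k', pretrained=True))

# DeiT models: FB-trained on ImageNet-1k, with and without distillation.
print(timm.list_models('deit_*', pretrained=True))

# An _in21k checkpoint is intended as a fine-tuning starting point:
# passing num_classes replaces the zeroed 21k head with a freshly
# initialized classifier sized for the downstream task.
model = timm.create_model(
    'vit_base_patch16_224_in21k',  # assumed registered in this timm version
    pretrained=True,
    num_classes=10,  # illustrative: class count of your downstream dataset
)
```

timm.list_models accepts fnmatch-style wildcards, so the naming patterns described above map directly onto registry filters.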
