Reproducing ViT results: fine-tuning on ImageNet-1K #1174
shairoz-deci asked this question in Q&A (unanswered)
I'm trying to train ViT_base_16 on ImageNet-1K from the ImageNet-21K pretraining checkpoint and can't reproduce the reported results.
The recipe from the paper should be batch=512, 20K steps, 500 warmup steps, lr=0.01, a cosine scheduler, and grad clipping at 1, which is similar to running the timm training script on 8 GPUs as sketched below, with the exception of the warmup, which I resolved locally.
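Since the exact command didn't survive here, this is roughly the shape of the run, assuming timm's `distributed_train.sh` and the 21K-pretrained model name; the data path, per-GPU batch split, and epoch count are approximations rather than the exact flags:

```sh
# Sketch only -- a reconstruction, not the exact command from the post.
# 64 per GPU x 8 GPUs = 512 global batch; 8 epochs ~= 20K steps over ~1.28M images.
# timm's warmup is epoch-based, hence the local fix needed for the 500-step warmup.
./distributed_train.sh 8 /path/to/imagenet \
  --model vit_base_patch16_224_in21k --pretrained --num-classes 1000 \
  -b 64 --opt sgd --momentum 0.9 --weight-decay 0 \
  --lr 0.01 --sched cosine --epochs 8 --warmup-epochs 1 \
  --clip-grad 1.0
```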
The pretrained checkpoint for vit_base_patch16_224 reaches an accuracy of 84.4%, while the training above reaches 83.65%.
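For reference, the 84.4% baseline can be checked against the released weights with timm's `validate.py`; a minimal sketch, assuming a standard ImageNet validation folder (the path is a placeholder):

```sh
# Sanity check of the released checkpoint's accuracy, not from the original post.
python validate.py /path/to/imagenet/val --model vit_base_patch16_224 --pretrained -b 256
```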
Did anyone manage to reproduce the reported accuracy of ViT?