-
Hi, I just came across the ViT checkpoint used in this repo. As indicated in the results here, vit_base_patch16_384 achieves 86.006 vs. the 84.15 reported in the paper (Table 5). There is a large gap between these two scores. I understand some of the improvement can come from a longer training schedule and better hyper-parameter choices, but since the +2 point boost is quite large, I wonder what causes it. Is there some special technique applied here? Where can I find the corresponding config file, checkpoint, and training details for the scores reported in results-imagenet.csv? Many thanks!
-
@jojo23333 the checkpoints were updated with the best options from this paper that I was a part of:
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers -- https://arxiv.org/abs/2106.10270
More augmentation and regularization was used w/ the 21k pretraining, and a search over both those and the transfer hparams was performed.
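For anyone trying to reproduce the csv number, here is a minimal sketch (assuming the timm library; these are real timm APIs, but the snippet is an illustration, not the script that produced the csv) of pulling the current weights and the matching eval preprocessing:

```python
import timm
import torch

# pretrained=True downloads whatever checkpoint is currently published for
# this model name, i.e. the updated weights behind the 86.006 entry.
model = timm.create_model('vit_base_patch16_384', pretrained=True)
model.eval()

# Resolve the model's own eval preprocessing (384x384 input size,
# interpolation, normalization) rather than hand-writing a transform.
config = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**config)

# Sanity-check the forward pass on a dummy 384x384 input.
with torch.no_grad():
    logits = model(torch.randn(1, 3, 384, 384))
print(logits.shape)  # torch.Size([1, 1000])
```

Running the repo's validate.py with this model name over the ImageNet validation set should reproduce the reported score.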