Regarding the preprocessing steps for BiT and ViT models inside timm #591

sayakpaul · 2021-04-27T17:42:47Z

As far as I know, the BiT and ViT (prefixed with jx) models were converted from the official checkpoints released by Google. So I wanted to know: do BiT and ViT follow the same preprocessing steps as the official implementations during inference? Generally, we use the respective mean and std to carry out normalization, but that might not be the case here.

For example, BiT just scales the pixel values to the [0, 1] range (source), while ViT scales them to [-1, 1] (source).

A clarification would be helpful.

rwightman · 2021-04-27T17:58:44Z

Maintainer

@sayakpaul the code here is correct wrt preprocessing; I matched what the official code impl did. Any discrepancies with the paper are a separate issue. Both use the same [-1, 1] range.
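
For readers comparing the two conventions discussed here, the following is a minimal sketch in plain Python of scaling 8-bit pixel values to [0, 1] versus [-1, 1]. It also checks that the [-1, 1] mapping is arithmetically the same as scaling to [0, 1] and then normalizing with mean 0.5 and std 0.5. The function names are hypothetical and are not part of timm or the official BiT/ViT code. Expressing the [-1, 1] range as a mean/std of 0.5 is an assumption about how one might write it, not a statement about how the library configures these models.

```python
def scale_unit(pixel):
    """Map an 8-bit pixel value (0..255) to the [0, 1] range."""
    return pixel / 255.0


def scale_symmetric(pixel):
    """Map an 8-bit pixel value (0..255) to the [-1, 1] range."""
    return pixel / 127.5 - 1.0


def normalize(value, mean, std):
    """Standard (value - mean) / std normalization of a [0, 1] input."""
    return (value - mean) / std


# Hypothetical sample pixel values: darkest, mid-gray, brightest.
for p in (0, 128, 255):
    unit = scale_unit(p)
    sym = scale_symmetric(p)
    # Assumption: mean=0.5, std=0.5 applied on top of [0, 1] scaling.
    via_norm = normalize(unit, mean=0.5, std=0.5)
    # Both routes land on the same value up to floating-point error.
    assert abs(sym - via_norm) < 1e-9
    print(p, unit, sym)
```

If a model was trained on inputs in one range and evaluated on inputs in the other, the values it sees at inference are shifted and rescaled relative to training, which is why matching the preprocessing of the original implementation matters.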

1 reply

sayakpaul Apr 27, 2021
Author

I see. Shouldn't the BiT ones be [0, 1]?
