-
@sayakpaul the code here is correct with respect to preprocessing; I matched what the official code implementation did. Any discrepancies with the paper are a separate issue. Both use the same [-1, 1] range.
-
As far as I know, the BiT and ViT (prefixed with jx) models were converted from the official checkpoints released by Google. So I wanted to know: do BiT and ViT follow the same preprocessing steps as the official implementations during inference? Generally, we normalize with each model's respective mean and std, but that might not be the case here.
For example, BiT just scales the pixel values to the [0, 1] range (source), while ViT scales them to [-1, 1] (source).
A clarification would be helpful.
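To make the three conventions in question concrete, here is a minimal NumPy sketch. The function names are hypothetical and are not taken from any library or from the official BiT/ViT code; it only illustrates the arithmetic, assuming 8-bit RGB input.

```python
import numpy as np


def scale_unit(pixels):
    """Map uint8 pixel values to the [0, 1] range."""
    return pixels.astype(np.float32) / 255.0


def scale_symmetric(pixels):
    """Map uint8 pixel values to the [-1, 1] range."""
    return pixels.astype(np.float32) / 127.5 - 1.0


def normalize(pixels, mean, std):
    """Scale to [0, 1], then apply per-channel mean/std normalization."""
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (scale_unit(pixels) - mean) / std


# A single RGB pixel covering the darkest, middle, and brightest values.
image = np.array([[[0, 128, 255]]], dtype=np.uint8)

unit = scale_unit(image)
symmetric = scale_symmetric(image)
# Assumption for illustration: mean = std = 0.5 on every channel.
half_normalized = normalize(image, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
```

With mean and std both set to 0.5, the mean/std path gives the same values as scaling directly to [-1, 1], so a [-1, 1] preprocessing step can be expressed through an ordinary normalization config.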