Replies: 1 comment
Did you find a solution?
I want to use a custom patch_embedding layer for ViT or DeiT. Basically, I want to do video classification with ViT and treat each frame of the video as a patch. I use a pretrained backbone like EffNet to generate features from the frames and send the list of features to the ViT, so I need to skip the patch embedding layer entirely. Any idea how to do this? I want to keep the rest of the pretrained weights and only change the patch embedding layer.
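One way to do this with timm would be to swap the patch_embed module for an identity and feed the transformer a (batch, num_tokens, embed_dim) tensor of per-frame features directly. Below is a rough sketch, not a definitive recipe: the backbone (efficientnet_b0), the linear projection, and the 16-frame setup are my assumptions, and the positional embedding has to be re-created because the pretrained one is sized for the original 14x14 patch grid.

```python
import torch
import torch.nn as nn
import timm

NUM_FRAMES = 16  # assumed clip length; each frame becomes one ViT token

vit = timm.create_model("vit_base_patch16_224", pretrained=True)
embed_dim = vit.embed_dim  # 768 for vit_base

# 1) Skip the conv patch embedding: inputs will already be token features.
vit.patch_embed = nn.Identity()

# 2) The pretrained pos_embed is sized for 196 patches (+ cls token), so
#    create a new one matching NUM_FRAMES (+ cls) and train it from scratch.
vit.pos_embed = nn.Parameter(torch.zeros(1, NUM_FRAMES + 1, embed_dim))
nn.init.trunc_normal_(vit.pos_embed, std=0.02)

# 3) Per-frame feature extractor (any backbone), projected to the ViT width.
effnet = timm.create_model("efficientnet_b0", pretrained=True, num_classes=0)
proj = nn.Linear(effnet.num_features, embed_dim)

def forward_video(frames):
    """frames: (B, T, 3, H, W) -> class logits; one token per frame."""
    b, t = frames.shape[:2]
    feats = effnet(frames.flatten(0, 1))         # (B*T, effnet_dim)
    tokens = proj(feats).view(b, t, embed_dim)   # (B, T, embed_dim)
    return vit(tokens)                           # patch_embed is Identity now
```

With this setup the transformer blocks, cls token, and head keep their pretrained weights; only patch_embed and pos_embed are replaced.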
Also, is there any way to load pretrained weights while changing the patch size or number of patches? I tried to do so, but it gives a size mismatch error for the pos_embed layer.
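For the pos_embed mismatch, the usual workaround is to 2D-interpolate the positional grid from the checkpoint to the new grid before loading the state dict (timm ships helpers for this in some versions, but the manual version is short). A sketch, assuming a standard ViT with a single cls token and a square patch grid; the source/target model names below are only examples, and attribute names like patch_embed.grid_size may differ across timm versions:

```python
import torch
import torch.nn.functional as F
import timm

def resize_pos_embed(pos_embed, new_grid):
    """Interpolate a (1, 1 + old_h*old_w, D) pos_embed to a new patch grid."""
    cls_tok, grid = pos_embed[:, :1], pos_embed[:, 1:]
    old = int(grid.shape[1] ** 0.5)                      # assumes square grid
    grid = grid.reshape(1, old, old, -1).permute(0, 3, 1, 2)
    grid = F.interpolate(grid, size=new_grid, mode="bicubic", align_corners=False)
    grid = grid.permute(0, 2, 3, 1).reshape(1, new_grid[0] * new_grid[1], -1)
    return torch.cat([cls_tok, grid], dim=1)

# Example: transfer patch16 weights into a patch32 model of the same width.
src = timm.create_model("vit_base_patch16_224", pretrained=True)
dst = timm.create_model("vit_base_patch32_224", pretrained=False)

state = src.state_dict()
state["pos_embed"] = resize_pos_embed(state["pos_embed"],
                                      dst.patch_embed.grid_size)
# The patch_embed conv weights also change shape with the patch size, so drop
# them and let them train from scratch (or resize them separately).
state = {k: v for k, v in state.items() if not k.startswith("patch_embed.")}
missing, unexpected = dst.load_state_dict(state, strict=False)
```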