Thanks for the great work.
It uses 32-bit integers for the activation and softmax computations.
However, the self-attention result cannot exceed 26 bits: an 8-bit × 8-bit product needs 16 bits, and accumulating over 768 channels (< 2^10) adds at most another 10 bits, so 8 + 8 + 10 = 26 bits.
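As a quick sanity check of that bound, here is a minimal standalone sketch (not code from this repository) that computes the worst-case int8 × int8 dot product over an assumed hidden size of 768:

```python
import numpy as np

# Worst-case accumulator magnitude for an int8 x int8 dot product over
# 768 channels (hidden size assumed from the question above).
channels = 768
q = np.full(channels, -128, dtype=np.int8)   # largest int8 magnitude
k = np.full(channels, -128, dtype=np.int8)

# Accumulate in int64 so the intermediate sum cannot overflow.
acc = np.dot(q.astype(np.int64), k.astype(np.int64))

print(acc)                     # 12582912 = 128 * 128 * 768
print(int(acc).bit_length())   # 24 -> comfortably within the 26-bit bound
```

Note that this worst-case value (about 1.2e7) is far larger than what a 16-bit integer can hold, so storing the raw accumulator in 16 bits would overflow; a 16-bit variant would presumably have to requantize or shift the accumulator instead.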
I would like to try computing this result with 16-bit precision (i.e., quantizing to 16 bits, including the softmax and GELU algorithms).
Would 16 bits cause any problems?
If not, I would like to know how to implement it.