Thanks for the great work.
It uses 32-bit integers for the activation and softmax computations.
However, the self-attention result cannot exceed 26 bits: an 8-bit × 8-bit product needs 16 bits, and accumulating over 768 channels (< 2^10) adds at most another 10 bits, so 8 + 8 + 10 = 26 bits.
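As a quick sanity check of that bound, here is a minimal standalone sketch (not code from this repository) that computes the worst-case int8 × int8 dot product over an assumed hidden size of 768:

```python
import numpy as np

# Worst-case accumulator magnitude for an int8 x int8 dot product over
# 768 channels (hidden size assumed from the question above).
channels = 768
q = np.full(channels, -128, dtype=np.int8)   # largest int8 magnitude
k = np.full(channels, -128, dtype=np.int8)

# Accumulate in int64 so the intermediate sum cannot overflow.
acc = np.dot(q.astype(np.int64), k.astype(np.int64))

print(acc)                     # 12582912 = 128 * 128 * 768
print(int(acc).bit_length())   # 24 -> comfortably within the 26-bit bound
```

Note that this worst-case value (about 1.2e7) is far larger than what a 16-bit integer can hold, so storing the raw accumulator in 16 bits would overflow; a 16-bit variant would presumably have to requantize or shift the accumulator instead.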
I would like to try computing this result with 16-bit precision (i.e., quantizing to 16 bits, including the softmax and GELU algorithms).
Would 16 bits cause any problems?
If not, I would like to know how to implement it.