Details: https://arxiv.org/pdf/2402.19427.pdf
PR from gemma.cpp: https://github.com/google/gemma.cpp/pull/136