performance on mobile phone such as MTK D9000/D8300 or Qualcomm 8Gen3 #39

yuimo · 2024-09-04T09:25:50Z

Hi, can you share some performance data on MTK or Qualcomm chips,
such as the prefill and decode speed of Qwen or Gemma models?
Thanks very much.

kaleid-liner · 2024-09-04T15:29:57Z

kaleid-liner · 2024-09-04T15:36:20Z

Performance should also improve by up to 1.5x soon, once the latest llama.cpp is merged. See #32 (comment) for more info.

kaleid-liner added the duplicate label Sep 4, 2024
