Performance on mobile phones such as MTK D9000/D8300 or Qualcomm 8Gen3 #39

Open
yuimo opened this issue Sep 4, 2024 · 2 comments
Labels: duplicate (This issue or pull request already exists)

Comments

yuimo commented Sep 4, 2024

Hi, can you share some performance data on MTK or Qualcomm chips,
such as the prefill and decode speeds of Qwen or Gemma models?
Thanks very much.

kaleid-liner added the duplicate label Sep 4, 2024
@kaleid-liner
Collaborator

Check #32 (comment)

@kaleid-liner
Collaborator

Performance will also improve soon, by up to 1.5x, once the latest llama.cpp is merged. Refer to #32 (comment) for more info.
