The perplexity tool returns abnormal values #70
Comments
@ppp-max Which models are you testing? And did you check llama-cli to see whether the output tokens are normal? Recently we found that some EfficientQAT Llama-2-7b models have vocab_size=32001, while meta/Llama-2-7b has vocab_size=32000; this makes the perplexity abnormally high. After hacking it to force vocab_size=32000 (removing the last token), we got correct PPL numbers. You can see our PR to llama.cpp for the numbers.
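For anyone hitting the same symptom, here is a minimal sketch (not taken from the thread) of how one might spot the mismatch described above, using the Hugging Face transformers API; the checkpoint path is hypothetical:

```python
# Hedged sketch: compare a checkpoint's declared vocab_size with the 32000 of
# the original meta Llama-2-7b. A mismatch (e.g. 32001) is the case the
# maintainer reports as producing abnormally high PPL after GGUF conversion.
from transformers import AutoConfig

LLAMA2_VOCAB_SIZE = 32000  # vocab size of the original meta Llama-2-7b

checkpoint = "path/to/Llama-2-7b-EfficientQAT-w2g128-GPTQ"  # hypothetical local path or hub ID
vocab_size = AutoConfig.from_pretrained(checkpoint).vocab_size

print(f"checkpoint vocab_size = {vocab_size}")
if vocab_size != LLAMA2_VOCAB_SIZE:
    print(f"Mismatch with the original Llama-2-7b ({LLAMA2_VOCAB_SIZE}); "
          "this is the situation reported to blow up the perplexity numbers.")
```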
The models I used are llama-2-7b-chat.Q4_0.gguf and llama-2-7b-chat.Q2_K.gguf, downloaded from https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF. And how do I hack and force vocab_size to 32000? Thanks.
@ppp-max I think it's better to use the non-chat version of the models to test PPL. From our tests, the chat version gives slightly higher PPL numbers, but still below 10. We've tested a Q4_0 model (downloaded from meta/Llama-2-7b and quantized using llama-quantize), on which the original llama.cpp, kaleid-liner llama.cpp and T-MAC got almost the same PPL (5.961764, 5.962298, 5.962719). For the vocab_size problem, have you checked the llama-cli output tokens? If the output is random tokens instead of human sentences, you should probably first check other parts, e.g. the configuration, build, command options, etc. If the generated tokens are normal, you can check …
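To reproduce the comparison described above, the sketch below automates running the perplexity tool on the same GGUF file for two builds and diffing the results. The llama-perplexity binary name, its -m/-f options, and the "Final estimate: PPL =" output line are assumptions based on upstream llama.cpp and the perplexity README linked later in this issue; all paths are hypothetical.

```python
# Hedged sketch: run a llama.cpp-style perplexity tool on the same model and
# wikitext-2 test file for two builds, then compare the reported PPL values.
import re
import subprocess

def measure_ppl(perplexity_bin: str, model_path: str, text_file: str) -> float:
    """Run the perplexity tool and parse the final PPL estimate from its output."""
    proc = subprocess.run(
        [perplexity_bin, "-m", model_path, "-f", text_file],
        capture_output=True, text=True, check=True,
    )
    # Assumed output format of upstream llama.cpp: "Final estimate: PPL = 5.9618 +/- ..."
    match = re.search(r"Final estimate: PPL = ([0-9.]+)", proc.stdout + proc.stderr)
    if match is None:
        raise RuntimeError("could not find a PPL estimate in the tool output")
    return float(match.group(1))

if __name__ == "__main__":
    # Hypothetical binaries and paths; point these at your own builds and data.
    upstream = measure_ppl("./llama.cpp/llama-perplexity", "llama-2-7b.Q4_0.gguf", "wiki.test.raw")
    tmac     = measure_ppl("./t-mac/llama-perplexity",     "llama-2-7b.Q4_0.gguf", "wiki.test.raw")
    print(f"upstream llama.cpp PPL: {upstream:.4f}   T-MAC PPL: {tmac:.4f}")
```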
@ppp-max I notice that your issue #61 mentioned that you used Llama-2-7b-EfficientQAT-w2g128-GPTQ and Llama-2-7b-EfficientQAT-w4g128-GPTQ. Those are where I found the vocab size problem. My hack is quite tricky and temporary, so to be honest I don't want to put it here, but you can use it as a temporary solution like me. I forcibly set … Hope this helps.
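Since the actual hack was not posted, the following is only a hedged sketch of one way to force the vocabulary back to 32000 before GGUF conversion, using transformers' resize_token_embeddings to drop the trailing token. It is not the maintainer's method; the paths are hypothetical, and a GPTQ-quantized checkpoint may need extra loading support (e.g. auto-gptq).

```python
# Hedged sketch: shrink the embedding and lm_head matrices of a checkpoint whose
# vocab_size is 32001 back to 32000, then save it for GGUF conversion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "path/to/Llama-2-7b-EfficientQAT-w2g128"             # hypothetical source checkpoint
dst = "path/to/Llama-2-7b-EfficientQAT-w2g128-vocab32000"  # hypothetical output directory

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
model.resize_token_embeddings(32000)   # shrinking drops the extra trailing token row
model.config.vocab_size = 32000        # set explicitly for clarity
model.save_pretrained(dst)

# Keep the tokenizer alongside the trimmed weights for the conversion step.
AutoTokenizer.from_pretrained(src).save_pretrained(dst)
```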
Hello, sorry to bother you.
I tested the PPL of llama.cpp and T-MAC, and the values are abnormal: 110682 and 53515, which are far too large, while the normal value should be very small. I then tested the latest llama.cpp's (https://github.com/ggerganov/llama.cpp) PPL, which is normal (about 6~9).
https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md
Have you tested the PPL data, or does it need additional processing?
Thank you for your assistance!