The perplexity tool returns abnormal values #70

Open
ppp-max opened this issue Nov 11, 2024 · 4 comments

@ppp-max

ppp-max commented Nov 11, 2024

Hello, sorry to bother you.

I tested the PPL of llama.cpp with T-MAC and the values are abnormal: 110682 and 53515, which are far too large. We know the normal value should be small. I then tested the latest llama.cpp (https://github.com/ggerganov/llama.cpp), and its PPL is normal (about 6~9).

https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md

Have you tested the PPL, or does the PPL data need additional processing?

Thank you for your assistance!

@QingtaoLi1
Contributor

QingtaoLi1 commented Nov 11, 2024

@ppp-max Which models are you testing? And did you check with llama-cli whether the output tokens are normal?

Recently, we found that some EfficientQAT Llama-2-7b models have vocab_size=32001, while meta/Llama-2-7b has vocab_size=32000; this mismatch makes the perplexity abnormally high. After hacking it to force the size to 32000 (removing the last token), we got correct PPL numbers. You can see our PR to llama.cpp for the numbers.


@ppp-max
Author

ppp-max commented Nov 11, 2024

The models I used are llama-2-7b-chat.Q4_0.gguf and llama-2-7b-chat.Q2_K.gguf, downloaded from https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF.
I also got different PPL when testing the same GGUF with different llama.cpp versions (https://github.com/ggerganov/llama.cpp and https://github.com/kaleid-liner/llama.cpp).

And how do I hack it to force vocab_size to 32000? Thanks.

@QingtaoLi1
Contributor

QingtaoLi1 commented Nov 12, 2024

@ppp-max I think it's better to use the non-chat version of the models to test PPL. In our tests, the chat version gives slightly higher PPL numbers, but still below 10. We've tested a Q4_0 model (downloaded from meta/Llama-2-7b and quantized using llama-quantize), on which the original llama.cpp, kaleid-liner llama.cpp, and T-MAC got almost the same PPL (5.961764, 5.962298, 5.962719).

For the vocab_size problem, have you checked the llama-cli output tokens? If the output is random tokens instead of human sentences, you should probably first check other parts, e.g. the configuration, build, and command options. If the generated tokens are normal, you can check model.vocab.n_vocab or model.hparams.n_vocab, or the weight tensor shapes after loading the model, to see whether the problem is indeed vocab_size.
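For reference, a minimal sketch of such a check, assuming you add a temporary debug print inside llama.cpp right after the hparams and vocab are loaded (the exact member names and logging macro depend on your llama.cpp revision, so verify them against your source tree):

```cpp
// Temporary debug print (assumed member names; check them in your revision):
// compares n_vocab from the GGUF metadata with the number of tokens actually loaded.
LLAMA_LOG_INFO("%s: hparams.n_vocab = %u, vocab.id_to_token.size() = %zu\n",
               __func__, model.hparams.n_vocab, model.vocab.id_to_token.size());
```

You can also print the shapes of the token embedding and output weight tensors after loading; if one dimension is 32001 instead of 32000, the vocab_size mismatch is confirmed.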

@QingtaoLi1
Contributor

@ppp-max I notice that your issue #61 mentions that you used Llama-2-7b-EfficientQAT-w2g128-GPTQ and Llama-2-7b-EfficientQAT-w4g128-GPTQ. Those are where I found the vocab size problem.

My hack is quite tricky and temporary, so to be honest I don't want to put it here. But you can use it as a temporary solution like I did. I forcibly set model.hparams.n_vocab and model.vocab.n_vocab to 32000 after loading the model hparams and vocab, and resize model.vocab.id_to_token to 32000. Then, when reading the tensor info in ggml.c, change the tensor shape: if (info->ne[j] == 32001) { info->ne[j] = 32000; }
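For reference, a rough sketch of what that hack could look like (the member names follow the description above and the exact code locations depend on your llama.cpp revision, so treat this as an illustration, not a ready-made patch):

```cpp
// In llama.cpp, after loading the model hparams and vocab
// (assumed member names; verify them in your revision):
if (model.hparams.n_vocab == 32001) {
    model.hparams.n_vocab = 32000;
    model.vocab.n_vocab   = 32000;
    model.vocab.id_to_token.resize(32000);  // drop the extra trailing token
}

// In ggml.c, when reading the GGUF tensor info, shrink the mismatched dimension:
for (int j = 0; j < GGML_MAX_DIMS; ++j) {
    if (info->ne[j] == 32001) {
        info->ne[j] = 32000;
    }
}
```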

Hope this helps you.
