Add optional “author-recommended sampling defaults” (temperature, top_p, top_k, etc.) to GGUF metadata and llama.cpp runtime #17088
Replies: 1 comment 1 reply
Not a bad idea actually, especially for those who want to get the model running as quickly as possible without fiddling with the settings manually. I can envision a few key-value pairs being introduced into GGUF:
And conceptually, during initialisation,

Would you like me to convert this to an enhancement request to see if anyone is interested in this? :)
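To make the idea in this reply a bit more concrete, here is a minimal Python sketch of how optional key-value pairs might be consulted during initialisation. Everything in it is an assumption for illustration: the `general.sampling.*` key names, the `SamplingParams` type, the `resolve_sampling` helper, and the placeholder fallback values are hypothetical and are not taken from the GGUF spec or from llama.cpp. A plain dict stands in for the metadata read from the file. The only point it shows is the precedence: explicit user settings win over author-recommended values, which win over built-in fallbacks.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Mapping, Optional

# Hypothetical key prefix; the discussion does not name the actual keys.
KEY_PREFIX = "general.sampling."


@dataclass(frozen=True)
class SamplingParams:
    # Placeholder fallbacks used when neither the model nor the user says otherwise.
    temperature: float = 0.8
    top_p: float = 0.95
    top_k: int = 40


def resolve_sampling(
    metadata: Mapping[str, Any],
    cli_overrides: Optional[Mapping[str, Any]] = None,
    use_model_defaults: bool = True,
) -> SamplingParams:
    """Merge settings with precedence: CLI > model metadata > built-in fallbacks."""
    names = [f.name for f in fields(SamplingParams)]
    params = SamplingParams()

    if use_model_defaults:
        # Pick up only the keys the model author actually provided.
        from_model = {
            n: metadata[KEY_PREFIX + n] for n in names if KEY_PREFIX + n in metadata
        }
        params = replace(params, **from_model)

    # Values the user set explicitly always take priority; None means "not set".
    from_cli = {
        n: v for n, v in (cli_overrides or {}).items() if n in names and v is not None
    }
    return replace(params, **from_cli)
```

With metadata of `{"general.sampling.temperature": 0.6}` and a user override of `{"top_k": 50}`, the result would use the author's temperature, the user's top_k, and the fallback top_p. Passing `use_model_defaults=False` corresponds to the "optional" part of the proposal: the embedded values are ignored entirely.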
Hi,
first of all, thanks a lot for all the work on llama.cpp!
I’m just getting started with it after using Ollama for local models, and I really like how flexible llama.cpp is.
One thing I noticed though: many models on Hugging Face mention “recommended” generation settings (like temperature, top_p, top_k, etc.), and Ollama’s Modelfile system can automatically apply those when you load a model.
In llama.cpp, I have to set those values manually each time (for example with --temp, --top-p, --top-k, etc.).
Would it be possible (or make sense) for GGUF files to include these “default / recommended” values from the model author, and for llama.cpp to optionally use them automatically when loading a model?
I think that would make it easier for new users to get the same output quality the model authors intended, without having to guess or look up those settings.
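For comparison, this is roughly what the manual step looks like today when scripted. It is a small Python sketch that turns a model card's recommended settings into the flags mentioned above. The `FLAG_FOR` table and the `sampling_args` helper are hypothetical names for this example, and the settings keys (`temperature`, `top_p`, `top_k`) are just the sketch's own naming; only the three flags themselves come from the post.

```python
from typing import Dict, List, Mapping, Union

Number = Union[int, float]

# Maps this sketch's setting names to the command-line flags named in the post.
FLAG_FOR: Dict[str, str] = {
    "temperature": "--temp",
    "top_p": "--top-p",
    "top_k": "--top-k",
}


def sampling_args(recommended: Mapping[str, Number]) -> List[str]:
    """Build the flag list for whichever recommended settings are present."""
    args: List[str] = []
    for name, flag in FLAG_FOR.items():
        if name in recommended:
            # Each setting becomes a flag followed by its value as a string.
            args += [flag, str(recommended[name])]
    return args
```

The returned list can be appended to whatever llama.cpp command you already run. The downside is the one described above: the values still have to be looked up and copied by hand for every model.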
Thanks again for maintaining this awesome project!
I just wanted to share the idea in case it fits into your plans for GGUF or llama.cpp in the future.