The current API is great for producing a text response, but an option that exposed the logprobs for each streamed token would let us build a lot more functionality on top of the model, for example:

- basic guidance
- estimating confidence levels (see the sketch after this list)
- collecting multiple branches of output more efficiently
- custom token heuristics instead of the built-in temperature/topK (I saw another proposal to add a seed option, but this would let you build that yourself)
- and more
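To make the confidence-level case concrete, here is a minimal sketch. The `TokenLogprob` shape is an assumption mirroring the per-token entries in the OpenAI-style response shown further down, and the 0.5 cutoff is arbitrary:

```ts
// Assumed shape of one streamed token entry; mirrors the OpenAI-style
// format shown below, not a settled format for this API.
interface TokenLogprob {
  token: string;
  logprob: number;
}

// A logprob is ln(p), so Math.exp recovers the probability the model
// assigned to the emitted token: exp(-0.31725305) ≈ 0.728.
function tokenConfidence(entry: TokenLogprob): number {
  return Math.exp(entry.logprob);
}

// Flag tokens the model was unsure about, e.g. to surface a warning or
// to re-sample; the 0.5 threshold is arbitrary.
function lowConfidenceTokens(entries: TokenLogprob[], threshold = 0.5): TokenLogprob[] {
  return entries.filter((entry) => tokenConfidence(entry) < threshold);
}
```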
Basically, it could be modeled on the `top_logprobs` parameter in the OpenAI API, which returns something like this for `top_logprobs=2`:
```json
{
  "logprobs": {
    "content": [
      {
        "token": "Hello",
        "logprob": -0.31725305,
        "top_logprobs": [
          {
            "token": "Hello",
            "logprob": -0.31725305
          },
          {
            "token": "Hi",
            "logprob": -1.3190403
          }
        ]
      },
      {
        "token": "!",
        "logprob": -0.02380986,
        "top_logprobs": [
          {
            "token": "!",
            "logprob": -0.02380986
          },
          {
            "token": " there",
            "logprob": -3.787621
          }
        ]
      },
      {
        "token": " How",
        "logprob": -0.000054669687,
        "top_logprobs": [
          {
            "token": " How",
            "logprob": -0.000054669687
          },
          {
            "token": "<|end|>",
            "logprob": -10.953937
          }
        ]
      }
      // etc
    ]
  }
}
```
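For illustration, here is a rough sketch of how this might look from the caller's side. Every name in it (`createSession`, `topLogprobs`, `promptStreaming`, the chunk shape) is a placeholder invented for this sketch, not actual or proposed API surface; the factory replays one chunk of the example response above so the snippet runs on its own:

```ts
// All names below are placeholders invented for this sketch.
interface LogprobChunk {
  token: string;
  logprob: number;
  top_logprobs: { token: string; logprob: number }[];
}

interface Session {
  promptStreaming(input: string): AsyncIterable<LogprobChunk>;
}

// Stand-in factory so the sketch is self-contained: it replays one chunk
// of the example response above instead of calling a real model.
async function createSession(_options: { topLogprobs?: number }): Promise<Session> {
  return {
    async *promptStreaming(_input: string) {
      yield {
        token: "Hello",
        logprob: -0.31725305,
        top_logprobs: [
          { token: "Hello", logprob: -0.31725305 },
          { token: "Hi", logprob: -1.3190403 },
        ],
      };
    },
  };
}

async function demo(): Promise<void> {
  const session = await createSession({ topLogprobs: 2 });
  for await (const chunk of session.promptStreaming("Say hello")) {
    // Math.exp(logprob) recovers the model's probability for the token.
    console.log(chunk.token, Math.exp(chunk.logprob).toFixed(3), chunk.top_logprobs);
  }
}

demo();
```

The key point is just that each streamed token carries its own logprob plus the runner-up candidates, so all of the use cases listed above can be built in userland without any further API changes.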