Labels: bug (Something isn't working)
Description
Name and Version
version: 5327 (27ebfca)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
CUDA
Hardware
nvidia
Models
all
Problem description & steps to reproduce
After the user prompt is provided, the code enters this branch (tools/main/main.cpp, line 716 at commit 0cf6725):

`LOG_DBG("embd_inp.size(): %d, n_consumed: %d\n", (int) embd_inp.size(), n_consumed);`

No new token is generated in this branch.
However, the code that follows (line 824 at the same commit) assumes that a new token exists and appends it to the assistant response:

`assistant_ss << common_token_to_piece(ctx, id, false);`
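If I read the flow correctly, a fix would need to guard that append so it only happens on iterations where a token was actually sampled. Below is a minimal, self-contained sketch of that guard pattern; `maybe_sample_token` and `prompt_fully_consumed` are hypothetical stand-ins, not llama.cpp APIs, and this is not the actual control flow of main.cpp:

```cpp
#include <iostream>
#include <optional>
#include <sstream>

// Hypothetical stand-in for the sampling step: returns a token id only when
// the prompt has been fully consumed and a new token was actually sampled.
static std::optional<int> maybe_sample_token(bool prompt_fully_consumed) {
    if (!prompt_fully_consumed) {
        return std::nullopt;  // still consuming embd_inp: no new token this iteration
    }
    return 42;                // sampling path: a real id would come from the sampler
}

int main() {
    std::ostringstream assistant_ss;

    // Append to the assistant buffer only when a token was generated, instead of
    // unconditionally appending whatever `id` happens to hold.
    if (auto id = maybe_sample_token(/*prompt_fully_consumed=*/false)) {
        assistant_ss << "piece-for-token-" << *id;  // stand-in for common_token_to_piece()
    }

    std::cout << "assistant: \"" << assistant_ss.str() << "\"\n";  // empty: nothing appended
    return 0;
}
```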
First Bad Commit
No response
Relevant log output
The easiest way to observe this is to set a breakpoint here and wait for the assistant message:
https://github.com/ggml-org/llama.cpp/blob/0cf6725e9f9a164c39f7a87214d60342f7f946d8/tools/main/main.cpp#L270
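For reference, one way to set that breakpoint with gdb (the binary path and model path below are placeholders; adjust them to your build and setup):

```
gdb --args ./build/bin/llama-cli -m /path/to/model.gguf
(gdb) break tools/main/main.cpp:270
(gdb) run
```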