Misc. bug: llama-server assistant prefill only works when message content is a string (not a list of objects) #14353

Open
@find0x90

Name and Version

.\llama-server --version
...
version: 5747 (0142961a)
built with clang version 18.1.8 for x86_64-pc-windows-msvc

Operating systems

Windows

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server.exe --host 0.0.0.0 --port 1234 --flash-attn --no-warmup --model ~\llm\models\google\gemma-3-27b-it-qat-q4_0-gguf\gemma-3-27b-it-q4_0.gguf --mmproj ~\llm\models\google\gemma-3-27b-it-qat-q4_0-gguf\mmproj-model-f16-27B.gguf --gpu-layers 63 --temp 1.0 --repeat-penalty 1.0 --min-p 0.01 --top-k 64  --top-p 0.95 --cache-type-k q8_0 --cache-type-v q8_0 --ctx-size 16384

Problem description & steps to reproduce

In the example below, I send the same conversation twice: first with the assistant message's "content" as a list of objects, then with "content" as a string. In the first case, the server does not continue the assistant message and instead generates a fresh reply. In the second case, where "content" is a string, the server continues the assistant message as expected.

[user@88947fa9b7b9 ~]$ curl http://localhost:1234/v1/chat/completions -s --json '{
    "model": "Gemma3-27B",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "Ahoy-"
                }
            ]
        }
    ]
}' | jq '.choices[0].message.content'
"Hello to you too! 👋 \n\nIt's nice to meet you (virtually, of course!). How can I help you today? Do you have any questions, need some ideas, or just want to chat? \n\nLet me know what's on your mind! 😊\n"
[user@88947fa9b7b9 ~]$ curl http://localhost:1234/v1/chat/completions -s --json '{
    "model": "Gemma3-27B",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        },
        {
            "role": "assistant",
            "content": "Ahoy-"
        }
    ]
}' | jq '.choices[0].message.content'
"hoy! 👋 \n\nHello to *you* too! How can I help you today? Are you looking to:\n\n* **Chat?** Just want someone to talk to?\n* **Brainstorm ideas?** \n* **Get information?** (I can try my best to answer your questions!)\n* **Write something?** (Stories, poems, code, etc.)\n* **Something else?**\n\nLet me know what's on your mind! 😊"

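Since the string form works, one possible client-side workaround is to collapse text-only content lists into a plain string before sending the request. The following is a minimal sketch, not a fix for the server: `flatten_text_content` and `build_payload` are hypothetical helper names, and joining the text parts with an empty string is an assumption about how the parts should be combined. Messages that contain non-text parts (for example images) are left unchanged.

```python
import json
from typing import Any


def flatten_text_content(message: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `message` with text-only list content joined into a string.

    Content that is already a string, an empty list, or a list containing any
    non-text part is returned unchanged.
    """
    content = message.get("content")
    if not isinstance(content, list) or not content:
        return dict(message)
    if not all(isinstance(p, dict) and p.get("type") == "text" for p in content):
        return dict(message)
    # Assumption: text parts are concatenated with no separator.
    return {**message, "content": "".join(p.get("text", "") for p in content)}


def build_payload(messages: list[dict[str, Any]], model: str = "Gemma3-27B") -> dict[str, Any]:
    """Build a chat completion request body with flattened message content."""
    return {"model": model, "messages": [flatten_text_content(m) for m in messages]}


if __name__ == "__main__":
    messages = [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": [{"type": "text", "text": "Ahoy-"}]},
    ]
    # Print the request body; it can be passed to curl via --json.
    print(json.dumps(build_payload(messages), indent=4))
```

The resulting body matches the second request above, where the assistant "content" is the string "Ahoy-".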
First Bad Commit

No response

Relevant log output
