Replies: 3 comments 1 reply
I don't think it makes much sense to have it, as you either use what the model and/or chat template defaults to, set it in the UI, or alternatively pass it through the API.
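For example, passing it per request through the server's OpenAI-compatible endpoint looks roughly like this (port, prompt text and question are placeholders):

```bash
# send the system prompt with every request instead of baking it into the server
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "system", "content": "You are the assistant of our local community wiki."},
          {"role": "user",   "content": "How do I reset my password?"}
        ]
      }'
```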
I understand the point about using the web UI or curl for ad-hoc prompts, but that doesn't fully cover certain specialized use cases. Template defaults are indeed a way to achieve my goal, but they are not generic enough, and a bit complex to modify since each model carries its own. For instance, with large-context LLMs, one could configure a `llama-server` web UI that is easy for everybody to use, share local knowledge through it, and turn it into a dedicated chatbot for a local community. In such a scenario, preloading a system prompt, depending on its size, could be highly efficient and practical for this tailored, always-on setup. Would it be worth reconsidering the `--system-prompt-file` option for these kinds of deployments?
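For completeness, this is roughly the template-defaults workaround I was referring to: hard-code the system prompt in a custom ChatML-style template and load it at startup (a sketch, assuming your build has the `--jinja` and `--chat-template-file` options; file names and the prompt are placeholders):

```bash
# sketch: write a ChatML-style template with a baked-in system prompt
cat > community.jinja << 'EOF'
{{- '<|im_start|>system\nYou are the assistant of our local community wiki.<|im_end|>\n' -}}
{%- for message in messages -%}
  {%- if message.role != 'system' -%}
    {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' -}}
  {%- endif -%}
{%- endfor -%}
{%- if add_generation_prompt -%}
  {{- '<|im_start|>assistant\n' -}}
{%- endif -%}
EOF

# load the custom template instead of the one embedded in the model
llama-server -m ./models/qwen3.gguf --jinja --chat-template-file community.jinja
```

The drawback is exactly what I mentioned above: each model family uses its own markers, so the template has to be rewritten per model.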
I successfully preloaded a huge system prompt (a 260 kB text file, roughly 100k tokens) using Qwen3 with a 256k-token context. Here are the changes; I have a few questions about polishing them:
https://github.com/ggml-org/llama.cpp/compare/master...d-a-v:llama.cpp:serversystemprompt?expand=1
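If it helps with review, this is how I start the server with that branch (a sketch, assuming the branch keeps the same flag name as `llama-cli`; the model path is a placeholder and the context size matches my local setup):

```bash
# preload the 260 kB system prompt once at startup, with a 256k-token context
llama-server -m ./models/qwen3.gguf \
  --system-prompt-file ./community-knowledge.txt \
  -c 262144
```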
The option `--system-prompt-file` is available with `llama-cli` but not with `llama-server`. What would be the way to get such a feature with the currently available options in `llama-server`? I tried to get and update the chat template from the model, with no luck.
Is it worth asking for a feature request?
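For reference, this is the `llama-cli` behaviour I would like to mirror on the server side (paths are placeholders):

```bash
# works with llama-cli: load the system prompt from a file at startup
llama-cli -m ./models/model.gguf --system-prompt-file ./system-prompt.txt
```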