If you add opus as a backend via api, then add a model via the api that should use llama.cpp or vllm, it then tries to use opus #9287

@JohnGalt1717

Description

LocalAI version:
v4.1.3 (fdc9f7b)

Environment, CPU architecture, OS, and Version:
Docker, X64, Ubuntu 24.04

Describe the bug
(I'm scripting an install to deploy to multiple people.)
If you curl to add the opus backend (which is required for realtime) and opus loads, and you then script the install of any LLM model that would normally use llama.cpp, LocalAI tries to run the model with opus instead of the llama.cpp backend specified in the model's configuration, which should have triggered a download of that backend.

To Reproduce

payload='{"id":"opus"}'
response="$(curl -fsS -X POST "${LOCALAI_BASE_URL}/backends/apply" \
    "${AUTH_HEADER[@]}" \
    -H "Content-Type: application/json" \
    -d "${payload}")"

payload='{"id":"Qwen3.5-9b"}'
response="$(curl -fsS -X POST "${LOCALAI_BASE_URL}/models/apply" \
    "${AUTH_HEADER[@]}" \
    -H "Content-Type: application/json" \
    -d "${payload}")"
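For completeness, the apply endpoints return a job handle that can be polled to watch the download/install. A minimal sketch, assuming the `{"uuid": ...}` response shape and the `/models/jobs/<uuid>` polling path described in LocalAI's model-gallery docs (verify against your version):

```shell
# Example apply response (shape assumed from LocalAI's model-gallery docs).
response='{"uuid":"abc-123","status":"processing"}'

# Extract the job uuid so the install can be polled at /models/jobs/<uuid>.
uuid="$(printf '%s' "${response}" | sed -n 's/.*"uuid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
echo "poll: ${LOCALAI_BASE_URL}/models/jobs/${uuid}"
```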


Expected behavior
The model should use the proper backend specified in its default configuration.

One would also expect that, if a model supports multiple backends (e.g. llama.cpp and vllm-omni), you could choose the backend as part of the request payload.
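If backend selection were supported in the request, the payload might look like the sketch below. The `backend` field is purely hypothetical, used to illustrate the requested behavior; it is not, to my knowledge, part of the current /models/apply API.

```shell
# Hypothetical payload: "backend" is NOT a documented field of /models/apply;
# it only illustrates the backend-selection behavior requested above.
payload='{"id":"Qwen3.5-9b","backend":"vllm-omni"}'
echo "${payload}"
```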

Logs
N/A.
