Open
Labels
enhancement (New feature or request), good first issue (Good for newcomers)
Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
The ability to load/unload cvectors and adjust their scale via the API, similar to the new LoRA scale/hot-swap feature recently implemented:
POST /lora-adapters: Set list of LoRA adapters
To disable an adapter, either remove it from the list below, or set scale to 0.
Request format
To know the id of the adapter, use GET /lora-adapters
[
{"id": 0, "scale": 0.2},
{"id": 1, "scale": 0.8}
]
I read in the change log that this was inspired by cvector scaling which is already implemented, so would it be possible to expose this via the API as well?
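For reference, the existing LoRA endpoint quoted above could be driven from a script roughly like this. It is a minimal sketch, not taken from the llama.cpp sources: the helper name `build_lora_request` and the base URL are assumptions for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_lora_request(base_url, scales):
    """Build (but do not send) a POST /lora-adapters request.

    `scales` maps adapter id -> scale. Per the README excerpt above,
    a scale of 0 disables the adapter.
    """
    # One {"id": ..., "scale": ...} object per adapter, ordered by id.
    payload = [
        {"id": adapter_id, "scale": scale}
        for adapter_id, scale in sorted(scales.items())
    ]
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/lora-adapters",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed address; adjust to wherever llama-server is listening.
req = build_lora_request("http://localhost:8080", {0: 0.2, 1: 0.8})
# urllib.request.urlopen(req) would send it to a running server.
```

A cvector endpoint shaped the same way would let a client reuse this kind of helper with only the path changed.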
### Motivation
During creative writing, I often use control vectors to steer the AI's responses.
To make that practical, I've written a wrapper API and a simple web UI with a slider for each vector, so I can adjust them as needed.
However, after each change to the scaling, or toggling a cvector on/off, I have to restart the llama-server and reload the model.
If we could get this in the llama-server API instead, it would make cvectors useful for a lot of other people, and I could do away with the entire wrapper server I wrote.
### Possible Implementation
This could be exposed the same way LoRA adapters are right now:
GET /cvectors
[
  {
    "id": 0,
    "path": "language-ornate_vs_simple.gguf",
    "scale": 0.7
  },
  {
    "id": 1,
    "path": "character-focus-naration_vs_dialogue.gguf",
    "scale": 0.2
  }
]
POST /cvectors
[
{"id": 0, "scale": 0.5},
{"id": 1, "scale": 0.5}
]
For reference, this is how they're called via command line at the moment:
--control-vector XXXXX-language__debias.gguf \
--control-vector-scaled XXXXX-language__ornate.gguf 0.20
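The restart-based workaround described under Motivation amounts to rebuilding these flags on every slider change. A minimal sketch of that step, assuming a hypothetical helper `control_vector_args` and the convention (mine, not llama.cpp's) that a scale of 0 means "leave the vector out":

```python
def control_vector_args(vectors):
    """Translate (path, scale) pairs into llama-server command-line arguments.

    scale None  -> --control-vector PATH (default scale)
    scale 0     -> vector omitted (assumed convention for "toggled off")
    otherwise   -> --control-vector-scaled PATH SCALE
    """
    args = []
    for path, scale in vectors:
        if scale is None:
            args += ["--control-vector", path]
        elif scale != 0:
            # "g" formatting drops trailing zeros, so 0.20 becomes "0.2".
            args += ["--control-vector-scaled", path, f"{scale:g}"]
    return args


# File names here are placeholders, not the real ones from the issue.
args = control_vector_args([
    ("debias.gguf", None),
    ("ornate.gguf", 0.20),
    ("dialogue.gguf", 0),
])
# args == ["--control-vector", "debias.gguf",
#          "--control-vector-scaled", "ornate.gguf", "0.2"]
```

Every change to this list currently means restarting the server and reloading the model, which is the cost the proposed endpoint would remove.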
lucas-bortoli