
Conversation

@sven-knoblauch (Contributor) commented Oct 9, 2024

Small changes to add the LORA_MODULES env variable (supporting one LoRA adapter) as a solution for #119, with the format: {"name": "xxx", "path": "xxx/xxxxx", "base_model_name": "xxx/xxxx"}

Also changes the v1/models endpoint to return all models (the base model plus any LoRA adapters).

@sven-knoblauch sven-knoblauch marked this pull request as draft October 9, 2024 11:42
@sven-knoblauch sven-knoblauch marked this pull request as ready for review October 9, 2024 12:53
@pandyamarut (Collaborator) commented

Thanks for the PR, @sven-knoblauch. Can you please also describe how you tested it?

@sven-knoblauch (Contributor, Author) commented

I built a Docker container from the included Dockerfile (on Docker Hub: svenknob/runpod-vllm-worker) and tested it on RunPod serverless. It worked with a custom-trained LoRA adapter (added in the RunPod GUI as the env variable LORA_MODULES) on an AWQ Mistral model. The LoRA adapter is also visible in the v1/models endpoint.

@pandyamarut pandyamarut merged commit 6e8696c into runpod-workers:main Oct 31, 2024
@nerdylive123 commented

Hi, is there documentation for this env variable's usage? In the markdown, perhaps?

@nielsrolf commented

Is a Docker image publicly available that contains this PR?

@sven-knoblauch (Contributor, Author) commented

Added a pull request for updating the readme: #130.
Usage is similar to the "original" vLLM server. The env variable name is LORA_MODULES and the format is {"name": "xxx", "path": "xxx/xxxx", "base_model_name": "xxx/xxxx"}, where name is the model name that HTTP requests target, path is the Hugging Face path of the adapter, and base_model_name is the name of the base model the adapter was trained on.

For now you can use my Docker image svenknob/runpod-vllm-worker until an official image is published.
