-
I solved the problem: there was a difference in the model name when running locally vs. on Hugging Face.
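In case it helps anyone else, the quickest check is to compare the "id" the server reports against the "model" field sent in the request:

curl http://localhost:18888/v1/models

The "id" in the JSON response has to match the "model" field of the chat request exactly, otherwise the POST to /v1/chat/completions fails with exactly this kind of 404.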
-
I'm trying to start an OpenAI-compatible server using the command:
sudo docker run --runtime nvidia --gpus all -v /root/.cache/huggingface -p 18888:18888 vllm/vllm-openai --model TheBloke/openchat-3.5-0106-AWQ --host 0.0.0.0 --enforce-eager --port 18888
But when I try to make a request, I get the error:
"POST /v1/chat/completions HTTP/1.1" 404 Not Found
Doing
wget localhost:18888/v1/models
works, and I get: "GET /v1/models HTTP/1.1" 200 OK
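For reference, the request I'm making looks roughly like this:

curl http://localhost:18888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "TheBloke/openchat-3.5-0106-AWQ", "messages": [{"role": "user", "content": "Hello"}]}'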
If I run
/usr/bin/python3 -m ochat.serving.openai_api_server --model TheBloke/openchat-3.5-0106-AWQ --host 0.0.0.0
the request works. I wonder if I'm making a mistake in how I use Docker, or whether the container is setting up something other than the OpenAI-compatible endpoint.
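(For reference, the vLLM docs use an explicit host:container mapping for the cache volume, i.e.:

sudo docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 18888:18888 vllm/vllm-openai \
  --model TheBloke/openchat-3.5-0106-AWQ --host 0.0.0.0 --enforce-eager --port 18888

so I'm not sure if my -v /root/.cache/huggingface flag mounts the host cache at all.)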