Replies: 7 comments 1 reply
-
I also want to know how I can upload images to the server using external tools.
-
Thank you, I'm now using this shell script to describe images in my file manager.

```bash
#!/bin/bash
# Check if image file argument is provided
if [ $# -eq 0 ]; then
    echo "Usage: $0 <image-file>"
    exit 1
fi

IMAGE_FILE="$1"
API_URL="http://192.168.1.68:8080/v1/chat/completions"
MODEL="llava"
OUTPUT_FILE="${IMAGE_FILE}.txt"  # Adds .txt to the original filename (e.g., image.jpg.txt)

# Check if file exists
if [ ! -f "$IMAGE_FILE" ]; then
    echo "Error: File '$IMAGE_FILE' not found"
    exit 1
fi

# Create temporary payload file
TMP_PAYLOAD=$(mktemp)

# Generate the JSON payload
cat <<EOF > "$TMP_PAYLOAD"
{
  "model": "$MODEL",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image in detail"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,$(base64 -w 0 "$IMAGE_FILE")"}}
      ]
    }
  ]
}
EOF

echo "Generating description for $IMAGE_FILE..."

# Make the API request and save response
curl -s -X POST \
    -H "Content-Type: application/json" \
    -d @"$TMP_PAYLOAD" \
    "$API_URL" | jq -r '.choices[0].message.content' > "$OUTPUT_FILE"

# Clean up
rm "$TMP_PAYLOAD"
echo "Description saved to $OUTPUT_FILE"
```
-
@thoddnn Yes, you can send images to the /embedding endpoint in llama.cpp, but only if you're using a multimodal model like LLaVA and you've specified the `--mmproj` projector file when starting the server. To include an image in the embedding request, you reference it by id in the content and supply the data in `image_data`:

```json
{
  "content": "Image: [img-21].\n Optional Caption",
  "image_data": [
    {
      "id": 21,
      "data": "<BASE64_ENCODED_IMAGE_HERE>"
    }
  ]
}
```
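A minimal way to send that payload from the shell, sketched with an assumed server address of localhost:8080 (the endpoint name and `image_data` shape are taken from the comment above, not verified against every llama.cpp version). Building the JSON with jq avoids shell argument-length limits for large base64 strings:

```bash
# Encode the image, wrap it in the [img-N] payload shape above, and POST it.
base64 -w 0 image.jpg \
  | jq -Rs '{content: "Image: [img-21].\n Optional Caption", image_data: [{id: 21, data: rtrimstr("\n")}]}' \
  > payload.json

curl -s http://localhost:8080/embedding \
    -H "Content-Type: application/json" \
    -d @payload.json
```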
-
I have been unable to get the above to work. What I observe is a response that seems to have ignored the [img-N] reference and treated it as a set of text tokens. I infer this from two things: (a) roughly six embedding vectors are produced, one per token I assume, and (b) the embedding values are identical no matter what id is given in the image_data clause (i.e., the correct "id": 1 or the erroneous "id": 2). Any thoughts? The server command line is (essentially) as follows
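A quick way to reproduce that observation, sketched under the assumption that the server returns one vector per token (response shapes vary across llama.cpp versions, so the final jq path may need adjusting):

```bash
# Count how many embedding vectors come back for an [img-N] prompt; if the count
# tracks the number of text tokens instead of staying fixed, the image reference
# was likely tokenized as plain text.
base64 -w 0 test.jpg \
  | jq -Rs '{content: "Image: [img-1].", image_data: [{id: 1, data: rtrimstr("\n")}]}' \
  | curl -s http://localhost:8080/embedding -H "Content-Type: application/json" -d @- \
  | jq 'if type == "array" then length else (.embedding | length) end'
```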
-
I still can't get it to work. The llama server always returns the same embedding even if the image is different, starting the server with a multimodal embedding model and sending data with
-
PR #15108 looks like it should fix the issue, but I'm still running into trouble when trying it with the binaries from his fork of llama.cpp. I have tried to send the following JSON to http://localhost:8080/embedding but I get the error

@oobabooga Could you provide some guidance on how to generate embeddings from both images and text? 🙏 Thanks a lot
-
@thoddnn you were really close! #15108 changes the type of the prompt itself, but not the outer JSON
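Reading that comment, the multimodal parts presumably go inside the prompt field itself rather than in a separate top-level key. The following is only a guess at the resulting shape based on this comment; the field names follow the chat-completions convention from the script earlier in the thread and are not verified against the PR:

```json
{
  "content": [
    {"type": "text", "text": "Optional caption"},
    {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,<BASE64_ENCODED_IMAGE_HERE>"}}
  ]
}
```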
-
I'm starting a llama.cpp server using the following command:

```bash
llama-server -m "/path/model.gguf" --mmproj "mmproj.gguf"
```

When I send an HTTP request to http://localhost:8080/embedding with a text payload (a minimal example is sketched below), it works and returns an embedding vector, but I would like to send an image instead of text. However, I don't know how to do this. Is it even possible with the current version?

Thanks 🙏
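For reference, a text-only request of the kind described above might look like this. The exact payload in the original post was not preserved, so `content` as the field name is an assumption based on the /embedding examples elsewhere in this thread:

```bash
# Text-only embedding request; returns an embedding vector if the server is up.
curl -s http://localhost:8080/embedding \
    -H "Content-Type: application/json" \
    -d '{"content": "hello world"}' | jq .
```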