Support for multi-modal models #813

@rlancemartin

Description

I see llama.cpp is working on multi-modal models like LLaVA:
ggml-org/llama.cpp#3436

Model files are here (SHA-256 checksums):

2ab9be51b7dc737136b38093316a4d3577d1fb96281f1589adac7841f5b81c43  ../models/ggml-model-q5_k.gguf
b7c8ff0f58fca47d28ba92c4443adf8653f3349282cb8d9e6911f22d9b3814fe  ../models/mmproj-model-f16.gguf

Testing:

$ mkdir build && cd build && cmake ..
$ cmake --build .
$ ./bin/llava -m ../models/ggml-model-q5_k.gguf --mmproj ../models/mmproj-model-f16.gguf --image ~/Desktop/Papers/figure-3-1.jpg

Appears to add some new params:

--mmproj MMPROJ_FILE  path to a multimodal projector file for LLaVA. see examples/llava/README.md
--image IMAGE_FILE    path to an image file. use with multimodal models
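Until bindings exist, a stopgap from Python is to shell out to the built llava binary with those two flags. This is purely an illustrative sketch, not part of llama-cpp-python; the flag names (-m, --mmproj, --image) come from the help text above, while the wrapper functions and paths are hypothetical:

```python
import subprocess

def llava_command(binary, model, mmproj, image):
    # Build the argument list for the llava example binary.
    # Flag names match the llama.cpp help output; everything else
    # here (function name, paths) is a hypothetical illustration.
    return [
        binary,
        "-m", model,
        "--mmproj", mmproj,
        "--image", image,
    ]

def run_llava(binary, model, mmproj, image):
    # Run the binary and return its stdout; requires a built
    # llama.cpp tree with the llava example compiled.
    result = subprocess.run(
        llava_command(binary, model, mmproj, image),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

This only wraps the CLI; proper support would bind the underlying clip/llava C functions instead.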

It would be awesome if we could support this in llama-cpp-python.
