A Streamlit component that provides a multimodal chat input, letting users type text and upload images within Streamlit applications. A demo video is available on the Streamlit Community Forum (can't believe they didn't make this themselves in the age of multimodal LLMs & RAG systems).
- Text Input: Users can type in their messages.
- Image Upload: Supports uploading images, enhancing the chat with a visual element.
- Clipboard Paste: Enables pasting images directly from the clipboard.
- Responsive Design: Adjusts to the width of the Streamlit container.
- Disabled State: Can be set to a disabled state, making the input and button non-interactive and visually distinct.
To install the component, run the following command:
pip install st-multimodal-chatinput
import streamlit as st
from st_multimodal_chatinput import multimodal_chatinput
chatinput = multimodal_chatinput()
uploaded_files = chatinput["uploadedFiles"]  # list of ALL uploaded files (including images), each with type, name, and content
uploaded_images = chatinput["uploadedImages"]  # list of base64-encoded uploaded images
text = chatinput["textInput"]  # submitted text
for file in uploaded_files:
    filename = file["name"]
    filetype = file["type"]  # MIME type of the uploaded file
    filecontent = file["content"]  # base64 encoding of the uploaded file
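
Since `file["content"]` arrives base64-encoded, it usually has to be decoded before it can be saved or parsed. The following is a minimal sketch of one way to do that, not part of the component itself. `decode_upload` is a hypothetical helper, and the handling of a `data:<mime>;base64,` prefix is an assumption in case the content is delivered as a data URL; check what your version of the component actually returns.

```python
import base64
import binascii


def decode_upload(file: dict) -> bytes:
    """Return the raw bytes of one entry from chatinput["uploadedFiles"].

    Hypothetical helper. Assumes file["content"] is a base64 string that
    may or may not be wrapped in a data URL ("data:text/csv;base64,...").
    """
    content = file["content"]
    # Drop a leading "data:<mime>;base64," header if one is present.
    if content.startswith("data:") and "," in content:
        content = content.split(",", 1)[1]
    try:
        return base64.b64decode(content, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"{file['name']} is not valid base64") from exc


# Stand-in for one element of chatinput["uploadedFiles"].
sample = {
    "name": "notes.csv",
    "type": "text/csv",
    "content": base64.b64encode(b"a,b\n1,2\n").decode("ascii"),
}
raw = decode_upload(sample)
print(raw.decode("utf-8"))
```

In an app, the same helper could be called inside the `for file in uploaded_files:` loop, and the resulting bytes written to disk or passed to a parser.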
File Extension | MIME Type |
---|---|
.pdf | application/pdf |
.doc | application/msword |
.csv | text/csv |
You can find other common types here.
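
The MIME type in `file["type"]` can be used to decide how each upload is handled. Below is a rough sketch that only covers the three types in the table above. `MIME_TO_EXTENSION`, `extension_for`, and `split_supported` are hypothetical names introduced for illustration, and the normalisation step assumes the type string might carry parameters or mixed case.

```python
from typing import Optional

# Lookup built from the table above; extend it with the types you need.
MIME_TO_EXTENSION = {
    "application/pdf": ".pdf",
    "application/msword": ".doc",
    "text/csv": ".csv",
}


def extension_for(mime_type: str) -> Optional[str]:
    """Return the extension for a MIME type, or None if it is not listed."""
    # Strip any parameters (e.g. "; charset=utf-8") and ignore case.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    return MIME_TO_EXTENSION.get(base_type)


def split_supported(files: list) -> tuple:
    """Split uploaded file dicts into (supported, unsupported) name lists."""
    supported, unsupported = [], []
    for file in files:
        target = supported if extension_for(file["type"]) else unsupported
        target.append(file["name"])
    return supported, unsupported


# Stand-ins for entries of chatinput["uploadedFiles"].
files = [
    {"name": "report.pdf", "type": "application/pdf", "content": ""},
    {"name": "photo.png", "type": "image/png", "content": ""},
]
print(split_supported(files))
```

An explicit dictionary keeps the set of accepted types visible and easy to audit; the standard library's `mimetypes` module is an alternative if broader coverage matters more than a fixed allowlist.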
Pull requests for some of the to-dos are more than welcome.