Commit 21cfcc6

salgado, codefromthecrypt, and JessicaGarson authored
source code: Add Multimodal RAG with Elasticsearch Gotham City tutorial (#390)
Signed-off-by: Adrian Cole <[email protected]>
Co-authored-by: Adrian Cole <[email protected]>
Co-authored-by: Jess Garson <[email protected]>
1 parent 24c2e81 commit 21cfcc6

29 files changed, 1377 insertions(+), 0 deletions(-)
@@ -0,0 +1,66 @@
# Building a Multimodal RAG Pipeline with Elasticsearch: The Story of Gotham City

This repository contains the code for implementing a Multimodal Retrieval-Augmented Generation (RAG) system using Elasticsearch. The system processes and analyzes different types of evidence (images, audio, text, and depth maps) to solve a crime in Gotham City.

## Overview

The pipeline demonstrates how to:
- Generate unified embeddings for multiple modalities using ImageBind
- Store and search vectors efficiently in Elasticsearch
- Analyze evidence using GPT-4 to generate forensic reports
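
Because the embeddings are unified across modalities, a vector produced from an image can be compared directly with one produced from audio or text. The sketch below illustrates that comparison, assuming cosine similarity as the metric; the function name and the tiny three-element vectors are hypothetical stand-ins for real ImageBind output.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for an image and an audio embedding.
image_vec = [1.0, 0.0, 1.0]
audio_vec = [1.0, 0.0, 0.0]
score = cosine_similarity(image_vec, audio_vec)
```

A score close to 1 means the two pieces of evidence point in nearly the same direction in the shared space; a score near 0 means they are unrelated.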

## Prerequisites

- Python 3.x
- Elasticsearch cluster (cloud or local)
- OpenAI API key: set up an OpenAI account and create a [secret key](https://platform.openai.com/docs/quickstart)
- 8GB+ RAM
- GPU (optional but recommended)

## Code execution

We provide a Google Colab notebook that allows you to explore the entire pipeline interactively:
- [Open the Multimodal RAG Pipeline Notebook](notebook/01-mmrag-blog-quick-start.ipynb)
- This notebook includes step-by-step instructions and explanations for each stage of the pipeline

## Project Structure

```
├── README.md
├── requirements.txt
├── notebook/
│   └── 01-mmrag-blog-quick-start.ipynb  # Jupyter notebook execution
├── src/
│   ├── embedding_generator.py  # ImageBind wrapper
│   ├── elastic_manager.py      # Elasticsearch operations
│   └── llm_analyzer.py         # GPT-4 integration
├── stages/
│   ├── 01-stage/  # File organization
│   ├── 02-stage/  # Embedding generation
│   ├── 03-stage/  # Elasticsearch indexing/search
│   └── 04-stage/  # Evidence analysis
└── data/          # Sample data
    ├── images/
    ├── audios/
    ├── texts/
    └── depths/
```

## Sample Data

The repository includes sample evidence files:
- Images: Crime scene photos and security camera footage
- Audio: Suspicious sound recordings
- Text: Mysterious notes and riddles
- Depth Maps: 3D scene captures
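
As a rough illustration of how these files might be discovered, the sketch below walks the four modality folders listed under `data/` in the project structure and groups their files. The function name `collect_evidence` is hypothetical and is not taken from the repository's own code.

```python
from pathlib import Path

# Folder names as they appear under data/ in the project structure.
MODALITY_DIRS = ("images", "audios", "texts", "depths")


def collect_evidence(data_root: str) -> dict[str, list[Path]]:
    """Map each modality folder name to a sorted list of the files it holds."""
    root = Path(data_root)
    evidence: dict[str, list[Path]] = {}
    for name in MODALITY_DIRS:
        folder = root / name
        if folder.is_dir():
            evidence[name] = sorted(p for p in folder.iterdir() if p.is_file())
        else:
            # A missing folder yields an empty list rather than an error.
            evidence[name] = []
    return evidence
```

Calling `collect_evidence("data")` from the repository root would return one entry per modality, which a later stage could iterate over when generating embeddings.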

## How It Works

1. **Evidence Collection**: Files are organized by modality in the `data/` directory
2. **Embedding Generation**: ImageBind converts each piece of evidence into a 1024-dimensional vector
3. **Vector Storage**: Elasticsearch stores embeddings with metadata for efficient retrieval
4. **Similarity Search**: New evidence is compared against the database using k-NN search
5. **Analysis**: GPT-4 analyzes the connections between evidence to identify suspects
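
Steps 3 and 4 can be sketched as two plain dictionaries: an index mapping with a `dense_vector` field and a k-NN search clause. This is a minimal sketch rather than the repository's own implementation; the field names (`embedding`, `modality`, `description`), the cosine similarity setting, and the default `k` and `num_candidates` values are assumptions made for illustration.

```python
EMBEDDING_DIMS = 1024  # vector size stated in step 2


def build_index_mapping(dims: int = EMBEDDING_DIMS) -> dict:
    """Return a mapping with one indexed dense_vector field plus metadata."""
    return {
        "properties": {
            # Hypothetical field names; adjust to the real index design.
            "embedding": {
                "type": "dense_vector",
                "dims": dims,
                "index": True,
                "similarity": "cosine",
            },
            "modality": {"type": "keyword"},
            "description": {"type": "text"},
        }
    }


def build_knn_query(query_vector: list[float], k: int = 3,
                    num_candidates: int = 50) -> dict:
    """Return a k-NN clause that searches the embedding field."""
    if len(query_vector) != EMBEDDING_DIMS:
        raise ValueError(f"expected a {EMBEDDING_DIMS}-dimensional vector")
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "field": "embedding",
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }
```

With the 8.x Python client, these dictionaries could be passed as `mappings=` to `indices.create` and as `knn=` to `search`; the index name is left to the caller.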

@@ -0,0 +1,8 @@
Why so serious?

The show has just begun and you're already running
While clowns are dancing and the city's stunning
In the abandoned theater, a surprise awaits
Come play with me before it's too late!

HAHAHAHAHA!

@@ -0,0 +1,15 @@
PRELIMINARY REPORT - GCPD
Date: 01/28/2025
Time: 22:30

Incident: Break-in and Vandalism
Location: Gotham Central Bank
Evidence Found:
- Playing cards scattered
- Smile graffiti on walls
- Suspicious audio recording
- Witnesses report maniacal laughter

Status: Under Investigation
Priority Level: MAXIMUM
Primary Suspect: Unknown (possible Joker involvement)

@@ -0,0 +1,16 @@
HAHAHA!

Dear Detective,

In a city of endless night, a new game unfolds
Where chaos reigns and fear takes hold
I left a gift at Gotham Central Bank
Time's ticking, your mind goes blank

The clues are there, scattered with care
Each laugh echoes everywhere
Midnight strikes, you won't catch me
In Gotham's heart, chaos runs free!

With a smile,
?

@@ -0,0 +1,5 @@
Incident Log:
1. Gotham Central Bank - 22:15 - Alarm triggered
2. Monarch Theater - 22:45 - Suspicious laughter reported
3. Abandoned Amusement Park - 23:00 - Strange lights
4. Ace Chemical Plant - 23:30 - Suspicious movement

@@ -0,0 +1,17 @@
# Make a copy of this file with the name .env and assign values to variables

# How you connect to Elasticsearch: change details to your instance
ELASTICSEARCH_URL=
ELASTICSEARCH_API_KEY=
# If not using API key, uncomment these and fill them in:
# ELASTICSEARCH_USER=elastic
# ELASTICSEARCH_PASSWORD=elastic

# OpenAI Configuration
OPENAI_API_KEY=

# Model Configuration

# Optional Configuration
# LOG_LEVEL=INFO
# DEBUG=False
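
A script might turn these variables into connection settings along the following lines. This is a sketch under stated assumptions: the values are already present in the process environment (for example via a dotenv loader), the helper name `elasticsearch_client_kwargs` is hypothetical, and the returned keys follow the `hosts`, `api_key`, and `basic_auth` arguments of the 8.x Elasticsearch Python client.

```python
import os
from typing import Mapping, Optional


def elasticsearch_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build client keyword arguments, preferring the API key over basic auth."""
    env = os.environ if env is None else env
    url = env.get("ELASTICSEARCH_URL")
    if not url:
        raise ValueError("ELASTICSEARCH_URL is not set")

    kwargs: dict = {"hosts": url}
    api_key = env.get("ELASTICSEARCH_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key
        return kwargs

    # Fall back to the commented-out user/password pair from the template.
    user = env.get("ELASTICSEARCH_USER")
    password = env.get("ELASTICSEARCH_PASSWORD")
    if not (user and password):
        raise ValueError("set ELASTICSEARCH_API_KEY or both user and password")
    kwargs["basic_auth"] = (user, password)
    return kwargs
```

The resulting dictionary could then be unpacked into the client constructor, which keeps credential handling in one place and makes it easy to check without a running cluster.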
