This project evaluates several RAG pipelines. Each pipeline is evaluated in several ways, including QA over the rag-mini-wikipedia dataset and the TARGET benchmark.
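For reference, rag-mini-wikipedia is distributed on the Hugging Face Hub. Below is a minimal loading sketch, assuming the `rag-datasets/rag-mini-wikipedia` mirror and the `datasets` library; the config and split names are taken from the public dataset card, so verify them against your copy:

```python
# Hedged sketch: load the evaluation data from the Hugging Face Hub.
# The repo id, config names, and split names below are assumptions
# taken from the public dataset card; verify against your copy.
from datasets import load_dataset

qa = load_dataset("rag-datasets/rag-mini-wikipedia", "question-answer")["test"]
corpus = load_dataset("rag-datasets/rag-mini-wikipedia", "text-corpus")["passages"]

print(qa[0]["question"], "->", qa[0]["answer"])
print(corpus[0]["passage"][:80])
```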
- Python 3.1x
- A Linux machine if you are running the gemma.cpp RAG pipeline
- Install Python packages:

```
pip install -r requirements.txt
```
- Install the TARGET benchmark from source:

```
cd target
pip install -e .
```

- If using the AzureOpenAI API, enter your credentials in `.env`:

```
MODEL=''
OPENAI_API_BASE=''
OPENAI_API_KEY=''
API_VERSION=''
OPENAI_ORGANIZATION=''
```
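As a hedged sketch of how these values are typically consumed (not necessarily how this repo's scripts read them), assuming `python-dotenv` and the `openai>=1.0` client; the variable names match the `.env` keys above:

```python
# Hedged sketch of consuming the .env values above (not necessarily
# how this repo's scripts read them). Assumes python-dotenv and the
# openai>=1.0 client.
import os
from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()  # reads .env from the current working directory

client = AzureOpenAI(
    azure_endpoint=os.environ["OPENAI_API_BASE"],
    api_key=os.environ["OPENAI_API_KEY"],
    api_version=os.environ["API_VERSION"],
    organization=os.getenv("OPENAI_ORGANIZATION") or None,
)
resp = client.chat.completions.create(
    model=os.environ["MODEL"],  # the Azure deployment name
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```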
- To use the gemma.cpp RAG pipeline, follow the README in RAG_GemmaCPP.
- To use the LlamaIndex RAG pipeline (a minimal sketch follows this list):

```
python main.py
```

- To use the LLM only for QA tasks:

```
python llm_query.py
```

- To run the TARGET benchmark:

```
python run_target_benchmark.py
```

- To visualize any experiments, look into compute_stats_viz.py.
- To evaluate any RAG pipeline (a sample metric sketch also follows):

```
cd Eval
python evaluation.py
```

An example workflow: run the LlamaIndex RAG pipeline -> evaluate the results -> visualize the results.
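For orientation, here is a minimal LlamaIndex RAG sketch. It is not the repo's actual `main.py`: the `data/` directory is a placeholder for wherever the corpus files live, and the defaults assume an OpenAI key is configured.

```python
# Minimal LlamaIndex RAG sketch -- NOT the repo's main.py. The "data/"
# directory is a placeholder for wherever the corpus files live, and
# the default settings assume an OpenAI key is configured.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()  # read corpus files
index = VectorStoreIndex.from_documents(documents)     # embed + index them
query_engine = index.as_query_engine()                 # retrieval + generation
print(query_engine.query("Who was the first president of the United States?"))
```

Note that on older llama-index releases the imports live under `llama_index` rather than `llama_index.core`; match whatever version requirements.txt pins.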
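And a sketch of what the evaluate step typically computes for QA: exact match and token-level F1, the standard extractive-QA metrics. `Eval/evaluation.py` may compute different or additional metrics.

```python
# Hedged sketch of a QA evaluate step: exact match and token-level F1,
# the standard extractive-QA metrics. Eval/evaluation.py may compute
# different or additional metrics.
from collections import Counter

def exact_match(pred: str, gold: str) -> bool:
    return pred.strip().lower() == gold.strip().lower()

def token_f1(pred: str, gold: str) -> float:
    p, g = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("George Washington", "george washington"))              # True
print(round(token_f1("President George Washington", "George Washington"), 2))  # 0.8
```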