Merged
24 commits
77169b0
Add RAGbenchmark: RAG system evaluation framework
sponge225 Mar 19, 2026
024e0a2
Update README.md
sponge225 Mar 19, 2026
fa9dfc9
Update README.md
sponge225 Mar 19, 2026
6ed11f5
Merge branch 'volcengine:main' into feat/rag
sponge225 Mar 20, 2026
e7a7130
Merge branch 'volcengine:main' into feat/rag
sponge225 Mar 20, 2026
eb01792
Merge branch 'volcengine:main' into feat/rag
sponge225 Mar 20, 2026
d1b6858
Update README.md
sponge225 Mar 20, 2026
ec5bae3
Code structure refactoring
sponge225 Mar 20, 2026
a45c19b
Merge branch 'volcengine:main' into feat/rag
sponge225 Mar 20, 2026
f56854f
feat: improve RAG benchmark with dataset sampling and configuration u…
sponge225 Mar 24, 2026
fd883e4
feat: add stratified sampling support to all datasets
sponge225 Mar 24, 2026
e3280d3
Update locomo adapter to support image attachments and other improvem…
sponge225 Mar 24, 2026
8473e67
Update dataset documentation with actual document counts
sponge225 Mar 24, 2026
5e179c1
Merge remote-tracking branch 'upstream/main' into merge-upstream-main
sponge225 Mar 24, 2026
01e6804
Add benchmark results reference and reproduction steps
sponge225 Mar 25, 2026
47e41b1
Improve sampling scripts for benchmark reproducibility
sponge225 Mar 25, 2026
0d11aa8
Refactor sample_dataset.py: extract common sampling logic
sponge225 Mar 25, 2026
42c468e
Update config.yaml: improve configuration structure
sponge225 Mar 25, 2026
510707d
Fix bug: duplicate worker_end() call in generation failure path
sponge225 Mar 25, 2026
7e85ea6
Fix bug: _get_required_syllabi() doesn't support JSON input
sponge225 Mar 25, 2026
625f840
Improve exception re-raising: use bare raise to preserve traceback
sponge225 Mar 25, 2026
800857f
Fix bug: Locomo prompt uses raw gold_answer instead of gold_answer_str
sponge225 Mar 25, 2026
2ff2b46
Improve directory ingest: use os.path.commonpath() for robustness
sponge225 Mar 25, 2026
19db010
benchmark: honor skip_ingestion and fail on LLM retry exhaustion
sponge225 Mar 30, 2026
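
Note on commit 2ff2b46: the diff for that change is not shown here, so the following is only a minimal sketch of why `os.path.commonpath()` is generally the more robust way to find a shared root directory. The sample paths and the helper names `naive_root` and `common_root` are hypothetical and do not come from the PR. `os.path.commonprefix()` compares strings character by character, so it can return something that is not a directory at all, whereas `os.path.commonpath()` compares whole path components.

```python
import os.path

# Hypothetical input paths: two sibling directories that share a name prefix.
paths = ["data/docs_a/one.md", "data/docs_b/two.md"]


def naive_root(paths):
    # Character-wise comparison: may stop in the middle of a directory name.
    return os.path.commonprefix(paths)


def common_root(paths):
    # Component-wise comparison: always returns a valid path prefix.
    return os.path.commonpath(paths)


print(naive_root(paths))   # data/docs_
print(common_root(paths))  # data
```

Note that `os.path.commonpath()` raises `ValueError` for an empty sequence or when absolute and relative paths are mixed, so callers may want to handle that case explicitly.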
9 changes: 9 additions & 0 deletions .gitignore
@@ -104,6 +104,15 @@ Temporary Items
.openviking
*.code-workspace

# Benchmark outputs
examples/benchmark/outputs/
examples/benchmark/datasets/full/
examples/benchmark/*.log
RAGbenchmark/datasets/*/
!RAGbenchmark/datasets/Benchmark_Lite/
RAGbenchmark/Output/
RAGbenchmark/*.log

# AI Coding
CLAUDE.md
*.so
53 changes: 53 additions & 0 deletions benchmark/RAG/.gitignore
@@ -0,0 +1,53 @@
# Raw datasets (downloaded from external sources)
raw_data/

# Processed datasets (sampled subsets)
datasets/
data/

# Processed documents and vector storage
ov_storage/

# Evaluation output results
Output/

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
.venv/
env/
ENV/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~

# Logs
*.log

# Temporary files
*.tmp
*.temp