
Commit b3112de

Move llama-dataset metadata from llama-hub repo to llama-index (run-llama#11488)
* copy dataset metadata
* update library.json and download modules
* working
* pants tailor
* use main not dev branch
1 parent f508073 commit b3112de
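The commit message points the dataset download modules at metadata now hosted in the llama-index repository, fetched from the main branch rather than a dev branch. As a rough, hypothetical sketch of what such a download module boils down to — the URL, helper name, and library.json structure below are assumptions for illustration, not the actual implementation — it amounts to fetching library.json and looking up a dataset's entry:

```python
import json

import requests  # assumed dependency; the real module may use urllib or httpx instead

# Hypothetical raw URL; the actual path inside run-llama/llama_index may differ.
LIBRARY_JSON_URL = (
    "https://raw.githubusercontent.com/run-llama/llama_index/main/"
    "llama-datasets/library.json"
)


def resolve_dataset_entry(dataset_id: str) -> dict:
    """Fetch library.json from the main branch and return one dataset's metadata."""
    response = requests.get(LIBRARY_JSON_URL, timeout=30)
    response.raise_for_status()
    library = json.loads(response.text)
    # Structure assumed: a mapping from dataset class name to its metadata entry.
    return library[dataset_id]
```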


81 files changed (+2552 -11 lines)

docs/BUILD

+1

@@ -0,0 +1 @@
python_sources()
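These one-line BUILD files come from running `pants tailor` (mentioned in the commit message): `python_sources()` is the Pants target generator that registers a directory's Python files as first-party sources. As a hedged illustration only — the field values below are made up for this example, not taken from the commit — an equivalent BUILD file with its optional fields spelled out might look like:

```python
# BUILD (Pants) -- hypothetical, expanded form of the one-liner added by this commit
python_sources(
    name="lib",         # target name; defaults to the directory name when omitted
    sources=["*.py"],   # which files the target owns; this glob approximates the default
)
```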

docs/examples/output_parsing/BUILD

+1

@@ -0,0 +1 @@
python_sources()

llama-datasets/10k/uber_2021/BUILD

+1

@@ -0,0 +1 @@
python_sources()
+61

@@ -0,0 +1,61 @@

# Uber 10K Dataset 2021

## CLI Usage

You can download `llamadatasets` directly using `llamaindex-cli`, which ships with the `llama-index` Python package:

```bash
llamaindex-cli download-llamadataset Uber10KDataset2021 --download-dir ./data
```

You can then inspect the files at `./data`. When you're ready to load the data into Python, you can use the following snippet:

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.llama_dataset import LabelledRagDataset

rag_dataset = LabelledRagDataset.from_json("./data/rag_dataset.json")
documents = SimpleDirectoryReader(input_dir="./data/source_files").load_data()
```

## Code Usage

You can also download the dataset to a directory, say `./data`, directly in Python. From there, you can use the convenient `RagEvaluatorPack` llamapack to run your own LlamaIndex RAG pipeline with the `llamadataset`.

```python
from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

# download and install dependencies for benchmark dataset
rag_dataset, documents = download_llama_dataset("Uber10KDataset2021", "./data")

# build basic RAG system
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

# evaluate using the RagEvaluatorPack
RagEvaluatorPack = download_llama_pack(
    "RagEvaluatorPack", "./rag_evaluator_pack"
)
rag_evaluator_pack = RagEvaluatorPack(
    rag_dataset=rag_dataset,
    query_engine=query_engine,
    show_progress=True,
)

############################################################################
# NOTE: If you have a lower-tier OpenAI API subscription, such as          #
# Usage Tier 1, you'll need different batch_size and sleep_time_in_seconds #
# values. For Usage Tier 1, batch_size=5 and sleep_time_in_seconds=15      #
# seemed to work well (as of December 2023).                               #
############################################################################

benchmark_df = await rag_evaluator_pack.arun(
    batch_size=20,  # batches the number of openai api calls to make
    sleep_time_in_seconds=1,  # seconds to sleep before making an api call
)
```
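For lower OpenAI usage tiers, the note above suggests the settings below. This is the same `arun` call with only the two throttling parameters changed — a sketch based on the note, not a benchmarked recommendation:

```python
# Usage Tier 1 settings suggested in the note above (as of December 2023)
benchmark_df = await rag_evaluator_pack.arun(
    batch_size=5,  # fewer OpenAI API calls per batch
    sleep_time_in_seconds=15,  # longer pause between batches to stay under rate limits
)
```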
+27

@@ -0,0 +1,27 @@

{
    "name": "Uber 10K Dataset 2021",
    "className": "LabelledRagDataset",
    "description": "A labelled RAG dataset based on the Uber 2021 10K document, consisting of queries, reference answers, and reference contexts.",
    "numberObservations": 822,
    "containsExamplesByHumans": false,
    "containsExamplesByAi": true,
    "sourceUrls": [],
    "baselines": [
        {
            "name": "llamaindex",
            "config": {
                "chunkSize": 1024,
                "llm": "gpt-3.5-turbo",
                "similarityTopK": 2,
                "embedModel": "text-embedding-ada-002"
            },
            "metrics": {
                "contextSimilarity": 0.943,
                "correctness": 3.874,
                "faithfulness": 0.667,
                "relevancy": 0.844
            },
            "codeUrl": "https://github.com/run-llama/llama-hub/blob/main/llama_hub/llama_datasets/10k/uber_2021/llamaindex_baseline.py"
        }
    ]
}
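Since this card is plain JSON, inspecting a dataset's baseline metrics locally takes only a few lines. A minimal sketch, assuming the card has been downloaded to `./data/card.json` (a path chosen for this example):

```python
import json

# Hypothetical local path; adjust to wherever the card was downloaded.
with open("./data/card.json") as f:
    card = json.load(f)

print(card["name"], card["numberObservations"])
for baseline in card["baselines"]:
    # Each baseline entry carries its config and evaluation metrics, as shown above.
    print(baseline["name"], baseline["metrics"])
```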
@@ -0,0 +1,41 @@

import asyncio

from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI


async def main():
    # DOWNLOAD LLAMADATASET
    rag_dataset, documents = download_llama_dataset(
        "Uber10KDataset2021", "./uber10k_2021_dataset"
    )

    # BUILD BASIC RAG PIPELINE
    index = VectorStoreIndex.from_documents(documents=documents)
    query_engine = index.as_query_engine()

    # EVALUATE WITH PACK
    RagEvaluatorPack = download_llama_pack("RagEvaluatorPack", "./pack_stuff")
    judge_llm = OpenAI(model="gpt-3.5-turbo")
    rag_evaluator = RagEvaluatorPack(
        query_engine=query_engine, rag_dataset=rag_dataset, judge_llm=judge_llm
    )

    ############################################################################
    # NOTE: If you have a lower-tier OpenAI API subscription, such as          #
    # Usage Tier 1, you'll need different batch_size and sleep_time_in_seconds #
    # values. For Usage Tier 1, batch_size=5 and sleep_time_in_seconds=15      #
    # seemed to work well (as of December 2023).                               #
    ############################################################################
    benchmark_df = await rag_evaluator.arun(
        batch_size=20,  # batches the number of openai api calls to make
        sleep_time_in_seconds=1,  # seconds to sleep before making an api call
    )
    print(benchmark_df)


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())  # pass the coroutine, not the bare function

llama-datasets/__init__.py

Whitespace-only changes.
+1

@@ -0,0 +1 @@
python_sources()
+61

@@ -0,0 +1,61 @@

# Blockchain Solana Dataset

## CLI Usage

You can download `llamadatasets` directly using `llamaindex-cli`, which ships with the `llama-index` Python package:

```bash
llamaindex-cli download-llamadataset BlockchainSolanaDataset --download-dir ./data
```

You can then inspect the files at `./data`. When you're ready to load the data into Python, you can use the following snippet:

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.llama_dataset import LabelledRagDataset

rag_dataset = LabelledRagDataset.from_json("./data/rag_dataset.json")
documents = SimpleDirectoryReader(input_dir="./data/source_files").load_data()
```

## Code Usage

You can also download the dataset to a directory, say `./data`, directly in Python. From there, you can use the convenient `RagEvaluatorPack` llamapack to run your own LlamaIndex RAG pipeline with the `llamadataset`.

```python
from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

# download and install dependencies for benchmark dataset
rag_dataset, documents = download_llama_dataset(
    "BlockchainSolanaDataset", "./data"
)

# build basic RAG system
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

# evaluate using the RagEvaluatorPack
RagEvaluatorPack = download_llama_pack(
    "RagEvaluatorPack", "./rag_evaluator_pack"
)
rag_evaluator_pack = RagEvaluatorPack(
    rag_dataset=rag_dataset, query_engine=query_engine
)

############################################################################
# NOTE: If you have a lower-tier OpenAI API subscription, such as          #
# Usage Tier 1, you'll need different batch_size and sleep_time_in_seconds #
# values. For Usage Tier 1, batch_size=5 and sleep_time_in_seconds=15      #
# seemed to work well (as of December 2023).                               #
############################################################################

benchmark_df = await rag_evaluator_pack.arun(
    batch_size=20,  # batches the number of openai api calls to make
    sleep_time_in_seconds=1,  # seconds to sleep before making an api call
)
```
+27

@@ -0,0 +1,27 @@

{
    "name": "Blockchain Solana",
    "className": "LabelledRagDataset",
    "description": "A labelled RAG dataset based on the article 'From Bitcoin to Solana – Innovating Blockchain towards Enterprise Applications' by Xiangyu Li, Xinyu Wang, Tingli Kong, Junhao Zheng and Min Luo, consisting of queries, reference answers, and reference contexts.",
    "numberObservations": 58,
    "containsExamplesByHumans": false,
    "containsExamplesByAi": true,
    "sourceUrls": ["https://arxiv.org/abs/2207.05240"],
    "baselines": [
        {
            "name": "llamaindex",
            "config": {
                "chunkSize": 1024,
                "llm": "gpt-3.5-turbo",
                "similarityTopK": 2,
                "embedModel": "text-embedding-ada-002"
            },
            "metrics": {
                "contextSimilarity": 0.945,
                "correctness": 4.457,
                "faithfulness": 1.0,
                "relevancy": 1.0
            },
            "codeUrl": "https://github.com/run-llama/llama-hub/blob/main/llama_hub/llama_datasets/blockchain_solana/llamaindex_baseline.py"
        }
    ]
}
@@ -0,0 +1,37 @@

import asyncio

from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex


async def main():
    # DOWNLOAD LLAMADATASET
    rag_dataset, documents = download_llama_dataset(
        "BlockchainSolanaDataset", "./blockchain_solana"
    )

    # BUILD BASIC RAG PIPELINE
    index = VectorStoreIndex.from_documents(documents=documents)
    query_engine = index.as_query_engine()

    # EVALUATE WITH PACK
    RagEvaluatorPack = download_llama_pack("RagEvaluatorPack", "./pack_stuff")
    rag_evaluator = RagEvaluatorPack(query_engine=query_engine, rag_dataset=rag_dataset)

    ############################################################################
    # NOTE: If you have a lower-tier OpenAI API subscription, such as          #
    # Usage Tier 1, you'll need different batch_size and sleep_time_in_seconds #
    # values. For Usage Tier 1, batch_size=5 and sleep_time_in_seconds=15      #
    # seemed to work well (as of December 2023).                               #
    ############################################################################
    benchmark_df = await rag_evaluator.arun(
        batch_size=20,  # batches the number of openai api calls to make
        sleep_time_in_seconds=1,  # seconds to sleep before making an api call
    )
    print(benchmark_df)


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())  # pass the coroutine, not the bare function

llama-datasets/braintrust_coda/BUILD

+1

@@ -0,0 +1 @@
python_sources()
+65

@@ -0,0 +1,65 @@

# Braintrust Coda Help Desk Dataset

[![Braintrust (346 x 40 px)](https://github.com/nerdai/llama-hub/assets/92402603/a99bddf3-0eab-42e8-8c53-8432da8299d3)](https://www.braintrustdata.com/)

_This dataset was kindly provided by Kenny Wong and Ankur Goyal._

## CLI Usage

You can download `llamadatasets` directly using `llamaindex-cli`, which ships with the `llama-index` Python package:

```bash
llamaindex-cli download-llamadataset BraintrustCodaHelpDeskDataset --download-dir ./data
```

You can then inspect the files at `./data`. When you're ready to load the data into Python, you can use the following snippet:

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.llama_dataset import LabelledRagDataset

rag_dataset = LabelledRagDataset.from_json("./data/rag_dataset.json")
documents = SimpleDirectoryReader(input_dir="./data/source_files").load_data()
```

## Code Usage

You can also download the dataset to a directory, say `./data`, directly in Python. From there, you can use the convenient `RagEvaluatorPack` llamapack to run your own LlamaIndex RAG pipeline with the `llamadataset`.

```python
from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

# download and install dependencies for benchmark dataset
rag_dataset, documents = download_llama_dataset(
    "BraintrustCodaHelpDeskDataset", "./data"
)

# build basic RAG system
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

# evaluate using the RagEvaluatorPack
RagEvaluatorPack = download_llama_pack(
    "RagEvaluatorPack", "./rag_evaluator_pack"
)
rag_evaluator_pack = RagEvaluatorPack(
    rag_dataset=rag_dataset, query_engine=query_engine
)

############################################################################
# NOTE: If you have a lower-tier OpenAI API subscription, such as          #
# Usage Tier 1, you'll need different batch_size and sleep_time_in_seconds #
# values. For Usage Tier 1, batch_size=5 and sleep_time_in_seconds=15      #
# seemed to work well (as of December 2023).                               #
############################################################################

benchmark_df = await rag_evaluator_pack.arun(
    batch_size=20,  # batches the number of openai api calls to make
    sleep_time_in_seconds=1,  # seconds to sleep before making an api call
)
```

llama-datasets/braintrust_coda/__init__.py

Whitespace-only changes.
+29

@@ -0,0 +1,29 @@

{
    "name": "Braintrust Coda Help Desk",
    "className": "LabelledRagDataset",
    "description": "A list of automatically generated question/answer pairs from the Coda (https://coda.io/) help docs. This dataset is interesting because most models include Coda’s documentation as part of their training set, so you can baseline performance without RAG.",
    "numberObservations": 100,
    "containsExamplesByHumans": false,
    "containsExamplesByAi": true,
    "sourceUrls": [
        "https://gist.githubusercontent.com/wong-codaio/b8ea0e087f800971ca5ec9eef617273e/raw/39f8bd2ebdecee485021e20f2c1d40fd649a4c77/articles.json"
    ],
    "baselines": [
        {
            "name": "llamaindex",
            "config": {
                "chunkSize": 1024,
                "llm": "gpt-3.5-turbo",
                "similarityTopK": 2,
                "embedModel": "text-embedding-ada-002"
            },
            "metrics": {
                "contextSimilarity": 0.955,
                "correctness": 4.32,
                "faithfulness": 0.9,
                "relevancy": 0.93
            },
            "codeUrl": "https://github.com/run-llama/llama-hub/blob/main/llama_hub/llama_datasets/braintrust_coda/llamaindex_baseline.py"
        }
    ]
}
