Commit b1ff137

Merge pull request #541 from macrocosm-os/staging
v2.16.1
2 parents c843406 + b453b93 commit b1ff137

36 files changed: 781 additions and 807 deletions

.env.validator.example

Lines changed: 3 additions & 1 deletion
@@ -24,7 +24,9 @@ SN19_API_URL = "e.g. http://24.199.112.174:4051/"
 OPENAI_API_KEY = "your_openai_api_key_here"
 HF_TOKEN = "your_huggingface_token_here"
 
-# Scoring API.
+# Scoring API (optional).
 DEPLOY_SCORING_API = true
 SCORING_ADMIN_KEY = "123456"
 SCORING_API_PORT = 8094
+# Scoring key must match the scoring key in the .env.api.
+# SCORING_KEY="..."
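
The new comment requires the validator's `SCORING_KEY` to match the one in `.env.api`, but this hunk does not show how the two are compared. A minimal sketch of one way such a check could be done, assuming both values are plain strings; the function name `scoring_keys_match` is hypothetical and not part of this commit:

```python
import hmac
from typing import Optional


def scoring_keys_match(validator_key: Optional[str], api_key: Optional[str]) -> bool:
    """Hypothetical helper: compare the two configured scoring keys.

    A missing or empty key on either side is treated as a mismatch.
    """
    if not validator_key or not api_key:
        return False
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(validator_key.encode(), api_key.encode())
```

Treating an unset key as a mismatch keeps the scoring API closed when the optional setting is left commented out.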

README.md

Lines changed: 7 additions & 10 deletions
@@ -49,24 +49,21 @@ Subnet one utilizes the concept of "Tasks" to control the behavior of miners. Va
 ### 1. **QA (Question Answering)**
 The miner receives a question about a specific section from a Wikipedia page. The miner must then find the original context in the specified section and use it to return an accurate answer. References are generated using the validator's privileged knowledge of the context, and miner completions are scored based on similarity metrics.
 
-### 2. **Summarization**
-Similar to QA, but the miner uses the entire Wikipedia page instead of a specific section. The miner reads the whole page, summarizes it, and provides a concise answer.
-
-### 3. **DateQA**
-The miner receives a question about an event from Wikipedia. The miner must search through Wikipedia for the relevant event and return the correct answer based on the findings. References are again generated with validator's knowledge of the context, and similarity metrics are used to score miner completions.
-
-### 4. **Inference**
+### 2. **Inference**
 A question is given with some pre-seeded information and a random seed. The miner must perform an inference based on this information to provide the correct answer. Completions are scored based on similarity metrics.
 
-### 5. **MultiChoice**
+### 3. **MultiChoice**
 The miner is presented with a question from Wikipedia along with four possible answers (A, B, C, or D). The miner must search Wikipedia and return the correct answer by selecting one of the given options. Miner completions are scored by Regex matching.
 
-### 6. **Programming**
+### 5. **Programming**
 The miner receives a code snippet that is incomplete. The task is to complete the code snippet to perform its intended function. The validator generates a reference using its internal LLM, and the miner is scored based on its similarity to this reference.
 
-### 7. **Web Retrieval**
+### 6. **Web Retrieval**
 The miner is given a question based on a random web page and must return a scraped website that contains the answer. This requires searching the web to locate the most accurate and reliable source to provide the answer. The miner is scored based on the embedding similarity between the answer it returns and the original website that the validator generated the reference from.
 
+### 7. **Multistep Reasoning**
+The miner is given a complex problem that requires multiple steps to solve. Each step builds upon the previous one, and the miner must provide intermediate results before arriving at the final answer. The validator generates a reference solution using its internal LLM, and the miner is scored based on the accuracy and coherence of the intermediate and final results.
+
 # API Documentation
 
 For detailed information on the available API endpoints, request/response formats, and usage examples, please refer to the [API Documentation](./validator_api/API_docs.md).
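
The MultiChoice entry in the README hunk above says completions are scored by regex matching but does not show a pattern. A minimal sketch of that idea, assuming the answer appears as a standalone capital letter A to D and that the first such letter counts; `ANSWER_PATTERN` and `score_multichoice` are illustrative names, not taken from the repository:

```python
import re

# Hypothetical pattern: a standalone capital letter A, B, C, or D.
ANSWER_PATTERN = re.compile(r"\b([ABCD])\b")


def score_multichoice(completion: str, reference: str) -> float:
    """Return 1.0 if the first standalone A-D letter equals the reference, else 0.0."""
    match = ANSWER_PATTERN.search(completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1) == reference.strip().upper() else 0.0
```

A pattern this loose also matches the English article "A" at the start of a sentence, so a real scorer would likely need a stricter answer format.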

docs/SN1_validation.md

Lines changed: 0 additions & 1 deletion
@@ -31,7 +31,6 @@ More tooling will be included in future releases.
 # Tasks
 The validation process supports an ever-growing number of tasks. Tasks drive agent behaviour based on specific goals, such as;
 - Question answering
-- Summarization
 - Code debugging
 - Mathematics
 and more.

neurons/validator.py

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
 
 torch.multiprocessing.set_start_method("spawn", force=True)
 
-NEURON_SAMPLE_SIZE = 100
+NEURON_SAMPLE_SIZE = 100  # TODO: Should add this to constants.py
 
 
 def create_loop_process(task_queue, scoring_queue, reward_events):

poetry.lock

Lines changed: 600 additions & 410 deletions
Some generated files are not rendered by default.

prompting/api/scoring/api.py

Lines changed: 22 additions & 17 deletions
@@ -27,29 +27,34 @@ async def score_response(request: Request, api_key_data: dict = Depends(validate
     model = None
     payload: dict[str, Any] = await request.json()
     body = payload.get("body")
-
-    try:
-        if body.get("model") is not None:
-            model = ModelZoo.get_model_by_id(body.get("model"))
-    except Exception:
-        logger.warning(
-            f"Organic request with model {body.get('model')} made but the model cannot be found in model zoo. Skipping scoring."
-        )
-        return
     uid = int(payload.get("uid"))
     chunks = payload.get("chunks")
-    llm_model = ModelZoo.get_model_by_id(model) if (model := body.get("model")) else None
+    model = body.get("model")
+    if model:
+        try:
+            llm_model = ModelZoo.get_model_by_id(model)
+        except Exception:
+            logger.warning(
+                f"Organic request with model {body.get('model')} made but the model cannot be found in model zoo. Skipping scoring."
+            )
+            return
+    else:
+        llm_model = None
     task = body.get("task")
     if task == "InferenceTask":
         logger.info(f"Received Organic InferenceTask with body: {body}")
+        logger.info(f"With model of type {type(body.get('model'))}")
+        organic_task = InferenceTask(
+            messages=body.get("messages"),
+            llm_model=llm_model,
+            llm_model_id=body.get("model"),
+            seed=int(body.get("seed", 0)),
+            sampling_params=body.get("sampling_parameters", shared_settings.SAMPLING_PARAMS),
+            query=body.get("messages")[0]["content"],
+        )
+        logger.info(f"Task created: {organic_task}")
         task_scorer.add_to_queue(
-            task=InferenceTask(
-                messages=[msg["content"] for msg in body.get("messages")],
-                llm_model=llm_model,
-                llm_model_id=body.get("model"),
-                seed=int(body.get("seed", 0)),
-                sampling_params=body.get("sampling_params", {}),
-            ),
+            task=organic_task,
             response=DendriteResponseEvent(
                 uids=[uid],
                 stream_results=[
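
The hunk above replaces two separate model lookups with a single one: no model id means "score without a model", while an unknown id means "skip scoring". A minimal standalone sketch of that control flow, assuming a dictionary lookup in place of `ModelZoo`, whose implementation is not shown in this diff; `KNOWN_MODELS`, `get_model_by_id`, and `resolve_model` are hypothetical names:

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)

# Hypothetical stand-in for the model registry; the real ModelZoo is not shown here.
KNOWN_MODELS: dict[str, Any] = {"model-a": object()}


def get_model_by_id(model_id: str) -> Any:
    """Stand-in lookup that raises KeyError for unknown ids."""
    return KNOWN_MODELS[model_id]


def resolve_model(body: dict) -> tuple[bool, Optional[Any]]:
    """Return (should_score, llm_model) for an organic request body."""
    model_id = body.get("model")
    if not model_id:
        # No model requested: scoring proceeds with llm_model=None.
        return True, None
    try:
        return True, get_model_by_id(model_id)
    except Exception:
        logger.warning("Model %s not found in the registry. Skipping scoring.", model_id)
        return False, None
```

Returning an explicit flag keeps the two `None` cases apart, which the early `return` in the endpoint does implicitly.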

prompting/base/duckduckgo_patch.py

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
+from threading import Event
 from typing import cast
 
 import httpx
@@ -13,6 +14,7 @@ def __init__(self, *args, **kwargs):
             timeout=kwargs.get("timeout", 10),
             verify=kwargs.get("verify", True),
         )
+        self._exception_event = Event()
 
     def _get_url(
         self: DDGS,

prompting/datasets/huggingface_github.py

Lines changed: 3 additions & 1 deletion
@@ -20,6 +20,7 @@ class HuggingFaceGithubDatasetEntry(DatasetEntry):
     github_url: str
     file_path: str
     file_content: str
+    source: str | None = None
 
 
 class HuggingFaceGithubDataset(BaseDataset):
@@ -46,8 +47,9 @@ def _filter_function(self, example):
 
     def _process_entry(self, entry: dict) -> HuggingFaceGithubDatasetEntry:
         file_content = "\n".join(entry["content"].split("\n")[:MAX_LINES])
+        url = f"https://github.com/{entry['repo_name']}"
         return HuggingFaceGithubDatasetEntry(
-            github_url=f"https://github.com/{entry['repo_name']}", file_path=entry["path"], file_content=file_content
+            github_url=url, file_path=entry["path"], file_content=file_content, source=url
         )
 
     def get(self) -> HuggingFaceGithubDatasetEntry:

prompting/datasets/random_website.py

Lines changed: 15 additions & 10 deletions
@@ -17,21 +17,24 @@ class DDGDatasetEntry(DatasetEntry):
     search_term: str
     website_url: str = None
     website_content: str = None
+    query: str | None = None
+    source: str | None = None
 
 
 class DDGDataset(BaseDataset):
     english_words: list[str] = None
 
     def search_random_term(self, retries: int = 3) -> tuple[Optional[str], Optional[list[dict[str, str]]]]:
-        try:
-            ddg = PatchedDDGS(proxy=shared_settings.PROXY_URL, verify=False)
-            for _ in range(retries):
-                random_words = " ".join(random.sample(ENGLISH_WORDS, 5))
+        ddg = PatchedDDGS(proxy=shared_settings.PROXY_URL, verify=False)
+        for _ in range(retries):
+            random_words = " ".join(random.sample(ENGLISH_WORDS, 3))
+            try:
                 results = list(ddg.text(random_words))
                 if results:
                     return random_words, results
-        except Exception as ex:
-            logger.error(f"Failed to get search results from DuckDuckGo: {ex}")
+            except Exception as ex:
+                logger.debug(f"Failed to get search results from DuckDuckGo: {ex}")
+        logger.warning(f"Failed to get search results from DuckDuckGo after {retries} tries")
         return None, None
 
     @staticmethod
@@ -41,19 +44,21 @@ def extract_website_content(url: str) -> Optional[str]:
             extracted = trafilatura.extract(website)
             return extracted[:MAX_CHARS] if extracted else None
         except Exception as ex:
-            logger.error(f"Failed to extract content from website {url}: {ex}")
+            logger.debug(f"Failed to extract content from website {url}: {ex}")
 
     def next(self) -> Optional[DDGDatasetEntry]:
-        search_term, results = self.search_random_term(retries=3)
+        search_term, results = self.search_random_term(retries=5)
         if not results:
             return None
         website_url = results[0]["href"]
         website_content = self.extract_website_content(website_url)
         if not website_content or len(website_content) == 0:
-            logger.error(f"Failed to extract content from website {website_url}")
+            logger.debug(f"Failed to extract content from website {website_url}")
             return None
 
-        return DDGDatasetEntry(search_term=search_term, website_url=website_url, website_content=website_content)
+        return DDGDatasetEntry(
+            search_term=search_term, website_url=website_url, website_content=website_content, source=website_url
+        )
 
     def get(self) -> Optional[DDGDatasetEntry]:
         return self.next()
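
The key behavioural change in `search_random_term` is that the `try` now sits inside the loop, so one failed request no longer aborts the remaining attempts. A minimal sketch of that per-attempt retry pattern, assuming the search backend and the query generator are passed in as plain callables; `search_with_retries` and its parameters are hypothetical names rather than code from this commit:

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def search_with_retries(
    search: Callable[[str], list],
    make_query: Callable[[], str],
    retries: int = 3,
) -> tuple[Optional[str], Optional[list]]:
    """Try up to `retries` fresh queries; an exception only costs one attempt."""
    for _ in range(retries):
        query = make_query()
        try:
            results = search(query)
            if results:
                return query, results
        except Exception as ex:
            # Per-attempt failures are expected, so they are logged quietly.
            logger.debug("Search failed for %r: %s", query, ex)
    # Only the overall failure is reported at a higher level.
    logger.warning("No search results after %d tries", retries)
    return None, None
```

With the old structure, the first exception left the loop entirely; here a backend that fails twice and then succeeds still returns a result on the third attempt.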

prompting/datasets/sn13.py

Lines changed: 6 additions & 51 deletions
@@ -2,14 +2,10 @@
 from typing import ClassVar
 
 import datasets
-import nltk
-from nltk.corpus import wordnet
 from pydantic import model_validator
 
 from shared.base import BaseDataset, ChatEntry
 
-nltk.download("wordnet")
-
 
 class SN13Dataset(BaseDataset):
     _url: ClassVar[str] = "arrmlet/x_dataset_218"
@@ -41,51 +37,10 @@ def sample(self) -> ChatEntry:
         if self.exception is not None:
             raise self.exception
         # Randomly select a sample from the dataset.
-        sample_idx = random.randint(0, len(self.dataset) - 1)
-        message = self.dataset[sample_idx]["text"]
-        role = ["user"]
-
-        # Augment the messages by modifying words and introducing errors.
-        messages = [self._augment_message(role, message)]
-
-        return ChatEntry(roles=role, messages=messages, organic=False, source=self._url)
-
-    def _augment_message(self, role: str, message: str) -> str:
-        if role == "assistant":
-            return message
-
-        words = message.split()
-        num_words_to_modify = random.randint(1, max(1, int(len(words) * self._chance_word_synonym)))
-        words_to_modify = random.sample(range(len(words)), num_words_to_modify)
-
-        for idx in words_to_modify:
-            synonym = self._get_synonym(words[idx])
-            if synonym:
-                words[idx] = synonym
-
-        message = " ".join(words)
-        message = self._introduce_typos(message)
-        return message
-
-    def _get_synonym(self, word: str) -> str:
-        synonyms = wordnet.synsets(word)
-        if synonyms:
-            # Choose a synonym that is not the word itself.
-            synonym_words = [lemma.name() for lemma in synonyms[0].lemmas() if lemma.name() != word]
-            if synonym_words:
-                return random.choice(synonym_words)
-        return word
-
-    def _introduce_typos(self, message: str) -> str:
-        message = list(message)
-        num_errors = random.randint(0, max(1, int(len(message) * self._chance_char_typo)))
-        for _ in range(num_errors):
-            error_type = random.choice(["remove", "add_space"])
-            error_position = random.randint(0, len(message) - 1)
-
-            if error_type == "remove":
-                message.pop(error_position)
-            elif error_type == "add_space":
-                message.insert(error_position, " ")
+        messages = []
+        for _ in range(4):
+            sample_idx = random.randint(0, len(self.dataset) - 1)
+            if message := self.dataset[sample_idx]["text"]:
+                messages.append({"role": random.choice(["user", "assistant"]), "content": message})
 
-        return "".join(message)
+        return ChatEntry(messages=messages, organic=False, source=self._url)
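
The rewritten `sample` drops the synonym and typo augmentation and instead draws up to four random texts, each tagged with a random role. A minimal sketch of that sampling step in isolation, assuming the dataset is a plain list of strings; `sample_messages` and its `rng` parameter are hypothetical and added only to make the behaviour testable:

```python
import random
from typing import Optional


def sample_messages(
    texts: list[str], n: int = 4, rng: Optional[random.Random] = None
) -> list[dict[str, str]]:
    """Draw n random texts and tag each non-empty one with a random chat role."""
    rng = rng if rng is not None else random.Random()
    messages = []
    for _ in range(n):
        idx = rng.randint(0, len(texts) - 1)
        # Empty texts are skipped, so fewer than n messages may be returned.
        if text := texts[idx]:
            messages.append({"role": rng.choice(["user", "assistant"]), "content": text})
    return messages
```

Because roles are chosen independently per message, the result can contain consecutive messages with the same role, or no `user` message at all.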
