This document contains additional context on the settings and parameters for how we evaluated the models.
### Auto-eval benchmark notes
#### MMLU
- We are reporting macro averages for MMLU benchmarks. The micro average numbers for MMLU are: 65.4 and 67.4 for the 8B pre-trained and instruct-aligned models, and 78.9 and 82.0 for the 70B pre-trained and instruct-aligned models (see the macro/micro averaging sketch after this list).
- The pre-trained models are evaluated in the standard way by calculating the likelihood of each choice character. For the instruct-aligned models, we use a dialogue prompt (*user/assistant*) for the shots and ask the model to generate the best choice character as the answer (see the scoring sketch below).
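To make the macro/micro distinction concrete: macro averaging takes the mean of per-subject accuracies (each subject weighted equally), while micro averaging pools all questions (each question weighted equally). A minimal sketch in Python; the subject names and per-subject tallies below are made up for illustration:

```python
# Hypothetical (num_correct, num_questions) tallies per MMLU subject.
# Subject names and counts are made up for illustration.
results = {
    "abstract_algebra": (61, 100),
    "anatomy": (90, 135),
    "world_religions": (140, 171),
}

# Macro average: mean of per-subject accuracies (subjects weighted equally).
macro = sum(c / n for c, n in results.values()) / len(results)

# Micro average: pooled accuracy over all questions (questions weighted equally).
micro = sum(c for c, _ in results.values()) / sum(n for _, n in results.values())

print(f"macro: {macro:.3f}  micro: {micro:.3f}")
```

The two aggregates differ whenever subjects have unequal question counts, which is why the macro and micro MMLU numbers above diverge slightly.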
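For the pre-trained (likelihood-based) setup, here is a minimal sketch of how choice-character scoring typically works, written against the Hugging Face `transformers` API. This is not the exact harness behind the numbers above: the model name is a placeholder, `score_choices` is a hypothetical helper, and it assumes each choice letter (with a leading space) encodes to a single token:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM can be scored the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def score_choices(prompt: str, choices=("A", "B", "C", "D")) -> str:
    """Return the choice character with the highest next-token log-likelihood."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    log_probs = torch.log_softmax(next_token_logits, dim=-1)
    # Assumes each choice letter (with a leading space) is a single token.
    choice_ids = [tokenizer.encode(" " + c)[-1] for c in choices]
    scores = {c: log_probs[i].item() for c, i in zip(choices, choice_ids)}
    return max(scores, key=scores.get)

prompt = "Question: 2 + 2 = ?\nA. 3\nB. 4\nC. 5\nD. 6\nAnswer:"
print(score_choices(prompt))
```

For the instruct-aligned setup described above, the shots would instead be rendered as *user/assistant* turns and the model's generated choice character compared against the gold answer.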
#### AGI English
- We use the default few-shot and prompt settings as specified [here](https://github.com/ruixiangcui/AGIEval). The score is averaged over the English subtasks.