Adding a range of multilingual evals #832

clefourrier · 2025-06-25T14:32:34Z

No description provided.

lewtun · 2025-06-26T07:44:02Z

src/lighteval/tasks/extended/misc/instruct.py

+
+BELEBELE_TASKS = [
+    LightevalTaskConfig(
+        name=f"belebele_instruct_{lang}_Latn",


Maybe we should call this `belebele_instruct_5_{lang}_Latn` or `belebele_instruct_smollm_{lang}_Latn` to distinguish it from the general case with more languages?

The alternative would be to have a separate `belebele_instruct_en_{lang}_{script}` for the full set of languages, but with English instructions.

Will add the latter, and have `belebele_native_inst_{lang}` vs `belebele_en_inst_{lang}` :)
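To make the naming discussed in this thread concrete, here is a minimal sketch of how the two variants could be generated per language. The helper `belebele_task_names` and the language codes are hypothetical and only illustrate the string pattern; they are not taken from the PR or from lighteval.

```python
from typing import Dict, List

# Placeholder language codes, assumed purely for illustration.
LANGS: List[str] = ["fra", "est"]


def belebele_task_names(lang: str) -> Dict[str, str]:
    """Hypothetical helper: build both task-name variants for one language."""
    return {
        # Instructions written in the target language.
        "native": f"belebele_native_inst_{lang}",
        # Same task, but with English instructions.
        "english": f"belebele_en_inst_{lang}",
    }


# Map each language code to its pair of task names.
task_names = {lang: belebele_task_names(lang) for lang in LANGS}
```

Keeping the instruction language in the name prefix lets the two variants sit side by side in a task registry without colliding.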

* too many false positives with the current gpqa metric extraction, making it more strict
* fixing whitespace and instruction in prompt
* better to have a strict extraction for index extraction in general actually
* added comment
* fix tests, need to invert condition
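As an illustration of why a strict extraction reduces false positives, the sketch below only accepts an answer when exactly one line has the form `Answer: X`. The answer format, the pattern, and the function name `extract_choice_index` are assumptions made for this example; this is not the extraction code changed in the PR.

```python
import re
from typing import Optional

# Assumed format: a whole line reading "Answer: X" or "Answer: (X)", with X in A-D.
STRICT_PATTERN = re.compile(r"^Answer:[ \t]*\(?([A-D])\)?[ \t]*$", re.MULTILINE)


def extract_choice_index(text: str) -> Optional[int]:
    """Hypothetical strict extractor: return the 0-based choice index, or None."""
    matches = STRICT_PATTERN.findall(text)
    # Reject both missing and ambiguous answers instead of guessing.
    if len(matches) != 1:
        return None
    return ord(matches[0]) - ord("A")
```

A loose pattern that accepts any standalone capital letter would treat prose such as "A or B could work" as an answer; requiring a dedicated answer line avoids that, at the cost of scoring unformatted answers as missing.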

Translations provided by Kairit Sirts

clefourrier and others added 5 commits June 25, 2025 14:29

added instruct specific evals

5b3cd26

tmp

75af52d

fix

ffa43cc

fix

a94973a

tmp

2852db8

clefourrier marked this pull request as draft June 25, 2025 16:02

focused data on actually existing languages

a549107

lewtun reviewed Jun 26, 2025

clefourrier and others added 5 commits June 26, 2025 09:23

Complete TranslationLiterals for Language.ESTONIAN (#779)

bc5a564

Translations provided by Kairit Sirts

belebele split

bae9544

updated prompts/instruction management

78a5c23

tmp

91ad573

clefourrier force-pushed the clem_mmlupro branch from ac99d10 to 91ad573 June 26, 2025 09:24

clefourrier added 4 commits June 26, 2025 09:26

added enforce eager

6f5d92f

small prompt fix

2a10826

one file per eval

2114c61

adding mgsm

ee6149e
