hw1a

Get the cookies for the Google Drive page first, otherwise the script will not work (see the official gdown page). You also need the evaluation .csv in the root folder.

hw1b_nlp2024

Put the .csv file containing the list of students and their assigned datasets in the root folder of the repo, then run:

python rebalance_datasets_hw1b.py

The .csv schema should be as follows:

['University ID',   # mandatory in position 1
 'In need for Italian speakers or for more project members (Y/N)',
 'Group ID (make sure the ID you specify is NOT already present in previous rows, unless the student belongs to the same group!)',
 'Assigned datasets', # mandatory in position 3
 'New assigned dataset (for students previously having only one)',
 'Assigned distractor dataset (EXTRA)'
]
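
Before running the script it can help to check that the header of the .csv contains the columns listed above. The following is only a sketch: the function name `missing_columns` is hypothetical and is not part of rebalance_datasets_hw1b.py, and it only checks that the columns are present, not their positions.

```python
import csv
import io

# Column names copied from the schema above.
EXPECTED_COLUMNS = [
    'University ID',
    'In need for Italian speakers or for more project members (Y/N)',
    'Group ID (make sure the ID you specify is NOT already present in previous rows, unless the student belongs to the same group!)',
    'Assigned datasets',
    'New assigned dataset (for students previously having only one)',
    'Assigned distractor dataset (EXTRA)',
]


def missing_columns(csv_file):
    """Return the expected column names that are absent from the header row.

    `csv_file` is any open text file object (hypothetical helper, not the
    script's own validation).
    """
    header = next(csv.reader(csv_file), None)
    if header is None:
        raise ValueError("the .csv file is empty")
    present = {name.strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Usage on an in-memory file; with a real file, open it with newline=''.
sample = io.StringIO()
csv.writer(sample).writerow(EXPECTED_COLUMNS)
sample.seek(0)
print(missing_columns(sample))  # an empty list means the header is complete
```

The `csv` module is used instead of splitting on commas because the 'Group ID' column name itself contains a comma.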

Example output:

> Distribution of HW1B datasets BEFORE (mean 9.333):
0  |#######
1  |##########
5  |######
7  |#######
8  |#######
9  |###############################
18 |########
21 |#######
22 |######
24 |########
27 |########
28 |#######

> Distribution of HW1B datasets AFTER (mean 8.083):
0  |######
1  |######
5  |#####
7  |#####
8  |######
9  |##############################
18 |######
21 |#####
22 |######
24 |########
27 |#######
28 |#######
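
In the output above, each row is a dataset ID followed by one `#` per assigned student, and the reported mean appears to be the average number of assignments per dataset (total bar length divided by the number of datasets). A minimal sketch of how such a histogram could be produced, assuming a plain dict from dataset ID to count; the function name `format_distribution` and the sample counts are hypothetical and not taken from the script:

```python
def format_distribution(counts, label):
    """Render a text histogram in the same layout as the example output.

    `counts` maps dataset ID -> number of students assigned to it.
    """
    mean = sum(counts.values()) / len(counts)
    lines = [f"> Distribution of HW1B datasets {label} (mean {mean:.3f}):"]
    for dataset_id in sorted(counts):
        # Left-align the ID in a 3-character column, then one '#' per student.
        lines.append(f"{dataset_id:<3}|{'#' * counts[dataset_id]}")
    return "\n".join(lines)


# Hypothetical counts, for illustration only.
example_counts = {0: 2, 1: 4, 18: 3}
print(format_distribution(example_counts, "BEFORE"))
```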

DeerLu1220/nlp2024_homeworks

Folders and files

Latest commit

History

Repository files navigation

hw1a

hw1b_nlp2024

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages