Closed
Description
Hi,
I need some help. I am trying to evaluate my pre-trained model on the Thai FineTasks. Here is my command:
export CUDA_VISIBLE_DEVICES="0,1"
echo "Running lighteval for model: meta-llama/Llama-3.2-3B"
lighteval accelerate \
"pretrained=meta-llama/Llama-3.2-3B,dtype=bfloat16,model_parallel=True" \
"examples/tasks/fine_tasks/mcf/th.txt" \
--custom-tasks "src/lighteval/tasks/multilingual/tasks.py" \
--dataset-loading-processes 8 \
--cache-dir "./le_cache" \
--no-use-chat-template \
--override-batch-size 4
When I ran this command, I got the following error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 6.02 GiB. GPU 0 has a total capacity of 39.59 GiB of which 5.52 GiB is free.
Process 35878 has 674.00 MiB memory in use. Process 32242 has 33.41 GiB memory in use. Of the allocated memory 28.39 GiB is allocated by
PyTorch, and 3.70 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
(https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
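For reference, the fragmentation workaround the error message suggests can be applied like this before re-running the command (a sketch; note it only mitigates fragmentation inside this process and cannot reclaim the 33 GiB already held by the other PID on GPU 0):

```shell
# Allocator workaround from the error message: let PyTorch grow segments
# instead of fragmenting. Set it in the same shell before launching lighteval.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
echo "PYTORCH_CUDA_ALLOC_CONF=$PYTORCH_CUDA_ALLOC_CONF"
```

Checking `nvidia-smi` for the competing process (32242 in the traceback) before launching would also confirm whether the GPU is actually free.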
This is strange to me, since my batch size is very small. Here is my GPU machine's spec:

The only thing I can guess is that I have 7 datasets to evaluate, but I still have no idea:
# mcf.th.txt
# General Knowledge (GK)
lighteval|meta_mmlu_tha_mcf|5|1
lighteval|m3exams_tha_mcf|5|1
# Reading Comprehension (RC)
lighteval|belebele_tha_Thai_mcf|5|1
lighteval|thaiqa_tha|5|1
lighteval|xquad_tha|5|1
# Natural Language Understanding (NLU)
lighteval|community_hellaswag_tha_mcf|5|1
lighteval|xnli2.0_tha_mcf|5|1
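A back-of-envelope check (my own sketch, using the numbers from the error message above and an approximate parameter count for Llama-3.2-3B) suggests the batch size is not the problem: the other process already occupying the GPU is.

```python
# Rough memory arithmetic: why a 3B bfloat16 model OOMs on a 40 GiB GPU
# that another process already occupies (figures taken from the traceback).
PARAMS = 3.2e9          # approximate Llama-3.2-3B parameter count (assumption)
BYTES_PER_PARAM = 2     # bfloat16

weights_gib = PARAMS * BYTES_PER_PARAM / 2**30
total_gib = 39.59       # GPU 0 capacity reported in the error
already_used_gib = 33.41  # memory held by the other process (PID 32242)
free_gib = total_gib - already_used_gib

print(f"weights alone: {weights_gib:.2f} GiB, free on GPU 0: {free_gib:.2f} GiB")
# The weights alone (~6 GiB) plus KV cache and activations for 5-shot prompts
# already exceed the ~6 GiB left free, regardless of how small the batch is.
```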
Any ideas?
I am using lighteval 0.6.0.dev0 and torch 2.2.2+cu121. I cloned this repo and ran pip install -e .[dev].