What happened?
I noticed that when running finetuning with DSPy 2.6.5 I run into:
ERROR dspy.clients.lm: name 'max_seq_length' is not defined
This happened when using Mixtral hosted on Ollama as the teacher together with a local meta-llama/Llama-3.2-1B-Instruct student via LocalProvider.
The compile step fails, producing the following output:
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Preparing the student and teacher programs...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Bootstrapping data...
Average Metric: 10.00 / 10 (100.0%): 100%|██████████| 10/10 [00:00<00:00, 74.34it/s]
2025/02/20 09:16:28 INFO dspy.evaluate.evaluate: Average Metric: 10 / 10 (100.0%)
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Preparing the train data...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Using 10 data points for fine-tuning the model: openai/local:meta-llama/Llama-3.2-1B-Instruct
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Starting LM fine-tuning...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: 1 fine-tuning job(s) to start
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Starting 1 fine-tuning job(s)...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Calling lm.kill() on the LM to be fine-tuned to free up resources. This won't have any effect if the LM is not running.
2025/02/20 09:16:28 INFO dspy.clients.lm_local: No running server to kill.
2025/02/20 09:16:28 INFO dspy.clients.lm_local: Starting local training, will save to /home/vilocify/.dspy_cache/finetune/ab31db9fad110d48_prngex_meta-llama-Llama-3.2-1B-Instruct_2025-02-20_09-16-28
2025/02/20 09:16:31 INFO dspy.clients.lm_local: Using device: cuda
2025/02/20 09:16:33 INFO dspy.clients.lm_local: Adding pad token to tokenizer
2025/02/20 09:16:33 INFO dspy.clients.lm_local: Creating dataset
Map:   0% | 0/10 [00:00<?, ? examples/s]
2025/02/20 09:16:33 ERROR dspy.clients.lm: name 'max_seq_length' is not defined
2025/02/20 09:16:33 INFO dspy.teleprompt.bootstrap_finetune: Job 1/1 is done
2025/02/20 09:16:33 INFO dspy.teleprompt.bootstrap_finetune: Updating the student program with the fine-tuned LMs...
2025/02/20 09:16:33 INFO dspy.teleprompt.bootstrap_finetune: BootstrapFinetune has finished compiling the student program
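For what it's worth, the shape of the error suggests a bug rather than misuse: a Python `NameError` like this is raised at call time when a function body references a name that was never defined in any enclosing scope. The sketch below is not the dspy source, just a minimal stand-in (`build_tokenize_fn` and `tokenize` are hypothetical names) that reproduces the same symptom a dataset-mapping callback would show:

```python
# Hypothetical stand-in, NOT the dspy source: a callback that references
# 'max_seq_length' without it being defined anywhere in scope.
def build_tokenize_fn():
    def tokenize(example):
        # NameError is raised here at call time, not when the function is defined.
        return {"len": max_seq_length}
    return tokenize

try:
    build_tokenize_fn()({"text": "hello"})
except NameError as e:
    print(e)  # name 'max_seq_length' is not defined
```

Note that the error only surfaces once the mapping actually runs, which matches the log: dataset creation starts ("Map: 0%") and then immediately fails.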
Steps to reproduce
Example Code:
import ollama
import dspy
from dspy.clients.lm_local import LocalProvider

ollama_client = ollama.Client(host="http://ollama:11434")

teacher_lm = dspy.LM(
    model="ollama_chat/mixtral",
    model_type="chat",
    api_base="http://ie-ollama:11434",
    max_tokens=4096,
)

student_lm_name = "meta-llama/Llama-3.2-1B-Instruct"
student_lm = dspy.LM(
    model=f"openai/local:{student_lm_name}", provider=LocalProvider(), max_tokens=4096
)

classifier = dspy.ChainOfThought(YourSignature)

student_classify = classifier.deepcopy()
student_classify.set_lm(student_lm)

teacher_classify = classifier.deepcopy()
teacher_classify.set_lm(teacher_lm)

dspy.settings.experimental = True
optimizer = dspy.BootstrapFinetune(
    num_threads=16
)  # if you *do* have labels, pass metric=your_metric here!

classify_ft = optimizer.compile(
    student=student_classify, teacher=teacher_classify, trainset=unlabeled_trainset[0:10]
)
classify_ft.get_lm()  # -> NameError("name 'max_seq_length' is not defined")
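If the missing name is resolved through the module globals of dspy.clients.lm_local (the normal lookup path for a bare name inside a function), then injecting the name into that module's globals should unblock training until the bug is fixed upstream. The sketch below demonstrates the mechanism on a stand-in module rather than dspy itself; whether it applies here, and what value is appropriate, are assumptions:

```python
import types

# Stand-in for dspy.clients.lm_local: a module whose function references a
# global name that was never defined, mirroring the reported NameError.
mod = types.ModuleType("lm_local_standin")
exec(
    "def tokenize(example):\n"
    "    return min(len(example), max_seq_length)\n",
    mod.__dict__,
)

try:
    mod.tokenize("hello world")
except NameError as e:
    print(e)  # name 'max_seq_length' is not defined

# Injecting the missing name into the module's globals makes the call work,
# because the function looks the name up in its defining module's globals.
mod.max_seq_length = 5
print(mod.tokenize("hello world"))  # 5
```

Against dspy itself the equivalent would be setting an attribute on the real module before compiling (e.g. `dspy.clients.lm_local.max_seq_length = 2048`), but the attribute name and value are guesses based on the error message, not a documented API.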
Is this a bug in DSPy, or incorrect usage of the finetuning methods?
DSPy version
2.6.5