
[Bug] Finetuning fails with ERROR dspy.clients.lm: name 'max_seq_length' is not defined #7821

Open
ratzrattillo opened this issue Feb 20, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@ratzrattillo

What happened?

I noticed that when running finetuning with DSPy 2.6.5, I run into:

ERROR dspy.clients.lm: name 'max_seq_length' is not defined

This happened when using Mixtral served via Ollama as the teacher and a local meta-llama/Llama-3.2-1B-Instruct served through LocalProvider as the student.
The compile step fails after producing the following output:

2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Preparing the student and teacher programs...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Bootstrapping data...
Average Metric: 10.00 / 10 (100.0%): 100%|██████████| 10/10 [00:00<00:00, 74.34it/s]
2025/02/20 09:16:28 INFO dspy.evaluate.evaluate: Average Metric: 10 / 10 (100.0%)
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Preparing the train data...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Using 10 data points for fine-tuning the model: openai/local:meta-llama/Llama-3.2-1B-Instruct
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Starting LM fine-tuning...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: 1 fine-tuning job(s) to start
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Starting 1 fine-tuning job(s)...
2025/02/20 09:16:28 INFO dspy.teleprompt.bootstrap_finetune: Calling lm.kill() on the LM to be fine-tuned to free up resources. This won't have any effect if the LM is not running.
2025/02/20 09:16:28 INFO dspy.clients.lm_local: No running server to kill.
2025/02/20 09:16:28 INFO dspy.clients.lm_local: Starting local training, will save to /home/vilocify/.dspy_cache/finetune/ab31db9fad110d48_prngex_meta-llama-Llama-3.2-1B-Instruct_2025-02-20_09-16-28

2025/02/20 09:16:31 INFO dspy.clients.lm_local: Using device: cuda
2025/02/20 09:16:33 INFO dspy.clients.lm_local: Adding pad token to tokenizer
2025/02/20 09:16:33 INFO dspy.clients.lm_local: Creating dataset
Map:   0%|  0/10 [00:00<?, ? examples/s]
2025/02/20 09:16:33 ERROR dspy.clients.lm: name 'max_seq_length' is not defined
2025/02/20 09:16:33 INFO dspy.teleprompt.bootstrap_finetune: Job 1/1 is done
2025/02/20 09:16:33 INFO dspy.teleprompt.bootstrap_finetune: Updating the student program with the fine-tuned LMs...
2025/02/20 09:16:33 INFO dspy.teleprompt.bootstrap_finetune: BootstrapFinetune has finished compiling the student program
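The error message suggests that `max_seq_length` is referenced inside `dspy.clients.lm_local` on a code path where it was never assigned. A common cause of this class of bug is a name that is only bound when an optional dependency imports successfully. The following is an illustrative, hypothetical sketch of that failure mode (NOT DSPy's actual code):

```python
# Illustrative sketch: a name bound only when an optional dependency
# imports successfully, with no fallback on the failing branch.
try:
    import some_optional_lib  # hypothetical optional dependency, absent here
    max_seq_length = some_optional_lib.DEFAULT_MAX_SEQ_LENGTH
except ImportError:
    pass  # bug: no default is assigned on this branch

def tokenize(text: str) -> str:
    # If the import failed, this raises:
    #   NameError: name 'max_seq_length' is not defined
    return text[:max_seq_length]

# A defensive fix is to bind a default on every path, e.g.:
#   except ImportError:
#       max_seq_length = 2048
```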

Steps to reproduce

Example Code:

import dspy
from dspy.clients.lm_local import LocalProvider

teacher_lm = dspy.LM(
    model="ollama_chat/mixtral",
    model_type="chat",
    api_base="http://ie-ollama:11434",
    max_tokens=4096,
)

student_lm_name = "meta-llama/Llama-3.2-1B-Instruct"
student_lm = dspy.LM(
    model=f"openai/local:{student_lm_name}", provider=LocalProvider(), max_tokens=4096
)

classifier = dspy.ChainOfThought(YourSignature)  # YourSignature: your task's dspy.Signature

student_classify = classifier.deepcopy()
student_classify.set_lm(student_lm)

teacher_classify = classifier.deepcopy()
teacher_classify.set_lm(teacher_lm)

dspy.settings.experimental = True

optimizer = dspy.BootstrapFinetune(
    num_threads=16
)  # if you *do* have labels, pass metric=your_metric here!
# unlabeled_trainset: your list of dspy.Example inputs
classify_ft = optimizer.compile(
    student=student_classify, teacher=teacher_classify, trainset=unlabeled_trainset[0:10]
)

classify_ft.get_lm() # -> NameError("name 'max_seq_length' is not defined")
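Since DSPy logs only the error message and not the stack trace, locating the offending line means capturing the traceback yourself. A generic diagnostic sketch (the failing call is simulated by a stand-in function here; in the real setup one would wrap `classify_ft.get_lm()` instead):

```python
import traceback

def failing_call():
    # stand-in for the library call that raises the NameError
    return undefined_name  # noqa: F821

try:
    failing_call()
except NameError:
    # print_exc() shows the file and line where the undefined name is
    # referenced, which pinpoints the bug inside the library
    traceback.print_exc()
```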

Is this a bug in dspy, or am I using the finetuning methods incorrectly?

DSPy version

2.6.5

@ratzrattillo ratzrattillo added the bug Something isn't working label Feb 20, 2025
@khankamranali

I am also getting the same error with version 2.6.5.
