
[Feature] Improve clarity about LLM configs in the documentation #7808

Open
Devy99 opened this issue Feb 14, 2025 · 6 comments
Labels
enhancement New feature or request

Comments

@Devy99

Devy99 commented Feb 14, 2025

What feature would you like to see?

Currently, it is possible to set configurations such as the LLM temperature both via dspy.LM(...) (documentation link) and via dspy.Predict() (and the other modules too, documentation link). However, the documentation does not explain what happens when we set different temperatures using both approaches.

For example:

import dspy

# Setting the temperature to 0.9 at LLM initialization
lm = dspy.LM('openai/gpt-4o-mini', temperature=0.9)
dspy.configure(lm=lm)

sentence = "it's a charming and often affecting journey."  # example from the SST-2 dataset.

# Setting the temperature to 0.2 at Module initialization
classify = dspy.Predict('sentence -> sentiment', temperature=0.2)

In the example above, I would expect the temperature set in the dspy.Predict module to override the initial LLM configuration. However, looking at the Predict class (source code), it seems to be the opposite (I would also appreciate confirmation):

def forward(self, **kwargs):
    import dspy

    # Extract the three privileged keyword arguments.
    assert "new_signature" not in kwargs, "new_signature is no longer a valid keyword argument."
    signature = ensure_signature(kwargs.pop("signature", self.signature))
    demos = kwargs.pop("demos", self.demos)

    # Here, the configuration (self.config) provided in the Predict constructor is merged
    # with kwargs["config"], and overridden by the latter where keys are already specified.
    config = dict(**self.config, **kwargs.pop("config", {}))
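As a side note on plain Python semantics (not DSPy-specific), the two merge idioms that appear in this thread behave differently on shared keys, which matters exactly in the override case:

a = {"temperature": 0.2}
b = {"temperature": 0.7}

merged = {**a, **b}  # the later mapping wins on shared keys: {'temperature': 0.7}

# dict(**a, **b) would instead raise a TypeError (duplicate keyword
# argument 'temperature'), so that form only merges disjoint keys.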

Would it be possible to clarify this scenario in the documentation?

Would you like to contribute?

  • Yes, I'd like to help implement this.
  • No, I just want to request it.

Additional Context

No response

@Devy99 Devy99 added the enhancement New feature or request label Feb 14, 2025
@Devy99
Author

Devy99 commented Feb 17, 2025

@okhat, sorry to bother you: can you confirm whether, in the provided example, the temperature set in the LLM configuration overrides the one provided in the module?

@xaviermehaut

Have you tried:

with dspy.context(lm=dspy.LM('openai/gpt-4o-mini', temperature=0.9)):
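A minimal sketch of that approach, assuming dspy.context scopes the override to the with block and restores the globally configured LM on exit:

import dspy

# Global default LM.
dspy.configure(lm=dspy.LM('openai/gpt-4o-mini', temperature=0.2))

classify = dspy.Predict('sentence -> sentiment')

# Calls inside this block use the higher-temperature LM; the global
# configuration takes effect again after the block exits.
with dspy.context(lm=dspy.LM('openai/gpt-4o-mini', temperature=0.9)):
    result = classify(sentence="it's a charming and often affecting journey.")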

@Devy99
Author

Devy99 commented Feb 17, 2025

@xaviermehaut not yet. My current use case consists of creating a custom module with several Predict / ChainOfThought modules, using the same LLM but with different temperatures.
Imagine something like this:

class CustomModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classify = dspy.Predict('sentence -> sentiment', temperature=0)
        self.answer = dspy.ChainOfThought('question -> answer', temperature=0.2)
        ...

However, from the documentation it is not clear what happens when I also set the temperature in the LLM configuration, like:
lm = dspy.LM('openai/gpt-4o-mini', temperature=0.2)

In this case, when I use self.classify, does it use a temperature of 0 or 0.2?

Right now, I am setting the temperature only in the individual modules (i.e., self.answer = dspy.ChainOfThought('question -> answer', temperature=0.2)), but I am also curious to know what the configuration priority is. One way to check empirically which value was actually used is sketched below.
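A rough sketch, assuming a DSPy version where dspy.LM records each call in lm.history, and that the history entries expose the request kwargs (the exact entry layout is an assumption and may vary across versions):

import dspy

lm = dspy.LM('openai/gpt-4o-mini', temperature=0.2)
dspy.configure(lm=lm)

classify = dspy.Predict('sentence -> sentiment', temperature=0)
classify(sentence="it's a charming and often affecting journey.")

# Inspect the last recorded call; the "kwargs" key is an assumption
# about the history entry format.
print(lm.history[-1].get("kwargs", {}).get("temperature"))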

@xaviermehaut

xaviermehaut commented Feb 17, 2025 via email

@chenmoneygithub
Collaborator

@Devy99 Thanks for reporting the issue! It's truly confusing that we allow settings at different layers, both at construction time and at call time.

To your original question: the code you pasted, config = dict(**self.config, **kwargs.pop("config", {})), is not related to the LM you set; it means that call-time arguments to dspy.Predict override the values set in the constructor. The merged result then goes on to override the one you set in dspy.LM:

kwargs = {**self.kwargs, **kwargs}
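Putting the two merge layers together, a minimal sketch of the resulting precedence (call-time kwargs > Predict constructor config > dspy.LM defaults), using plain dicts to stand in for the real objects:

# Stand-ins for the three layers of settings.
lm_defaults = {"temperature": 0.9}      # dspy.LM('openai/gpt-4o-mini', temperature=0.9)
predict_config = {"temperature": 0.2}   # dspy.Predict('sentence -> sentiment', temperature=0.2)
call_time = {}                          # nothing overridden at call time

# In Predict.forward, call-time config wins over the constructor's.
config = {**predict_config, **call_time}

# In the LM call, the merged module config wins over the LM's own defaults.
final = {**lm_defaults, **config}

print(final["temperature"])  # 0.2 -- the module-level setting is used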

@Devy99
Author

Devy99 commented Feb 19, 2025

@chenmoneygithub thanks!

@Devy99 Devy99 closed this as completed Feb 19, 2025
@okhat okhat reopened this Feb 23, 2025