
google-common: Support Gemini 2.0 Thinking model #7434

Open
afirstenberg opened this issue Dec 27, 2024 · 5 comments
Labels
auto:improvement Medium size change to existing code to handle new use-cases

Comments

@afirstenberg
Contributor

Privileged issue

  • I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.

Issue Content

Note this is NOT about general Gemini 2.0 compatibility. This is specifically about the "gemini-2.0-flash-thinking-exp" model (and related "reasoning" models).

There are several issues:

  • The "reasoning tokens" are available in the output, but they are returned in a separate text part from the final result
    • So part[0] contains the reasoning and part[1] contains the final result
    • If using API version "v1alpha" (at least on AI Studio), a "thinking" attribute is available to indicate which part is the reasoning and which is the final result
    • The current logic tries to merge text parts together. This is arguably undesirable; at the very least, it breaks some of the tests.
    • We need to see how this works with streaming.
    • We need to discuss the best way to handle this in general - not just for Gemini.
  • Currently, this model doesn't support tools.
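To make the part layout concrete, here is a minimal sketch of how the reasoning part might be separated from the final answer. The `thought` flag and the part shape below are assumptions modeled on the v1alpha behavior described above, not a confirmed API contract:

```typescript
// Hypothetical shape of a text part returned by
// "gemini-2.0-flash-thinking-exp" (field names are assumptions).
interface TextPart {
  text: string;
  thought?: boolean; // v1alpha reportedly marks reasoning parts this way
}

// Split parts into reasoning and final-answer text instead of
// merging all text parts together.
function splitThinkingParts(parts: TextPart[]): { reasoning: string; answer: string } {
  const reasoning = parts.filter((p) => p.thought).map((p) => p.text).join("");
  const answer = parts.filter((p) => !p.thought).map((p) => p.text).join("");
  return { reasoning, answer };
}

const parts: TextPart[] = [
  { text: "First I consider the options...", thought: true }, // part[0]: reasoning
  { text: "The answer is 42." },                              // part[1]: final result
];
const { reasoning, answer } = splitThinkingParts(parts);
```

Without the v1alpha `thought` attribute, the only signal would be part position, which is more fragile (especially under streaming).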

My thinking about handling the reasoning tokens:

  • For compatibility with other models - they should be removed from normal output.
  • They should be included in the response_metadata as a "reasoning" attribute, or something similar.
  • We should have a filter that can, optionally, put them back together.
  • Should any of this be controlled with a parameter?
  • I don't think this needs to be a new class.
@dosubot dosubot bot added the auto:improvement Medium size change to existing code to handle new use-cases label Dec 27, 2024
@shlroland

Furthermore, if structured output is required, this model cannot use function calling the way the original vertexai provider does; instead, the schema must be passed in the responseSchema, and that schema format supports neither zodSchema nor the complete JSON Schema spec.
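For illustration, a structured-output request in this mode would carry the schema in the generation config rather than in a tool declaration. The field names below follow my understanding of the Gemini REST API's `generationConfig`, and should be treated as an assumption:

```typescript
// Sketch of a request body using responseSchema instead of function calling.
// Note the schema is a restricted OpenAPI-style subset, not full JSON Schema,
// so a zodSchema would need conversion and may not map cleanly.
const requestBody = {
  contents: [{ role: "user", parts: [{ text: "List three colors." }] }],
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: { name: { type: "STRING" } },
        required: ["name"],
      },
    },
  },
};
```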

@afirstenberg
Contributor Author

@shlroland - Just making sure I understand what you're saying. Is that aligned with the issue that was discussed in #7401?

@shlroland

shlroland commented Jan 15, 2025

Yes, the content discussed in #7401 is exactly what I mean. Have you started work on adapting Gemini's built-in structured output?

@afirstenberg
Contributor Author

I haven't yet. If you want to take it on - leave a note on #7401 and you're welcome to do so!
Happy to provide guidance, but it sounds like you have the concept well in hand.

@shlroland

> I haven't yet. If you want to take it on - leave a note on #7401 and you're welcome to do so! Happy to provide guidance, but it sounds like you have the concept well in hand.

I have just started using LangChain for some development work. I found that I cannot use Gemini's native structured output, looked at the relevant code in LangChain, and identified the approximate cause of the problem. If I have time, I will try to contribute a fix.
