
google-common: Support Gemini 2.0 Thinking model #7434

Open
afirstenberg opened this issue Dec 27, 2024 · 5 comments
Labels
auto:improvement Medium size change to existing code to handle new use-cases

Comments

@afirstenberg
Contributor

Privileged issue

  • I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.

Issue Content

Note this is NOT about general Gemini 2.0 compatibility. This is specifically about the "gemini-2.0-flash-thinking-exp" model (and related "reasoning" models).

There are several issues:

  • The "reasoning tokens" are available in the output, but they are returned in a separate text part from the final result
    • So part[0] contains the reasoning and part[1] contains the final result
    • If using API version "v1alpha" (at least on AI Studio), a "thinking" attribute is available to indicate which part is the reasoning and which is the final result
    • The current logic tries to merge text parts together. This is arguably undesirable; at the very least, it breaks some of the tests.
    • We need to see how this works with streaming.
    • We need to discuss the best way to handle this in general - not just for Gemini.
  • Currently, this model doesn't support tools.
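To make the part layout concrete, here is a minimal sketch of how the reasoning part might be separated from the final answer. The `thought` flag and the part shape below are assumptions modeled on the v1alpha behavior described above, not a confirmed API contract:

```typescript
// Hypothetical shape of a text part returned by
// "gemini-2.0-flash-thinking-exp" (field names are assumptions).
interface TextPart {
  text: string;
  thought?: boolean; // v1alpha reportedly marks reasoning parts this way
}

// Split parts into reasoning and final-answer text instead of
// merging all text parts together.
function splitThinkingParts(parts: TextPart[]): { reasoning: string; answer: string } {
  const reasoning = parts.filter((p) => p.thought).map((p) => p.text).join("");
  const answer = parts.filter((p) => !p.thought).map((p) => p.text).join("");
  return { reasoning, answer };
}

const parts: TextPart[] = [
  { text: "First I consider the options...", thought: true }, // part[0]: reasoning
  { text: "The answer is 42." },                              // part[1]: final result
];
const { reasoning, answer } = splitThinkingParts(parts);
```

Without the v1alpha `thought` attribute, the only signal would be part position, which is more fragile (especially under streaming).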

My thinking about handling the reasoning tokens:

  • For compatibility with other models - they should be removed from normal output.
  • They should be included in the response_metadata as a "reasoning" attribute, or something similar.
  • We should have a filter that can, optionally, put them back together.
  • Should any of this be controlled with a parameter?
  • I don't think this needs to be a new class.
@dosubot dosubot bot added the auto:improvement Medium size change to existing code to handle new use-cases label Dec 27, 2024
@shlroland

Furthermore, if structured output is required, this model cannot use function calling the way the original vertexai provider does; instead, the schema must be passed in the responseSchema, and that schema format supports neither zodSchema nor the complete JSON Schema spec.
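For illustration, a structured-output request in this mode would carry the schema in the generation config rather than in a tool declaration. The field names below follow my understanding of the Gemini REST API's `generationConfig`, and should be treated as an assumption:

```typescript
// Sketch of a request body using responseSchema instead of function calling.
// Note the schema is a restricted OpenAPI-style subset, not full JSON Schema,
// so a zodSchema would need conversion and may not map cleanly.
const requestBody = {
  contents: [{ role: "user", parts: [{ text: "List three colors." }] }],
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: { name: { type: "STRING" } },
        required: ["name"],
      },
    },
  },
};
```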

@afirstenberg
Contributor Author

@shlroland - Just making sure I understand what you're saying. Is that aligned with the issue that was discussed in #7401?

@shlroland

shlroland commented Jan 15, 2025

Yes, the content discussed in #7401 is exactly what I mean. Have you started work on adapting Gemini's built-in structured output?

@afirstenberg
Contributor Author

I haven't yet. If you want to take it on - leave a note on #7401 and you're welcome to do so!
Happy to provide guidance, but it sounds like you have the concept well in hand.

@shlroland

> I haven't yet. If you want to take it on - leave a note on #7401 and you're welcome to do so! Happy to provide guidance, but it sounds like you have the concept well in hand.

I have just started using LangChain for some development work. I found that I cannot use Gemini's native structured output, looked at the relevant code in LangChain, and identified the approximate cause of the problem. If I have time, I will try to contribute a fix.
