Python: surface Gemini cached and thinking token counts in usage details by he-yufeng · Pull Request #6638 · microsoft/agent-framework

he-yufeng · 2026-06-20T00:07:22Z

Motivation & Context

The Gemini chat client only surfaces input, output, and total token counts in usage_details. Gemini's GenerateContentResponseUsageMetadata also reports cached_content_token_count (tokens served from context cache) and thoughts_token_count (tokens spent thinking by reasoning models), and _parse_usage drops both. For cached prompts and thinking models, cache and reasoning usage silently read as zero, which throws off cost and token accounting.

UsageDetails already defines canonical fields for these (cache_read_input_token_count, reasoning_output_token_count), and the OpenAI and Anthropic connectors already populate them, so Gemini was the odd one out.

Description & Review Guide

What are the major changes? `RawGeminiChatClient._parse_usage` now maps `cached_content_token_count` to `cache_read_input_token_count` and `thoughts_token_count` to `reasoning_output_token_count`, using the same `is not None` guard as the three existing fields.
What is the impact of these changes? Cache-read and reasoning token counts are now reported for Gemini, consistent with the OpenAI and Anthropic connectors. Responses that omit these fields are unchanged (the values stay unset).
What do you want reviewers to focus on? That the source field names match google-genai's usage metadata and the target keys match the UsageDetails contract.
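The mapping described above can be sketched roughly as follows. This is an illustrative, self-contained approximation and not the actual `_parse_usage` body: `FakeUsageMetadata` is a hypothetical stand-in for google-genai's usage metadata object, the result is a plain `dict` rather than the framework's `UsageDetails` type, and the three pre-existing source and target names (`prompt_token_count`, `candidates_token_count`, `input_token_count`, `output_token_count`) are assumptions not stated in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeUsageMetadata:
    """Hypothetical stand-in for google-genai's usage metadata object."""

    prompt_token_count: Optional[int] = None
    candidates_token_count: Optional[int] = None
    total_token_count: Optional[int] = None
    cached_content_token_count: Optional[int] = None
    thoughts_token_count: Optional[int] = None


def parse_usage(metadata: Optional[FakeUsageMetadata]) -> Optional[dict]:
    """Copy each reported count into a usage dict, skipping absent fields."""
    if metadata is None:
        return None

    details: dict = {}
    # Assumed pre-existing mappings (names are not confirmed by the PR text).
    if metadata.prompt_token_count is not None:
        details["input_token_count"] = metadata.prompt_token_count
    if metadata.candidates_token_count is not None:
        details["output_token_count"] = metadata.candidates_token_count
    if metadata.total_token_count is not None:
        details["total_token_count"] = metadata.total_token_count
    # The two mappings this PR describes.
    if metadata.cached_content_token_count is not None:
        details["cache_read_input_token_count"] = metadata.cached_content_token_count
    if metadata.thoughts_token_count is not None:
        details["reasoning_output_token_count"] = metadata.thoughts_token_count
    return details
```

Guarding on `is not None` rather than on truthiness means a reported count of `0` is still recorded, while a field the API omits leaves its key unset.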

Added test_get_response_usage_details_includes_cached_and_reasoning_tokens and extended the _make_response test helper with the two fields. The full test_gemini_client.py suite passes locally (113 passed, 8 integration skipped).
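For readers without the test file at hand, a helper extended in this way might look something like the sketch below. It is not the real `_make_response`: the function name `make_fake_response`, the use of `SimpleNamespace`, the single `text` argument, and the `usage_metadata` attribute layout are all assumptions made for illustration. Only the keyword names (`prompt_tokens`, `output_tokens`, `total_tokens`, `cached_tokens`, `thoughts_tokens`) come from the test in this PR.

```python
from types import SimpleNamespace
from typing import Optional


def make_fake_response(
    text: str,
    *,
    prompt_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    total_tokens: Optional[int] = None,
    cached_tokens: Optional[int] = None,
    thoughts_tokens: Optional[int] = None,
) -> SimpleNamespace:
    """Build a minimal fake response object carrying usage metadata.

    The two new keyword arguments default to None, so callers written
    before they existed keep producing responses without those counts.
    """
    usage = SimpleNamespace(
        prompt_token_count=prompt_tokens,
        candidates_token_count=output_tokens,
        total_token_count=total_tokens,
        cached_content_token_count=cached_tokens,
        thoughts_token_count=thoughts_tokens,
    )
    return SimpleNamespace(text=text, usage_metadata=usage)
```

Defaulting the new arguments to `None` keeps the existing tests untouched, which matches the stated behavior that responses omitting these fields are unchanged.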

Related Issue

Fixes #6637

Contribution Checklist

The code builds clean without any errors or warnings
All unit tests pass, and I have added new tests where possible
The PR follows the Contribution Guidelines
This PR is linked to an issue and there is no other open PR for this issue.
This is not a breaking change.

Copilot

Pull request overview

This PR updates the Python Gemini connector to surface additional token-usage metadata (cache-read and reasoning/thinking tokens) into the framework’s canonical UsageDetails fields, bringing Gemini in line with the existing OpenAI and Anthropic connectors for more accurate accounting.

Changes:

Map GenerateContentResponseUsageMetadata.cached_content_token_count → usage_details["cache_read_input_token_count"].
Map GenerateContentResponseUsageMetadata.thoughts_token_count → usage_details["reasoning_output_token_count"].
Extend the Gemini client unit tests to include these new usage fields.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `python/packages/gemini/agent_framework_gemini/_chat_client.py` | Adds the two missing usage metadata mappings into `_parse_usage`. |
| `python/packages/gemini/tests/test_gemini_client.py` | Updates the response helper to include cached/thinking token fields and adds a test validating the new usage keys. |

+async def test_get_response_usage_details_includes_cached_and_reasoning_tokens() -> None:
+    """Surfaces Gemini cached-content and thinking token counts into the canonical usage fields."""
+    client, mock = _make_gemini_client()
+    mock.aio.models.generate_content = AsyncMock(
+        return_value=_make_response(
+            [_make_part(text="Hi")],
+            prompt_tokens=20,
+            output_tokens=8,
+            total_tokens=28,
+            cached_tokens=12,
+            thoughts_tokens=6,
+        )
+    )
+
+    response = await client.get_response(messages=[Message(role="user", contents=[Content.from_text("Hi")])])
+
+    assert response.usage_details is not None
+    assert response.usage_details["cache_read_input_token_count"] == 12
+    assert response.usage_details["reasoning_output_token_count"] == 6
+
+


Python: surface Gemini cached and thinking token counts in usage details

748da1d

Copilot AI review requested due to automatic review settings June 20, 2026 00:07

moonbox3 added the python Issues related to the Python codebase label Jun 20, 2026

Copilot started reviewing on behalf of he-yufeng June 20, 2026 00:07

Copilot AI reviewed Jun 20, 2026


he-yufeng mentioned this pull request Jun 20, 2026

Python: surface Bedrock cache token counts in usage details #6640

Open

5 tasks

he-yufeng wants to merge 1 commit into microsoft:main from he-yufeng:fix/gemini-usage-cache-reasoning-tokens


3 participants
