Skip to content

Python: Fix Bedrock non-ASCII escaping in JSON content blocks#6628

Open
kimnamu wants to merge 1 commit into
microsoft:mainfrom
kimnamu:fix/bedrock-non-ascii
Open

Python: Fix Bedrock non-ASCII escaping in JSON content blocks#6628
kimnamu wants to merge 1 commit into
microsoft:mainfrom
kimnamu:fix/bedrock-non-ascii

Conversation

@kimnamu

@kimnamu kimnamu commented Jun 19, 2026

Copy link
Copy Markdown

Thanks for Agent Framework and the Bedrock integration — it's a pleasure to build on.

Closes #6627

Problem

When the Bedrock Converse API returns a structured json content block, BedrockChatClient._parse_message_contents serializes it to text with json.dumps(json_value). Since json.dumps defaults to ensure_ascii=True, non-ASCII characters (CJK, emoji, accented Latin, etc.) are escaped to \uXXXX and reach the user garbled.

Cause

This single call is the outlier. The sibling OpenAI client (python/packages/openai/agent_framework_openai/_chat_client.py) and 16+ other call sites across the repo already serialize user-facing / span data with ensure_ascii=False. PR #3894 ("Python: Fix non-ascii chars in span attributes") fixed the same class of issue in observability — this is the matching fix the Bedrock client was missing.

Change

One line: add ensure_ascii=False to the json.dumps for the Bedrock json content block, plus a regression test.

Before / After

Item Before After
json block {"greeting": "你好世界"} → text {"greeting": "你好世界"} {"greeting": "你好世界"}
Emoji "🎉" in json block "🎉" "🎉"
Text content blocks (block.get("text")) ✅ unchanged ✅ unchanged
Public API / method signatures / arguments ✅ unchanged ✅ unchanged
Output remains valid JSON (round-trips via json.loads)

Test (red → green)

New test test_process_converse_response_preserves_non_ascii_in_json_block uses the existing key-free _StubBedrockRuntime pattern.

Against the unpatched source (bug present):

>       assert "你好世界" in text
E       assert '你好世界' in '{"greeting": "\\u4f60\\u597d\\u4e16\\u754c", "emoji": "\\ud83c\\udf89"}'
FAILED packages/bedrock/tests/test_bedrock_client.py::test_process_converse_response_preserves_non_ascii_in_json_block
======================== 1 failed, 2 warnings in 0.29s =========================

With the fix:

packages/bedrock/tests/test_bedrock_client.py .                          [100%]
======================== 1 passed, 2 warnings in 1.88s =========================

Full bedrock suite (no regressions) + lint/format clean:

$ uv run --package agent-framework-bedrock pytest packages/bedrock/tests/ -m "not integration" -q
..................................                                       [100%]
$ uv run ruff format --check packages/bedrock/...   # 2 files already formatted
$ uv run ruff check packages/bedrock/...            # All checks passed!

This contribution was prepared with the help of an AI agent (Claude Code); a human reviewed the change, rationale, and test results before submitting.

The Bedrock Converse `json` content block was serialized with
`json.dumps(json_value)`, whose default `ensure_ascii=True` escapes
CJK/emoji/accented characters to `\uXXXX` and surfaces garbled text.
Add `ensure_ascii=False` to match the sibling OpenAI client and the
16+ other call sites across the repo. Includes a regression test.

Closes microsoft#6627

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 19, 2026 15:14
@moonbox3 moonbox3 added the python Issues related to the Python codebase label Jun 19, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Bedrock Converse response parsing edge case where structured json content blocks were being serialized with json.dumps(..., ensure_ascii=True) (default), causing non‑ASCII characters to be escaped into \uXXXX sequences in user-visible text.

Changes:

  • Update Bedrock json content block serialization to use ensure_ascii=False so non‑ASCII characters are preserved.
  • Add a regression test ensuring CJK text and emoji remain unescaped and the emitted text still round-trips via json.loads.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
python/packages/bedrock/agent_framework_bedrock/_chat_client.py Preserves non‑ASCII characters when converting Bedrock json content blocks into text.
python/packages/bedrock/tests/test_bedrock_client.py Adds a unit test verifying non‑ASCII preservation and preventing regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Issues related to the Python codebase

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: Bedrock JSON content blocks escape non-ASCII characters to \uXXXX

3 participants