Handle multi-byte decode errors in as_string/as_bytes by krrishapatel · Pull Request #1721 · Supervisor/supervisor

krrishapatel · 2026-06-29T05:22:33Z

Summary

When tail -f reads an initial chunk from a log file, the byte boundary can split a multi-byte UTF-8 character (e.g. Korean, Chinese, Japanese). This causes as_string() to either raise UnicodeDecodeError or corrupt the entire chunk.

Adding errors='replace' to the encode()/decode() calls in compat.py ensures that only the truncated character at the boundary shows as �, while the rest of the text decodes correctly.
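To illustrate the failure mode described above, here is a minimal standalone sketch (Python 3 only). It does not use supervisor's code; the sample text and the one-byte offset are assumptions chosen to simulate a tail read that starts in the middle of a character.

```python
# Sketch only: simulates a tail read that begins mid-character.
# The sample text and the cut position are illustrative assumptions.
text = "로그 메시지 ok"
data = text.encode("utf-8")

# Each Hangul syllable is 3 bytes in UTF-8, so skipping 1 byte
# leaves two stray continuation bytes at the start of the chunk.
chunk = data[1:]

# Strict decoding (the default) rejects the whole chunk.
try:
    chunk.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace", only the damaged bytes become U+FFFD;
# everything after the boundary decodes normally.
decoded = chunk.decode("utf-8", errors="replace")

print(strict_failed)
print(decoded)
```

The stray bytes at the boundary come out as the replacement character, and the rest of the chunk stays readable.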

Fixes #1693

Changes

supervisor/compat.py: Add errors='replace' to all encode/decode calls (both Python 2 and 3 branches)
supervisor/tests/test_compat.py: New test file covering valid UTF-8, Korean text, and incomplete byte sequences
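The three test scenarios listed above could look roughly like the following sketch. This is not the PR's actual test file: `as_string_sketch` and `as_bytes_sketch` are hypothetical stand-ins that only approximate the Python 3 branch of compat.py with errors='replace' applied, and the test names are made up for illustration.

```python
import unittest


def as_string_sketch(s, encoding="utf-8"):
    # Hypothetical stand-in: decode bytes leniently, pass str through.
    if isinstance(s, str):
        return s
    return s.decode(encoding, errors="replace")


def as_bytes_sketch(s, encoding="utf-8"):
    # Hypothetical stand-in: encode str leniently, pass bytes through.
    if isinstance(s, bytes):
        return s
    return s.encode(encoding, errors="replace")


class CompatSketchTests(unittest.TestCase):
    def test_valid_utf8(self):
        self.assertEqual(as_string_sketch(b"hello"), "hello")

    def test_korean_roundtrip(self):
        text = "안녕하세요"
        self.assertEqual(as_string_sketch(as_bytes_sketch(text)), text)

    def test_incomplete_sequence(self):
        # Drop the final byte of a 3-byte character to mimic a
        # chunk boundary that falls inside it.
        truncated = "끝".encode("utf-8")[:-1]
        result = as_string_sketch(b"ok " + truncated)
        self.assertTrue(result.startswith("ok "))
        self.assertIn("\ufffd", result)
```

The incomplete-sequence test checks only that a replacement character appears and that the intact prefix survives, since the exact number of replacement characters depends on the decoder.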

When tail -f reads an initial chunk from a log file, the byte boundary can split a multi-byte UTF-8 character (e.g. Korean). This causes the entire chunk to decode incorrectly. Add errors='replace' to encode/decode calls in compat.py so that incomplete byte sequences produce the Unicode replacement character instead of corrupting the rest of the output.

krrishapatel wants to merge 1 commit into Supervisor:main from krrishapatel:fix/multibyte-decode-error-handling
