[C API] PyUnicodeWriter_WriteStr() and str subclasses #148250

@vstinner

The PyUnicodeWriter_WriteStr() API is documented as:

Call PyObject_Str() on obj and write the output into writer.

For exact str objects, this works as expected. For str subclasses, however, the result can be surprising when the subclass overrides __str__() to return a different string. Example with the enum module from issue gh-148241:

from enum import Enum

class AStringEnum(str, Enum):
    A = "a"

obj = AStringEnum.A
print(f"{str(obj)=!r}")
print(f"{str.__str__(obj)=!r}")

Output (on Python 3.15):

str(obj)='AStringEnum.A'
str.__str__(obj)='a'

str() and str.__str__() return different strings ('AStringEnum.A' vs 'a'). PyUnicodeWriter_WriteStr() writes 'AStringEnum.A' because it calls PyObject_Str(), the C equivalent of str().

One way to write 'a' instead is to first call PyUnicode_FromObject(), which converts the str subclass instance to an exact str object, and then call PyUnicodeWriter_WriteStr() on the result.
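A rough way to check this workaround without writing a C extension is to call the C functions from Python through ctypes.pythonapi. This is only a sketch: the helper name write_str() and its convert flag are made up for illustration, the PyUnicodeWriter functions are assumed to be exported only on Python 3.14 and newer (the helper returns None otherwise), and real C code would also need to release the temporary str and discard the writer on error.

```python
import ctypes
from enum import Enum


class AStringEnum(str, Enum):
    A = "a"


api = ctypes.pythonapi

# PyObject *PyUnicode_FromObject(PyObject *obj)
api.PyUnicode_FromObject.argtypes = [ctypes.py_object]
api.PyUnicode_FromObject.restype = ctypes.py_object


def write_str(obj, convert):
    """Hypothetical helper: write obj with PyUnicodeWriter_WriteStr().

    Returns None when the PyUnicodeWriter API is not available
    (assumed to be the case before Python 3.14).
    """
    if not hasattr(api, "PyUnicodeWriter_Create"):
        return None

    # PyUnicodeWriter *PyUnicodeWriter_Create(Py_ssize_t length)
    api.PyUnicodeWriter_Create.argtypes = [ctypes.c_ssize_t]
    api.PyUnicodeWriter_Create.restype = ctypes.c_void_p
    # int PyUnicodeWriter_WriteStr(PyUnicodeWriter *writer, PyObject *obj)
    api.PyUnicodeWriter_WriteStr.argtypes = [ctypes.c_void_p, ctypes.py_object]
    api.PyUnicodeWriter_WriteStr.restype = ctypes.c_int
    # PyObject *PyUnicodeWriter_Finish(PyUnicodeWriter *writer)
    api.PyUnicodeWriter_Finish.argtypes = [ctypes.c_void_p]
    api.PyUnicodeWriter_Finish.restype = ctypes.py_object

    if convert:
        # Convert the str subclass instance to an exact str first.
        obj = api.PyUnicode_FromObject(obj)

    writer = api.PyUnicodeWriter_Create(0)
    # ctypes.pythonapi raises the pending Python exception on failure;
    # this sketch does not discard the writer in that case.
    api.PyUnicodeWriter_WriteStr(writer, obj)
    return api.PyUnicodeWriter_Finish(writer)


obj = AStringEnum.A
converted = api.PyUnicode_FromObject(obj)
print(type(converted) is str, repr(converted))
print(repr(write_str(obj, convert=False)))
print(repr(write_str(obj, convert=True)))
```

The first print shows that PyUnicode_FromObject() returns an exact str. On a build that provides PyUnicodeWriter, the last two lines show what is written without and with the conversion.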

An alternative would be to provide a new PyUnicodeWriter function to write str subclasses, but I don't think that it's worth it. So I suppose that it's just a documentation issue (the doc should suggest using PyUnicode_FromObject()).
