GH-133711: Enable UTF-8 mode by default (PEP 686) #133712
Changes from all commits
fc19177
c706cfa
f095ab0
ff066b1
abfb864
f40b986
3226e1d
b83c966
b56e01b
3da6bf2
9b3ebf7
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -75,11 +75,35 @@ New features | |
Other language changes | ||
====================== | ||
|
||
* Python now uses UTF-8_ as the default encoding, independent of the system's | ||
Review comment: You might mention the UTF-8 Mode earlier, since it has other side effects documented in the UTF-8 Mode section, such as changing the sys.stdout error handler and ignoring the locale encoding.
Review comment: "default encoding" is UTF-8 since Python 3.0. (e.g. |
||
environment. This means that I/O operations without an explicit encoding, | ||
e.g. ``open('flying-circus.txt')``, will use UTF-8. | ||
UTF-8 is a widely-supported Unicode_ character encoding that has become a | ||
*de facto* standard for representing text, including nearly every webpage | ||
on the internet, many common file formats, programming languages, and more. | ||
|
||
This only applies when no ``encoding`` argument is given. For best | ||
compatibility between versions of Python, ensure that an explicit ``encoding`` | ||
argument is always provided. The :ref:`opt-in encoding warning <io-encoding-warning>` | ||
can be used to identify code that may be affected by this change. | ||
The special ``encoding='locale'`` argument uses the current locale | |
encoding, and has been supported since Python 3.10. | ||
|
||
To retain the previous behaviour, Python's UTF-8 mode may be disabled with | ||
the :envvar:`PYTHONUTF8=0 <PYTHONUTF8>` environment variable or the | ||
:option:`-X utf8=0 <-X>` command line option. | ||
|
||
.. seealso:: :pep:`686` for further details. | ||
|
||
.. _UTF-8: https://en.wikipedia.org/wiki/UTF-8 | ||
.. _Unicode: https://home.unicode.org/ | ||
|
||
(Contributed by Adam Turner in :gh:`133711`; PEP 686 written by Inada Naoki.) | ||
|
||
* Several error messages incorrectly using the term "argument" have been corrected. | ||
(Contributed by Stan Ulbrych in :gh:`133382`.) | ||
|
||
|
||
|
||
New modules | ||
=========== | ||
|
||
|
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -102,15 +102,14 @@ typedef struct PyPreConfig { | |||||||||
|
||||||||||
/* Enable UTF-8 mode? (PEP 540) | ||||||||||
|
||||||||||
Disabled by default (equals to 0). | ||||||||||
If equal to 1, use the UTF-8 encoding and use "surrogateescape" for the | ||||||||||
stdin & stdout error handlers. | ||||||||||
|
||||||||||
Set to 1 by "-X utf8" and "-X utf8=1" command line options. | ||||||||||
Set to 1 by PYTHONUTF8=1 environment variable. | ||||||||||
Enabled by default (equal to 1; PEP 686), or if Py_UTF8Mode=1, | ||||||||||
or if "-X utf8=1" or PYTHONUTF8=1. | ||||||||||
Comment on lines +108 to +109:
Suggested change
|
||||||||||
|
||||||||||
Set to 0 by "-X utf8=0" and PYTHONUTF8=0. | ||||||||||
|
||||||||||
If equals to -1, it is set to 1 if the LC_CTYPE locale is "C" or | ||||||||||
"POSIX", otherwise it is set to 0. Inherit Py_UTF8Mode value value. */ | ||||||||||
Set to 0 by "-X utf8=0" or PYTHONUTF8=0. | ||||||||||
*/ | ||||||||||
int utf8_mode; | ||||||||||
|
||||||||||
/* If non-zero, enable the Python Development Mode. | ||||||||||
|
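The effect of the ``utf8_mode`` setting described in the comment above can be observed from Python through ``sys.flags.utf8_mode``. The sketch below is an illustration under stated assumptions rather than part of the PR: the helper name `child_utf8_mode` is hypothetical, and it removes ``PYTHONUTF8`` from the child environment so that only the ``-X`` option is in play.

```python
import os
import subprocess
import sys


def child_utf8_mode(*interpreter_args):
    """Start a child interpreter and report its sys.flags.utf8_mode.

    Hypothetical helper for illustration only.
    """
    # Drop PYTHONUTF8 so the result reflects the -X option (or the
    # interpreter's own default) rather than the caller's environment.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUTF8"}
    result = subprocess.run(
        [sys.executable, *interpreter_args,
         "-c", "import sys; print(sys.flags.utf8_mode)"],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return int(result.stdout.strip())


disabled = child_utf8_mode("-X", "utf8=0")  # explicitly disabled
enabled = child_utf8_mode("-X", "utf8=1")   # explicitly enabled
default = child_utf8_mode()                 # depends on the Python version
```

With this change applied, the last call would be expected to report UTF-8 mode as enabled; on earlier versions the default depends on the locale.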
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Implement :pep:`686`: Enable :ref:`Python UTF-8 Mode <utf8-mode>` by | ||
default. Patch by Adam Turner. |