@@ -124,7 +124,7 @@ can cause problems in some situations (for example, when using the GNU readline
module [16_]).

On non-Apple and non-Android \*nix systems, these operations are handled using
- the C locale system in glibc, which has the following characteristics [4_]:
+ the C locale system in glibc, which has the following characteristics [4]_:

* by default, all processes start in the ``C`` locale, which uses ``ASCII``
for these conversions. This is almost never what anyone doing multilingual
@@ -136,7 +136,7 @@ the C locale system in glibc, which has the following characteristics [4_]:

The specific locale category that covers the APIs that CPython depends on is
``LC_CTYPE``, which applies to "classification and conversion of characters,
- and to multibyte and wide characters" [5_]. Accordingly, CPython includes the
+ and to multibyte and wide characters" [5]_. Accordingly, CPython includes the
following key calls to ``setlocale``:

* in the main ``python`` binary, CPython calls ``setlocale(LC_ALL, "")`` to
@@ -183,7 +183,7 @@ Mac OS X and other \*BSD systems have taken a different approach: instead of
offering a ``C.UTF-8`` locale, they offer a partial ``UTF-8`` locale that only
defines the ``LC_CTYPE`` category. On such systems, the preferred
environmental locale adjustment is to set ``LC_CTYPE=UTF-8`` rather than to set
- ``LC_ALL`` or ``LANG``. [17_]
+ ``LC_ALL`` or ``LANG``. [17]_

In the specific case of Docker containers and similar technologies, the
appropriate locale setting can be specified directly in the container image
@@ -247,8 +247,8 @@ Motivation
While Linux container technologies like Docker, Kubernetes, and OpenShift are
best known for their use in web service development, the related container
formats and execution models are also being adopted for Linux command line
- application development. Technologies like Gnome Flatpak [7_] and
- Ubuntu Snappy [8_] further aim to bring these same techniques to Linux GUI
+ application development. Technologies like Gnome Flatpak [7]_ and
+ Ubuntu Snappy [8]_ further aim to bring these same techniques to Linux GUI
application development.

When using Python 3 for application development in these contexts, it isn't
@@ -327,7 +327,7 @@ with this problem automatically rather than relying on redistributors or end
users to handle it through system configuration changes.

While the glibc developers are working towards making the C.UTF-8 locale
- universally available for use by glibc based applications like CPython [6_],
+ universally available for use by glibc based applications like CPython [6]_,
this unfortunately doesn't help on platforms that ship older versions of glibc
without that feature, and also don't provide C.UTF-8 (or an equivalent) as an
on-disk locale the way Debian and Fedora do. These platforms are considered
@@ -649,7 +649,7 @@ Defaulting to "surrogateescape" error handling on the standard IO streams
By coercing the locale away from the legacy C default and its assumption of
ASCII as the preferred text encoding, this PEP also disables the implicit use
of the "surrogateescape" error handler on the standard IO streams that was
- introduced in Python 3.5 ([15_]), as well as the automatic use of
+ introduced in Python 3.5 ([15]_), as well as the automatic use of
``surrogateescape`` when operating in :pep:`540`'s proposed UTF-8 mode.

Rather than introducing yet another configuration option to adjust that
@@ -662,7 +662,7 @@ provided text values are typically able to be transparently passed through a
Python 3 application even if it is incorrect in assuming that that text has
been encoded as UTF-8.

- In particular, GB 18030 [12_] is a Chinese national text encoding standard
+ In particular, GB 18030 [12]_ is a Chinese national text encoding standard
that handles all Unicode code points, that is formally incompatible with both
ASCII and UTF-8, but will nevertheless often tolerate processing as surrogate
escaped data - the points where GB 18030 reuses ASCII byte values in an
@@ -672,7 +672,7 @@ the relevant ASCII code points. Operations that don't involve splitting on or
searching for particular ASCII or Unicode code point values are almost
certain to work correctly.

- Similarly, Shift-JIS [13_] and ISO-2022-JP [14_] remain in widespread use in
+ Similarly, Shift-JIS [13]_ and ISO-2022-JP [14]_ remain in widespread use in
Japan, and are incompatible with both ASCII and UTF-8, but will tolerate text
processing operations that don't involve splitting on or searching for
particular ASCII or Unicode code point values.
@@ -908,7 +908,7 @@ This was later removed on the grounds that setting only ``LC_CTYPE`` is
sufficient to handle all of the problematic scenarios that the PEP aimed
to resolve, while setting ``LANG`` as well would break cases where ``LANG``
was set correctly, and the locale problems were solely due to an incorrect
- ``LC_CTYPE`` setting ([22_]).
+ ``LC_CTYPE`` setting ([22]_).

For example, consider a Python application that called the Linux ``date``
utility in a subprocess rather than doing its own date formatting::
@@ -1077,7 +1077,7 @@ be entirely redundant.
However, that assumption turned out to be incorrect, as subsequent
investigations showed that if you explicitly configure ``LANG=C`` on
these platforms, extension modules like GNU readline will misbehave in much the
- same way as they do on other \*nix systems. [21_]
+ same way as they do on other \*nix systems. [21]_

In addition, Mac OS X is also frequently used as a development and testing
platform for Python software intended for deployment to other \*nix environments
@@ -1093,12 +1093,12 @@ Implementation
==============

The reference implementation is being developed in the
- ``pep538-coerce-c-locale`` feature branch [18_] in Nick Coghlan's fork of the
- CPython repository on GitHub. A work-in-progress PR is available at [20_].
+ ``pep538-coerce-c-locale`` feature branch [18]_ in Nick Coghlan's fork of the
+ CPython repository on GitHub. A work-in-progress PR is available at [20]_.

This reference implementation covers not only the enhancement request in
- issue 28180 [1_], but also the Android compatibility fixes needed to resolve
- issue 28997 [16_].
+ issue 28180 [1]_, but also the Android compatibility fixes needed to resolve
+ issue 28997 [16]_.


Backporting to earlier Python 3 releases
@@ -1115,7 +1115,7 @@ default, or else specifically for platforms where such a locale is already
consistently available.

At least the Fedora project is planning to pursue this approach for the
- upcoming Fedora 26 release [19_].
+ upcoming Fedora 26 release [19]_.


Backporting to other 3.x releases
@@ -1139,7 +1139,7 @@ Acknowledgements

The locale coercion approach proposed in this PEP is inspired directly by
Armin Ronacher's handling of this problem in the ``click`` command line
- utility development framework [2_]::
+ utility development framework [2]_::

$ LANG=C python3 -c 'import click; cli = click.command()(lambda:None); cli()'
Traceback (most recent call last):
@@ -1157,18 +1157,18 @@ utility development framework [2_]::
export LANG=C.UTF-8

The change was originally proposed as a downstream patch for Fedora's
- system Python 3.6 package [3_], and then reformulated as a PEP for Python 3.7
+ system Python 3.6 package [3]_, and then reformulated as a PEP for Python 3.7
with a section allowing for backports to earlier versions by redistributors.
In parallel with the development of the upstream patch, Charalampos Stratakis
has been working on the Fedora 26 backport and providing feedback on the
practical viability of the proposed changes.

- The initial draft was posted to the Python Linux SIG for discussion [10_] and
+ The initial draft was posted to the Python Linux SIG for discussion [10]_ and
then amended based on both that discussion and Victor Stinner's work in
- :pep:`540` [11_].
+ :pep:`540` [11]_.

The "ℙƴ☂ℌøἤ" string used in the Unicode handling examples throughout this PEP
- is taken from Ned Batchelder's excellent "Pragmatic Unicode" presentation [9_].
+ is taken from Ned Batchelder's excellent "Pragmatic Unicode" presentation [9]_.

Stephen Turnbull has long provided valuable insight into the text encoding
handling challenges he regularly encounters at the University of Tsukuba
@@ -1179,16 +1179,16 @@ References
==========

.. [1] CPython: sys.getfilesystemencoding() should default to utf-8
- (http://bugs.python.org/issue28180)
+ (https://bugs.python.org/issue28180)

.. [2] Locale configuration required for click applications under Python 3
- (http://click.pocoo.org/5/python3/#python-3-surrogate-handling)
+ (https://click.palletsprojects.com/en/5.x/python3/#python-3-surrogate-handling)

.. [3] Fedora: force C.UTF-8 when Python 3 is run under the C locale
(https://bugzilla.redhat.com/show_bug.cgi?id=1404918)

.. [4] GNU C: How Programs Set the Locale
- ( https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html)
+ (https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html)

.. [5] GNU C: Locale Categories
(https://www.gnu.org/software/libc/manual/html_node/Locale-Categories.html)
@@ -1197,13 +1197,13 @@ References
(https://sourceware.org/glibc/wiki/Proposals/C.UTF-8)

.. [7] GNOME Flatpak
- (http://flatpak.org/)
+ (https://flatpak.org/)

.. [8] Ubuntu Snappy
(https://www.ubuntu.com/desktop/snappy)

.. [9] Pragmatic Unicode
- (http://nedbatchelder.com/text/unipain.html)
+ (https://nedbatchelder.com/text/unipain.html)

.. [10] linux-sig discussion of initial PEP draft
(https://mail.python.org/pipermail/linux-sig/2017-January/000014.html)
@@ -1224,10 +1224,10 @@ References
(https://bugs.python.org/issue19977)

.. [16] test_readline.test_nonascii fails on Android
- (http://bugs.python.org/issue28997)
+ (https://bugs.python.org/issue28997)

.. [17] UTF-8 locale discussion on "locale.getdefaultlocale() fails on Mac OS X with default language set to English"
- (http://bugs.python.org/issue18378#msg215215)
+ (https://bugs.python.org/issue18378#msg215215)

.. [18] GitHub branch diff for ``ncoghlan:pep538-coerce-c-locale``
(https://github.com/python/cpython/compare/master...ncoghlan:pep538-coerce-c-locale)
@@ -1250,13 +1250,3 @@ Copyright

This document has been placed in the public domain under the terms of the
CC0 1.0 license: https://creativecommons.org/publicdomain/zero/1.0/
-
-
- ..
- Local Variables:
- mode: indented-text
- indent-tabs-mode: nil
- sentence-end-double-space: t
- fill-column: 70
- coding: utf-8
- End:
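
Most hunks in this diff make the same mechanical change: the older ``[N_]`` reference form becomes the reStructuredText footnote reference ``[N]_``. A minimal sketch of that rewrite as a regular-expression substitution follows. The names ``OLD_STYLE_REF`` and ``fix_footnote_refs`` are hypothetical and chosen for illustration; nothing in the diff indicates how the change was actually produced.

```python
import re

# Matches the older reference form seen on the removed lines, e.g. "[4_]",
# and captures the numeric footnote label. (Hypothetical helper name.)
OLD_STYLE_REF = re.compile(r"\[(\d+)_\]")


def fix_footnote_refs(text: str) -> str:
    """Rewrite every "[N_]" in text as the footnote reference "[N]_"."""
    return OLD_STYLE_REF.sub(r"[\1]_", text)


# Example taken from the first hunk of the diff.
before = "which has the following characteristics [4_]:"
print(fix_footnote_refs(before))
```

Text that already uses the ``[N]_`` form does not match the pattern, so running the substitution a second time leaves it unchanged.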