Commit a67ca97

Explain why particular conversions of characters are performed
Because the reasons are subtle and non-obvious.
1 parent a4ba88b commit a67ca97

1 file changed (+7 -0 lines changed)

src/util/string_utils.cpp

Lines changed: 7 additions & 0 deletions
@@ -156,6 +156,13 @@ std::string escape_non_alnum(const std::string &to_escape)
   std::ostringstream escaped;
   for(auto &ch : to_escape)
   {
+    // `ch` may have a negative value in the case of utf-8 encodings of
+    // characters above unicode code point 127. The following line maps these
+    // negative values to positive values in the 128-255 range, using a
+    // `static_cast`. This is necessary in order to avoid undefined behaviour
+    // in `isalnum`. The positive values are then stored in an integer using a
+    // widening initialisation so that the stream insertion operator prints them
+    // as numbers rather than characters.
     const int uch{static_cast<unsigned char>(ch)};
     if(ch == '_')
       escaped << "__";
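
The reasoning in the added comment can be illustrated in isolation. The following is a minimal sketch, not code from this commit: the helper names `to_uch`, `print_as_number` and `is_alnum_safe` are hypothetical, and only the `uch` initialisation is taken from the diff. It assumes an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper mirroring the `uch` initialisation in the diff.
// Where `char` is signed, a UTF-8 byte such as 0xC3 is held as a negative
// value; the cast maps it into the 0-255 range before it is widened to int.
int to_uch(char ch)
{
  const int uch{static_cast<unsigned char>(ch)};
  assert(uch >= 0 && uch <= 255);
  return uch;
}

// Hypothetical helper: inserting an int into the stream prints decimal
// digits, whereas inserting a char would print the character itself.
std::string print_as_number(char ch)
{
  std::ostringstream out;
  out << to_uch(ch);
  return out.str();
}

// Hypothetical helper: std::isalnum requires an argument that is either EOF
// or representable as unsigned char, so the converted value is passed
// rather than the raw (possibly negative) char.
bool is_alnum_safe(char ch)
{
  return std::isalnum(to_uch(ch)) != 0;
}
```

Under these assumptions, `to_uch('\xC3')` yields 195 whether `char` is signed or unsigned on the platform, and `print_as_number('A')` yields the string "65" rather than "A".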
