Commit 8d3e96c

maketables: avoid misleading values in case flipping table (#313)
The tables generated by pcre2_maketables() include one that maps each lowercase character in the first 256 code points (0 to 255) to its corresponding uppercase code point, but the code fails to notice that toupper() could return a larger code point, which results in a truncated and unrelated value being stored instead. Restrict all values to what is valid for uint8_t, and document in the test case the failure for character 'μ'[1] (U+00B5), for which toupper() in the macOS fr_FR locale incorrectly returned 924 (U+039C), resulting in an incorrect case equivalent with the truncated value of 156 (0x9C).

[1] https://en.wikipedia.org/wiki/Mu_(letter)
1 parent d11400f commit 8d3e96c
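
The truncation described in the commit message is ordinary modular arithmetic: converting an int to uint8_t keeps only the low 8 bits. A minimal sketch of that step, assuming a hypothetical helper named store_unchecked (it is not part of PCRE2 and exists only for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of PCRE2: stores an int into a byte-sized
   table entry with no range check, as the original loop effectively did. */
uint8_t store_unchecked(int c)
{
  assert(c >= 0);        /* this sketch only handles non-negative code points */
  return (uint8_t)c;     /* the conversion keeps c modulo 256 (the low 8 bits) */
}
```

Since 924 = 3 * 256 + 156, store_unchecked(924) yields 156 (0x9C), which is the byte the new test case expects not to match.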

5 files changed, +31 -3 lines changed

src/pcre2_maketables.c

+5 -3
@@ -52,8 +52,6 @@ PCRE2_DFTABLES is defined. */
 # include "pcre2_internal.h"
 #endif
 
-
-
 /*************************************************
 * Create PCRE2 character tables *
 *************************************************/
@@ -98,7 +96,11 @@ for (i = 0; i < 256; i++) *p++ = tolower(i);
 
 /* Next the case-flipping table */
 
-for (i = 0; i < 256; i++) *p++ = islower(i)? toupper(i) : tolower(i);
+for (i = 0; i < 256; i++)
+  {
+  int c = islower(i)? toupper(i) : tolower(i);
+  *p++ = (c < 256)? c : i;
+  }
 
 /* Then the character class tables. Don't try to be clever and save effort on
 exclusive ones - in some locales things may be different.
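
The guarded loop above can be tried outside PCRE2 with a simulated locale. The sketch below is an assumption-laden illustration, not the library's code: sim_islower() and sim_toupper() are hypothetical stand-ins that mimic the reported macOS fr_FR behaviour for 0xB5 only and defer to the C library otherwise, and build_flip_table() is a hypothetical name for a standalone version of the loop.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a locale in which 0xB5 is lowercase and
   toupper(0xB5) returns U+039C (924), as reported for macOS fr_FR. */
int sim_islower(int c) { return c == 0xB5 || islower(c); }
int sim_toupper(int c) { return (c == 0xB5) ? 0x39C : toupper(c); }

/* Standalone version of the guarded case-flipping loop: a value that does
   not fit in a byte is replaced by the character itself. */
void build_flip_table(uint8_t *table)
{
  assert(table != NULL);
  for (int i = 0; i < 256; i++)
    {
    int c = sim_islower(i) ? sim_toupper(i) : tolower(i);
    table[i] = (uint8_t)((c < 256) ? c : i);
    }
}
```

With the guard in place, the entry for 0xB5 maps to itself rather than to 0x9C, while ASCII letters still flip case as before.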

testdata/testinput3

+5
@@ -75,6 +75,11 @@
 \= Expect no match
 École
 
+/\xb5/i
+
+\= Expect no match
+\x9c
+
 /\W+/
 >>>\xaa<<<
 >>>\xba<<<

testdata/testoutput3

+7
@@ -108,6 +108,13 @@ Subject length lower bound = 1
 École
 No match
 
+/\xb5/i
+
+0: µ
+\= Expect no match
+\x9c
+No match
+
 /\W+/
 >>>\xaa<<<
 0: >>>

testdata/testoutput3A

+7
@@ -108,6 +108,13 @@ Subject length lower bound = 1
 École
 No match
 
+/\xb5/i
+
+0: µ
+\= Expect no match
+\x9c
+No match
+
 /\W+/
 >>>\xaa<<<
 0: >>>

testdata/testoutput3B

+7
@@ -108,6 +108,13 @@ Subject length lower bound = 1
 École
 No match
 
+/\xb5/i
+
+0: µ
+\= Expect no match
+\x9c
+No match
+
 /\W+/
 >>>\xaa<<<
 0: >>>
