This repository was archived by the owner on Nov 5, 2022. It is now read-only.

Description
UnicodeReader can't actually detect UTF-32LE encodings. There's a big chain of if/else if/... blocks in the constructor that examine the first few bytes from an input stream. The blocks for detecting UTF-16LE and UTF-32LE are:
/* ... */
else if ((bom[0] == (byte) 0xFF) && (bom[1] == (byte) 0xFE)) {
    encoding = "UTF-16LE";
    unread = n - 2;
}
/* ...code for UTF-32BE ... */
else if ((bom[0] == (byte) 0xFF) && (bom[1] == (byte) 0xFE)
        && (bom[2] == (byte) 0x00) && (bom[3] == (byte) 0x00)) {
    encoding = "UTF-32LE";
    unread = n - 4;
} else /* ... */
The condition for the UTF-32LE case:
(bom[0] == (byte) 0xFF) && (bom[1] == (byte) 0xFE)
&& (bom[2] == (byte) 0x00) && (bom[3] == (byte) 0x00)
can't be true unless the earlier case for UTF-16LE was also true:
(bom[0] == (byte) 0xFF) && (bom[1] == (byte) 0xFE)
So any input that starts with a UTF-32LE BOM is detected as UTF-16LE, and the UTF-32LE branch is unreachable.
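One possible fix is to test the four-byte BOMs before the two-byte ones, so the longer match wins. The sketch below is not the UnicodeReader code: the class `BomSniffer` and the method `detect` are hypothetical names used only for illustration, and the real constructor also tracks `unread`, which is left out here.

```java
// Hypothetical helper, not part of UnicodeReader: shows one ordering of the
// BOM checks in which UTF-32LE stays reachable.
final class BomSniffer {

    /**
     * Returns the encoding implied by a BOM in the first n bytes of bom,
     * or null when no known BOM is present.
     */
    static String detect(byte[] bom, int n) {
        // Four-byte BOMs go first: the UTF-32LE BOM (FF FE 00 00) begins
        // with the UTF-16LE BOM (FF FE), so the longer pattern must win.
        if (n >= 4 && bom[0] == (byte) 0x00 && bom[1] == (byte) 0x00
                && bom[2] == (byte) 0xFE && bom[3] == (byte) 0xFF) {
            return "UTF-32BE";
        }
        if (n >= 4 && bom[0] == (byte) 0xFF && bom[1] == (byte) 0xFE
                && bom[2] == (byte) 0x00 && bom[3] == (byte) 0x00) {
            return "UTF-32LE";
        }
        if (n >= 3 && bom[0] == (byte) 0xEF && bom[1] == (byte) 0xBB
                && bom[2] == (byte) 0xBF) {
            return "UTF-8";
        }
        if (n >= 2 && bom[0] == (byte) 0xFE && bom[1] == (byte) 0xFF) {
            return "UTF-16BE";
        }
        if (n >= 2 && bom[0] == (byte) 0xFF && bom[1] == (byte) 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf32le = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00};
        System.out.println(detect(utf32le, utf32le.length));
    }
}
```

With this ordering, a UTF-16LE stream whose first character after the BOM is U+0000 would be reported as UTF-32LE. That ambiguity comes from the BOM byte patterns themselves, and the sketch accepts it as a trade-off.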