Expected Behavior
An XML document containing encoded Unicode characters from the Supplementary Private Use Area-B (such as U+10FC0D) should be deserialized by XStream without any issues affecting these characters or any other valid characters, regardless of their location in the document.
Actual Behavior
XStream erroneously appends a replacement character (�) after the ampersand during deserialization if the XML document contains a character from the Unicode Supplementary Private Use Area-B anywhere before the ampersand.
Steps to reproduce
The &#x10FC0D; encoded character (􏰍, U+10FC0D, UTF-8 hex: F4 8F B0 8D) must be present somewhere in the XML document before the encoded ampersand (&amp;).
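The code point and byte values quoted above can be cross-checked with the JDK alone. The following is a minimal sketch; the class name `CodePointDemo` and its helper methods are hypothetical names chosen for illustration and are not part of XStream or of the original report.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class CodePointDemo {

    /** Returns the UTF-16 code units of a code point as space-separated hex. */
    static String utf16Hex(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    /** Returns the UTF-8 bytes of a code point as space-separated hex. */
    static String utf8Hex(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int cp = 0x10FC0D;
        System.out.println(utf16Hex(cp)); // DBFF DC0D (a surrogate pair)
        System.out.println(utf8Hex(cp));  // F4 8F B0 8D
        System.out.println(cp);           // 1113101 (decimal form of the code point)
    }
}
```

The surrogate pair DBFF DC0D matches the Java escape used in the test's expected string, which confirms that the assertion and the XML input refer to the same character.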
Simple code example:
RootTag class:
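import com.thoughtworks.xstream.annotations.XStreamAlias;

@XStreamAlias("rootTag")
public class RootTag {
    // Maps the <text> child element to a TextTag instance.
    @XStreamAlias("text")
    private TextTag text;

    public TextTag getText() {
        return text;
    }
}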
TextTag class:
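import com.thoughtworks.xstream.annotations.XStreamAlias;
import com.thoughtworks.xstream.annotations.XStreamConverter;
import com.thoughtworks.xstream.converters.extended.ToAttributedValueConverter;

// The element's text content is stored in the "value" field.
@XStreamConverter(value = ToAttributedValueConverter.class, strings = {"value"})
@XStreamAlias("textTag")
public class TextTag {
    private String value;

    public String getValue() {
        return value;
    }
}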
Test class with the simple XML input:
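import static org.junit.jupiter.api.Assertions.assertEquals;

import com.thoughtworks.xstream.XStream;
import com.thoughtworks.xstream.security.ExplicitTypePermission;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import org.junit.jupiter.api.Test;

class XStreamTest {
    @Test
    void testXStreamFailsToParseAmpersandAfterSupplementaryCharacter() throws Exception {
        // &#x10FC0D; is the numeric character reference for U+10FC0D; &amp; is the encoded ampersand.
        String input = """
                <?xml version="1.0" encoding="UTF-8"?>
                <rootTag>
                  <text>Test: &amp; ampersand before, supplementary character &#x10FC0D;, ampersand &amp; after</text>
                </rootTag>""";
        XStream xStream = new XStream();
        xStream.processAnnotations(RootTag.class);
        xStream.addPermission(new ExplicitTypePermission(new Class[]{RootTag.class}));
        try (InputStream is = new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8))) {
            RootTag rootTag = (RootTag) xStream.fromXML(is);
            // \uDBFF\uDC0D is the UTF-16 surrogate pair for U+10FC0D.
            assertEquals("Test: & ampersand before, supplementary character \uDBFF\uDC0D, ampersand & after",
                    rootTag.getText().getValue());
        }
    }
}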
Output:
Expected :Test: & ampersand before, supplementary character 􏰍, ampersand & after
Actual :Test: & ampersand before, supplementary character 􏰍, ampersand &� after
NOTE:
This issue may be related to #336 (PrettyPrintWriter cannot write emoji in XML 1.1 mode).
XStream does not actually parse XML at all, but uses an XML parser instead. You can select the parser on your own by setting the appropriate driver. Please open an issue for the MXParser, which is used in your example as the default.
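To illustrate that the outcome depends on the underlying parser rather than on the input, here is a minimal sketch that feeds the same XML to the JDK's built-in JAXP DOM parser, without XStream. The class name `JdkParserCheck` and the method `parseText` are hypothetical; this only checks that the document is well-formed and that a standard parser returns the expected text, and it does not reproduce the MXParser behavior.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical cross-check class, for illustration only.
class JdkParserCheck {

    // Same content as the XML in the report, with both characters in encoded form.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<rootTag><text>Test: &amp; ampersand before, "
            + "supplementary character &#x10FC0D;, ampersand &amp; after</text></rootTag>";

    /** Parses the XML with the JDK's DOM parser and returns the text of the first <text> element. */
    static String parseText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("text").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String expected =
                "Test: & ampersand before, supplementary character \uDBFF\uDC0D, ampersand & after";
        // Prints true when the parser decodes both entities without extra characters.
        System.out.println(expected.equals(parseText(XML)));
    }
}
```

Within XStream, the analogous experiment would be to pass a different driver to the constructor, for example `new XStream(new DomDriver())` or `new XStream(new StaxDriver())`, as the reply above suggests. Whether a given driver avoids the problem is not confirmed in this thread.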