Skip to content

[issue 768]FromXmlParser lacks extension point for custom XmlTokenStream #769

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 16, 2025

Conversation

xzxiaoshan
Copy link
Contributor

@xzxiaoshan xzxiaoshan commented Jul 15, 2025

#768

This PR adds an extension point to FromXmlParser to allow customization of the XmlTokenStream implementation. No logic changes are made — it’s purely an additive change to support extensibility.

If this can be merged and released in a new version soon, it would be greatly appreciated as it's needed for our project.

Thank you!


1、By abstracting the createParser method, developers no longer need to override four _createParser methods in XmlFactory when implementing their own custom FromXmlParser.
2、Abstract a createGenerator method to make it easier to extend.

This change is purely structural and does not involve any modification to the underlying logic of the code.

@xzxiaoshan
Copy link
Contributor Author

@cowtowncoder Hello, could you please take care of this pull request (PR)

@pjfanning
Copy link
Member

@xzxiaoshan could you add a test case or 2 to demo and test the new constructor? - it would help to show why it is needed

_xmlTokens = new XmlTokenStream(xmlReader, ctxt.contentReference(),
_formatFeatures, tagProcessor);

if (xmlTokenStream == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use JDK Objects.requireNonNull() like

_xmlTokens = Objects.requireNonNull(xmlTokenStream, "xmlTokenStream cannot be null");

@cowtowncoder
Copy link
Member

Seems reasonable to me, happy to merge.

One process thing @xzxiaoshan: unless you have sent one earlier (if you have just LMK), we'd need CLA before merging the first contribution. It's from here:

https://github.com/FasterXML/jackson/blob/master/contributor-agreement.pdf

and the usual way is to print, fill & sign, scan/photo, email to cla at fasterxml dot com.
Once I have that I can proceed with merge.

@xzxiaoshan
Copy link
Contributor Author

@cowtowncoder

Contribution Agreement (CLA) – Signed and Attached

and

Since has been added and the code has been adjusted.

@cowtowncoder cowtowncoder added the cla-needed CLA is required (but not yet received) for PR label Jul 16, 2025
…override four _createParser methods in XmlFactory when implementing their own custom FromXmlParser.

This change is purely structural and does not involve any modification to the underlying logic of the code.
This change is purely structural and does not involve any modification to the underlying code logic.
@xzxiaoshan
Copy link
Contributor Author

@cowtowncoder
Hi, Hello, I've added some related changes. There's still no change to the core logic — it's just an enhancement to improve extensibility.

@xzxiaoshan
Copy link
Contributor Author

xzxiaoshan commented Jul 16, 2025

@xzxiaoshan could you add a test case or 2 to demo and test the new constructor? - it would help to show why it is needed

I tried to write test cases, but I found it difficult to construct the code for them. So I’ll explain the scenario here and leave a note about the use case in the pull request.


The XmlTokenStream is a class that handles XML parsing. My use case involves extending the _decodeAttributeName method to add a specific prefix — the @ symbol — to attribute names, so that they are stored as keys in the resulting map with this prefix. This allows a clear distinction between XML attributes and child node elements in the parsed data.

However, in the current framework, the instantiation of new XmlTokenStream is fixed within the constructor of FromXmlParser, making it difficult for developers to customize or extend XML parsing behaviors.

Here’s a code snippet from my use case:

public class MyFromXmlParser extends FromXmlParser {

    public MyFromXmlParser (IOContext ctxt, int genericParserFeatures, int xmlFeatures, ObjectCodec codec,
                              XMLStreamReader xmlReader, XmlNameProcessor tagProcessor) throws IOException {
        super(ctxt, genericParserFeatures, codec, new MyXmlTokenStream(xmlReader, ctxt.contentReference(), xmlFeatures, tagProcessor));
    }
}
// MyXmlTokenStream.java

@Override
protected void _decodeAttributeName(String namespaceURI, String localName) {
    localName = "@".concat(localName);
    super._decodeAttributeName(namespaceURI, localName);
}

With this override, I ensure that all XML attributes are marked with the @ prefix during parsing, which helps distinguish them from regular child elements in the resulting JSON structure.

My original intention was to achieve a seamless bidirectional conversion between XML, Map, and JSON, without being troubled by ambiguities caused by XML attributes.

@pjfanning

Moreover, this is not the only issue — as long as developers need to perform custom processing on XmlTokenStream, they are unable to extend or customize their own version of XmlTokenStream within the original code structure.

@cowtowncoder cowtowncoder added cla-received PR already covered by CLA (optional label) 2.20 and removed cla-needed CLA is required (but not yet received) for PR labels Jul 16, 2025
@cowtowncoder cowtowncoder merged commit 022bbc0 into FasterXML:2.x Jul 16, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.20 cla-received PR already covered by CLA (optional label)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants