MLE-18060 : Adding the new commons-csv library and fixing the CSVparser initialization #517
base: develop
Pull Request Overview
This PR integrates the newer Apache Commons CSV library and updates how CSVParser is initialized and used to correctly handle byte offsets and use the builder API.
- Switched from the deprecated `CSVParser` constructor to `CSVParser.builder()` and enabled byte tracking.
- Updated record position checks from `getCharacterByte()` to `getBytePosition()`.
- Added Commons CSV to the build and distribution configurations; bumped the Commons IO version.
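To illustrate why the switch to byte positions matters, here is a minimal sketch using only the Java standard library. It does not use Commons CSV or MLCP code; the class name `OffsetDemo` and its helper methods are hypothetical, and the input is assumed to be UTF-8 with `\n` line endings. It shows that the character offset and the byte offset of a record start diverge as soon as an earlier record contains a multi-byte character, which is the situation a byte-tracking parser is meant to handle.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of MLCP or Commons CSV.
public class OffsetDemo {

    /** Character offsets at which each non-empty line starts. */
    static List<Integer> charOffsets(String text) {
        List<Integer> out = new ArrayList<>();
        int pos = 0;
        for (String line : text.split("\n", -1)) {
            if (!line.isEmpty()) {
                out.add(pos);
            }
            pos += line.length() + 1; // +1 for the '\n' separator
        }
        return out;
    }

    /** UTF-8 byte offsets at which each non-empty line starts. */
    static List<Integer> byteOffsets(String text) {
        List<Integer> out = new ArrayList<>();
        int pos = 0;
        for (String line : text.split("\n", -1)) {
            if (!line.isEmpty()) {
                out.add(pos);
            }
            // Multi-byte characters make this larger than line.length()
            pos += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // \u00e9 (e with acute) and \u00eb (e with diaeresis) each take 2 bytes in UTF-8
        String csv = "id,name\n1,Jos\u00e9\n2,Zo\u00eb\n";
        System.out.println(charOffsets(csv)); // [0, 8, 15]
        System.out.println(byteOffsets(csv)); // [0, 8, 16]
    }
}
```

The third record starts at character 15 but at byte 16, so comparing a character-based position against a byte-based split boundary could misplace a record.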
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
src/main/java/com/marklogic/contentpump/SplitDelimitedTextReader.java | Replaced deprecated parser instantiation and position API; updated exception message |
src/main/java/com/marklogic/contentpump/DelimitedTextReader.java | Capitalized exception message literal for consistency |
src/assemble/bindist.xml | Included org.apache.commons:commons-csv in the distribution bundle |
pom.xml | Removed custom version property, added direct commons-csv dependency, bumped commons-io version |
Comments suppressed due to low confidence (2)
pom.xml:24
- [nitpick] Hardcoding the Commons CSV version later in the POM duplicates version information. It may be clearer to reintroduce a `<commonsCsvVersion>` property to centralize version management.
<!-- <commonsCsvVersion>1.5.2-marklogic</commonsCsvVersion> removed -->
src/main/java/com/marklogic/contentpump/SplitDelimitedTextReader.java:195
- New parser builder logic with `trackBytes(true)` introduces byte-offset behavior; consider adding or updating unit tests to verify correct split boundary handling and byte tracking.
parser = CSVParser.builder()
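As a rough idea of what a split-boundary test could check, the sketch below assigns records to splits by the byte offset at which each record starts. It is standard-library Java only and makes several assumptions that are not confirmed by this PR: splits are fixed-size and contiguous, a record belongs to the split in which it starts, and the class `SplitBoundaryDemo` and its methods are hypothetical stand-ins, not the logic in `SplitDelimitedTextReader`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the ownership rule is an assumption, not MLCP's actual logic.
public class SplitBoundaryDemo {

    /** Index of the fixed-size split that contains the given byte offset. */
    static int owningSplit(long byteOffset, long splitSize) {
        return (int) (byteOffset / splitSize);
    }

    /** Groups record start offsets by the split that owns them. */
    static List<List<Long>> assign(long[] recordStarts, long splitSize, int splitCount) {
        List<List<Long>> splits = new ArrayList<>();
        for (int i = 0; i < splitCount; i++) {
            splits.add(new ArrayList<>());
        }
        for (long start : recordStarts) {
            splits.get(owningSplit(start, splitSize)).add(start);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Byte offsets of three records in a 20-byte input, cut into two 10-byte splits
        long[] starts = {0, 8, 16};
        System.out.println(assign(starts, 10, 2)); // [[0, 8], [16]]
    }
}
```

A test along these lines would assert that every record lands in exactly one split, including a record such as the one at byte 8 whose content crosses the boundary at byte 10.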
I ran 06mlcp in the local VM and it works fine (except for 6 expected local failures).
No description provided.