CSV: Support for Master/Detail - variant repeating column formats #212

DALDEI · 2020-08-02T13:38:52Z

CRM systems commonly export master/detail data, such as invoices, as 'CSV'.

An anonymized example:

HEADERST|A-SYSTEM|04/28/20|200557  |DAILY |  
CUSTOMER|111111111|3041|05/28/20|05/28/20||05/28/20|US|PRINT |||STANDARD  |MAILTO ID NO. |  
FROMADDR|PROVO/4H |4H |835 WEST SAN JOSE ST |OGDEN, UT 84401 ||(800)253-0277|  
BILLADDR|ABC CO |1222 EAST 111 NORTH |SALT LAKE, UT 84040 ||||US |  
REMTADDR|SENDTO |4H |LB 11111 |PO BOX 22222 |MYTOWN, WA 99999-5143 |  
SHIPADDR|23497357|ABC CO  CUSTOMER# 11111|698 N. PLAINS ||PROVO |UT|11111 |SALT LAKE CITY/4H |  
STMTDTLS|04/22/20|1111111 |INV|222.13|0.00|0.00|222.13|05/29/20|NET 7 DAYS  |  
STMTDTLS|04/25/20|2222222 |INV|333.21|0.00|0.00|333.21|06/01/20|NET 7 DAYS  |  
STMTDTLS|04/26/20|3333333 |INV|383.22|0.00|0.00|383.22|06/02/20|NET 7 DAYS  |  
STMTDTLS|04/26/20|4445444 |INV|1799.95|0.00|0.00|1799.95|06/02/20|NET 7 DAYS  |  
STMTDTLS|04/27/20|5555555 |INV|22.56|0.00|0.00|22.56|06/03/20|NET 7 DAYS  |  
STMTDTLS|04/28/20|5555555 |INV|44.18|0.00|0.00|55.18|06/04/20|NET 7 DAYS  |  
STMTTOTL|4444.25|0.00|0.00|0.00|0.00|4444.25|  
STMTMSGS|**For customer inquiries, call 1-800-222-3377. Option 1, Option 2** ||

The identifying feature is a column (usually column 1) that indicates the 'type' of each record.
All records of the same 'type' have the same structure/schema.
The snippet above repeats; each outermost repeating block represents one logical 'row'.
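To make the "outermost repeating block" idea concrete, here is a minimal plain-Java sketch (no Jackson involved) that groups raw lines into logical rows. It assumes a block begins at a `HEADERST` record, which the example suggests but does not state; `RecordGrouper`, `typeOf` and `group` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class RecordGrouper {
    /** Returns the record type: the text before the first '|', trimmed. */
    static String typeOf(String line) {
        int bar = line.indexOf('|');
        return (bar < 0 ? line : line.substring(0, bar)).trim();
    }

    /**
     * Groups lines into blocks. A new block starts at every line whose
     * type equals startType (assumed here to be "HEADERST").
     * Blank lines are skipped.
     */
    static List<List<String>> group(List<String> lines, String startType) {
        List<List<String>> blocks = new ArrayList<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            if (current == null || typeOf(line).equals(startType)) {
                current = new ArrayList<>();
                blocks.add(current);
            }
            current.add(line);
        }
        return blocks;
    }
}
```

Calling `RecordGrouper.group(lines, "HEADERST")` would return one inner list per logical 'row', each holding its master and detail lines in file order.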

Short of modeling the actual nested structure, it would help to be able to specify alternate 'schemas',
each identified by a column value and mapped to a different class/POJO, with records read sequentially.
The classes could form a hierarchy, which would make it easier to integrate into the JSON data model,
e.g. all derived from the same base class with only the one shared field (column 1).
That should map well to polymorphic serialization with a type field.
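A rough sketch of the kind of hierarchy I mean, in plain Java with hand-written dispatch rather than any existing Jackson feature. The class names and the field names (`date`, `documentNumber`, `kind`, `amount`) are guesses made for illustration; the real column meanings are not given in the export.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Base class holding the single shared field: the record type in column 1. */
abstract class StatementRecord {
    final String type;

    StatementRecord(String type) {
        this.type = type;
    }

    /** Splits one pipe-delimited line and picks a subclass from column 1. */
    static StatementRecord fromLine(String line) {
        String[] c = line.split("\\|", -1);
        for (int i = 0; i < c.length; i++) {
            c[i] = c[i].trim();
        }
        switch (c[0]) {
            case "STMTDTLS": return new DetailRecord(c);
            case "STMTTOTL": return new TotalRecord(c);
            default:         return new GenericRecord(c);
        }
    }
}

/** Hypothetical mapping of the first few STMTDTLS columns. */
final class DetailRecord extends StatementRecord {
    final String date;
    final String documentNumber;
    final String kind;
    final BigDecimal amount;

    DetailRecord(String[] c) {
        super(c[0]);
        date = c[1];
        documentNumber = c[2];
        kind = c[3];
        amount = new BigDecimal(c[4]); // throws on malformed numbers
    }
}

/** Keeps every non-blank STMTTOTL column after the type as an amount. */
final class TotalRecord extends StatementRecord {
    final List<BigDecimal> amounts = new ArrayList<>();

    TotalRecord(String[] c) {
        super(c[0]);
        for (int i = 1; i < c.length; i++) {
            if (!c[i].isEmpty()) {
                amounts.add(new BigDecimal(c[i]));
            }
        }
    }
}

/** Fallback for record types without a dedicated class. */
final class GenericRecord extends StatementRecord {
    final List<String> columns;

    GenericRecord(String[] c) {
        super(c[0]);
        columns = Arrays.asList(c);
    }
}
```

The `switch` on column 1 plays the role that a type field plays in polymorphic JSON handling; the open question is how to get the CSV module to do that dispatch itself.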

What I was thinking of doing is 'forking' the input stream and choosing the schema on a row-by-row basis, but that requires parsing the stream twice.

An alternative, which may already be possible, is a two-step deserializer.
The first step reads each row into a List; the second selects a separate schema based on list[0].
The problem is getting the CSV parser to take a List instead of an InputStream as its input.
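For illustration only, the two steps can be sketched in plain Java without the CSV parser at all. This does not solve the "feed a List to the parser" problem; it only shows the shape of the idea. `TwoStepReader`, its `SCHEMAS` table and the column names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TwoStepReader {
    // Hypothetical per-type schemas: record type -> ordered column names.
    static final Map<String, List<String>> SCHEMAS = new HashMap<>();
    static {
        SCHEMAS.put("STMTDTLS",
                Arrays.asList("type", "date", "documentNumber", "kind", "amount"));
        SCHEMAS.put("STMTTOTL", Arrays.asList("type", "total"));
    }

    /** Step 1: split a line into a schema-less list of trimmed columns. */
    static List<String> tokenize(String line) {
        List<String> cols = new ArrayList<>();
        for (String c : line.split("\\|", -1)) {
            cols.add(c.trim());
        }
        return cols;
    }

    /** Step 2: choose the schema from row.get(0) and bind names to values. */
    static Map<String, String> bind(List<String> row) {
        List<String> schema = SCHEMAS.get(row.get(0));
        if (schema == null) {
            throw new IllegalArgumentException(
                    "No schema for record type '" + row.get(0) + "'");
        }
        Map<String, String> out = new LinkedHashMap<>();
        // Columns beyond the end of the schema are ignored in this sketch.
        for (int i = 0; i < schema.size() && i < row.size(); i++) {
            out.put(schema.get(i), row.get(i));
        }
        return out;
    }
}
```

A Map produced this way could presumably be handed to something like `ObjectMapper.convertValue` to populate a POJO, though I have not tried that with this data.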

jdimeo · 2020-08-13T20:17:36Z

Have you looked into @JsonSubTypes? Scroll down to section 5:
https://www.baeldung.com/jackson-annotations

jdimeo · 2020-08-13T20:19:03Z

This is also related to another recent ticket, #202, which has useful commentary from the author himself about polymorphism with CSV.

kdebski85 · 2021-05-18T09:22:01Z

I encountered the same issue.
JsonSubTypes does not work for CSV.
Jackson determines the order of properties from the base class and throws "Unrecognized column 'type': known columns:" when trying to serialize any subclass with additional columns.
Jackson should instead determine the order of properties for each subclass.
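To show what I mean by per-subclass ordering, here is a small plain-Java sketch in which every subclass supplies its own column list and the base class only contributes the type column. It is not how Jackson works today; `OutputRecord`, `DetailLine`, `TotalLine` and their fields are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Base class: knows only the type column and how to join a line. */
abstract class OutputRecord {
    final String type;

    OutputRecord(String type) {
        this.type = type;
    }

    /** Each subclass returns its own columns, in its own order. */
    abstract List<String> columns();

    /** Writes the type first, then the subclass columns, '|'-terminated. */
    String toLine() {
        List<String> all = new ArrayList<>();
        all.add(type);
        all.addAll(columns());
        return String.join("|", all) + "|";
    }
}

final class DetailLine extends OutputRecord {
    final String date;
    final String documentNumber;

    DetailLine(String date, String documentNumber) {
        super("STMTDTLS");
        this.date = date;
        this.documentNumber = documentNumber;
    }

    @Override
    List<String> columns() {
        return Arrays.asList(date, documentNumber);
    }
}

final class TotalLine extends OutputRecord {
    final String total;

    TotalLine(String total) {
        super("STMTTOTL");
        this.total = total;
    }

    @Override
    List<String> columns() {
        return Arrays.asList(total);
    }
}
```

With this shape, a subclass with extra columns never hits an "unrecognized column" check, because the column set is resolved per concrete class rather than once for the base class.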

loverzpark · 2024-10-22T08:16:53Z

Is there any resolution for this? This is a very commonly used format to import/export and work with.

cowtowncoder · 2024-10-27T17:55:03Z

@loverzpark No comments, updates here -> no progress.

I agree with @jdimeo that this should work via @JsonSubTypes (and @JsonTypeInfo), to use/support standard Jackson polymorphic handling mechanisms. But I am not quite sure how to go about it at the implementation level.

DALDEI changed the title ~~Support for Master/Detail - variant repeating column formats~~ CSV: Support for Master/Detail - variant repeating column formats Aug 2, 2020

cowtowncoder added the csv and to-evaluate labels Aug 19, 2020
