ArrowError("incompatible arrow schema, expected struct got List(Field { name: \"col_15\", data_type: Struct([Field { name: \"col_16\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"col_17\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"col_18\", data_type: Struct([Field { name: \"col_19\", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"col_20\", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} })")
Describe the bug
When reading a file that was created with an older Parquet writer (parquet-mr specifically) and passing a schema obtained from ArrowReaderMetadata, the read fails with the ArrowError shown at the top of this report.
To Reproduce
I've added the file in:
1 liner
Run this in datafusion-cli
Only the relevant parts
This is the reproduction, taking from datafusion only the relevant parts that led to that error.
Cargo.toml:
main.rs:
Expected behavior
Should not fail
Additional context
This might be a bug in DataFusion rather than in the Parquet reader here. For backward compatibility, the schema was updated to the new version: