Labels
bug: Something isn't working
Description
Describe the bug
In a Substrait plan containing correlated subqueries, references to outer query fields are parsed incorrectly: we resolve them against the inner (subquery) schema only, which produces incorrect column names and types.
To Reproduce
For example, consider TPC-H Q21, which is parsed as:
Projection: SUPPLIER.S_NAME, count(Int64(1)) AS NUMWAIT
  Limit: skip=0, fetch=100
    Sort: count(Int64(1)) DESC NULLS FIRST, SUPPLIER.S_NAME ASC NULLS LAST
      Aggregate: groupBy=[[SUPPLIER.S_NAME]], aggr=[[count(Int64(1))]]
        Projection: SUPPLIER.S_NAME
          Filter: SUPPLIER.S_SUPPKEY = LINEITEM.L_SUPPKEY AND ORDERS.O_ORDERKEY = LINEITEM.L_ORDERKEY AND ORDERS.O_ORDERSTATUS = Utf8("F") AND LINEITEM.L_RECEIPTDATE > LINEITEM.L_COMMITDATE AND EXISTS (<subquery>) AND NOT EXISTS (<subquery>) AND SUPPLIER.S_NATIONKEY = NATION.N_NATIONKEY AND NATION.N_NAME = Utf8("SAUDI ARABIA")
            Subquery:
              Filter: LINEITEM.L_ORDERKEY = LINEITEM.L_TAX AND LINEITEM.L_SUPPKEY != LINEITEM.L_LINESTATUS
                TableScan: LINEITEM
            Subquery:
              Filter: LINEITEM.L_ORDERKEY = LINEITEM.L_TAX AND LINEITEM.L_SUPPKEY != LINEITEM.L_LINESTATUS AND LINEITEM.L_RECEIPTDATE > LINEITEM.L_COMMITDATE
                TableScan: LINEITEM
            Cross Join:
              Cross Join:
                Cross Join:
                  TableScan: SUPPLIER
                  TableScan: LINEITEM
                TableScan: ORDERS
              TableScan: NATION
Note that in the subquery, the filter contains the clause LINEITEM.L_SUPPKEY != LINEITEM.L_LINESTATUS. This is not what Q21 contains: Q21 compares the subquery's L_SUPPKEY with the outer query's L_SUPPKEY. In fact, the types of the two columns in the parsed clause (Int64 and Utf8) are not even compatible, although we don't currently reject the comparison.
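The wrong column names are consistent with outer-reference field indices being looked up in the subquery's schema. The following is a minimal sketch of that arithmetic, not DataFusion code. It assumes the standard TPC-H column order and that the outer schema starts with SUPPLIER followed by LINEITEM, as in the cross join of the plan. `qualify` and `resolve` are hypothetical helpers introduced only for illustration.

```python
# Standard TPC-H column order, upper-cased as in the plan (assumption).
SUPPLIER = ["S_SUPPKEY", "S_NAME", "S_ADDRESS", "S_NATIONKEY",
            "S_PHONE", "S_ACCTBAL", "S_COMMENT"]
LINEITEM = ["L_ORDERKEY", "L_PARTKEY", "L_SUPPKEY", "L_LINENUMBER",
            "L_QUANTITY", "L_EXTENDEDPRICE", "L_DISCOUNT", "L_TAX",
            "L_RETURNFLAG", "L_LINESTATUS", "L_SHIPDATE", "L_COMMITDATE",
            "L_RECEIPTDATE", "L_SHIPINSTRUCT", "L_SHIPMODE", "L_COMMENT"]


def qualify(table, columns):
    """Prefix each column with its table name, e.g. LINEITEM.L_TAX."""
    return [f"{table}.{column}" for column in columns]


def resolve(schema, index):
    """Hypothetical helper: look up a field index in a flat schema."""
    return schema[index]


# Outer schema: SUPPLIER columns first, then LINEITEM (ORDERS and NATION
# would follow but are not needed for this illustration).
outer_schema = qualify("SUPPLIER", SUPPLIER) + qualify("LINEITEM", LINEITEM)
# Inner schema: the subquery scans LINEITEM only.
inner_schema = qualify("LINEITEM", LINEITEM)

# Indices the outer references carry, relative to the outer schema.
outer_orderkey = outer_schema.index("LINEITEM.L_ORDERKEY")
outer_suppkey = outer_schema.index("LINEITEM.L_SUPPKEY")

# Resolving those indices against the inner schema reproduces the bug.
wrong_orderkey = resolve(inner_schema, outer_orderkey)
wrong_suppkey = resolve(inner_schema, outer_suppkey)
```

Under these assumptions the two outer references land on LINEITEM.L_TAX and LINEITEM.L_LINESTATUS, which are exactly the columns shown in the subquery filters.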
Expected behavior
References to outer query fields should be resolved against the outer query's schema, so that they yield the correct column names and types.
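One possible shape for the expected behavior is to keep a stack of enclosing schemas while descending into subqueries and to pick the schema according to how many subquery boundaries a reference crosses (Substrait outer references carry a `steps_out` count). This is a sketch under stated assumptions, not the actual fix: `FieldRef`, `resolve_field`, and the stack layout are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldRef:
    """Hypothetical stand-in for a field reference in the plan."""
    index: int
    # None means a reference to the current relation; N >= 1 means the
    # reference crosses N subquery boundaries outward.
    steps_out: Optional[int] = None


def resolve_field(ref: FieldRef, schema_stack: List[List[str]]) -> str:
    """Resolve a reference; schema_stack[-1] is the innermost schema."""
    if not schema_stack:
        raise ValueError("no schema in scope")
    if ref.steps_out is None:
        schema = schema_stack[-1]
    else:
        if ref.steps_out < 1 or ref.steps_out >= len(schema_stack):
            raise ValueError(f"invalid steps_out: {ref.steps_out}")
        # Walk outward from the innermost schema.
        schema = schema_stack[-1 - ref.steps_out]
    return schema[ref.index]


# Toy schemas, chosen only to show which scope a reference resolves in.
outer = ["OUTER.A", "OUTER.B", "OUTER.C"]
inner = ["INNER.X", "INNER.Y", "INNER.Z"]

local_name = resolve_field(FieldRef(index=2), [outer, inner])
outer_name = resolve_field(FieldRef(index=2, steps_out=1), [outer, inner])
```

Here the same index resolves to INNER.Z as a local reference and to OUTER.C as an outer reference, which is the distinction the parsed Q21 plan loses.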
Additional context
No response