-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-37973][table-planner] Add support for right joins to multijoin operator #26810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[FLINK-37973][table-planner] Add support for right joins to multijoin operator #26810
Conversation
@SteveStevenpoor can you provide a quick description of the tests/changes that are still pending? |
@@ -278,37 +278,6 @@ Calc(select=[user_id_0, name, order_id, age]) | |||
+- Calc(select=[user_id, age]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could only add the new tests but not modify the orders in the xml so we see that the changes do not break anything in existing tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could only add the new tests but not modify the orders in the xml so we see that the changes do not break anything in existing tests?
The file is automatically generated by mvn test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could have a more robust solution that doesn't rely on RULE_SEQUENCE. Some ideas:
- Why can't we just execute FlinkRightJoinToLeftJoinRule in a dedicated optimization phase (e.g., a FlinkHepProgram) that runs before the main logical optimization. The output of this phase is always plan that only contains INNER and LEFT joins. All subsequent optimization phases, including the one with JoinToMultiJoinRule, can then be written with the assumption that they will never encounter a RIGHT JOIN.
- Adjust FlinkRightJoinToLeftJoin to be FlinkRightJoinToLeftJoinInMultiJoin. That means, we could perform the swap inside the MultiJoin node and not before it's converted.
If you think your approach is better than those two, could you add a bit more of comments + example in the javadocs for FlinkOrderPreservingProjection, JoinToMultiJoinRule so it's a bit easier to follow the dependency between all three rules?
Let's start from the example: SELECT u.user_id_0, u.name, o.order_id, p.payment_id, s.location So the original ast will look like: After right join to left join convertion: In this MR I get rid of projections when converting to multi join (No INNER or LEFT joins but may be projections). So It will look like this: So what's the problem with tests? Why can't we construct MJ(O, U, P, S)? What I try to do: After: Also it does not matter if we change right joins to left ones before JoinToMultiJoinRule or after. We will need to add projection on top of multi join anyway and handle right-side joins in StreamingMultiJoinOperator. Since we already have FlinkRightToLeftJoinRule I think it's ok to assume it will be applied before JoinToMultiJoinRule. @gustavodemorais let me know your thoughts on it, my friend. |
TODO:
|
… join logic to support any join tree
7b7a071
to
0ceca08
Compare
What is the purpose of the change
The newly introduced
MultiJoin
operator currently does not support right joins. Since all right joins are converted to left joins, we need to enable theFlinkRightToLeftJoinRule
in the MultiJoin rule set and handle left joins using projections.Brief change log
Introduced
FlinkOrderPreservingProjection
, a wrapper around the original projection. This class is used to define projections created byFlinkRightToLeftJoinRule
Patched
JoinToMultiJoinRule
to supportFlinkOrderPreservingProjection
(i. e. right joins).Other minor adjustments related to
JoinToMultiJoinRule
.Verifying this change
MultiJoinTest
:testThreeWayRightInnerJoin
testThreeWayRightRightJoin
testThreeWayInnerRightJoin
TODO: Semantic and restore tests will be added once
StreamExecMultiJoin
is fixed to support right joins.Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation