Push the runtime filter from HashJoin down to SeqScan. #724
Looks interesting. And I have some questions to discuss.
Force-pushed: 980e2de to 99eabb2
MultiExecParallelHash has been changed; please add some parallel tests with the runtime filter.
Got it.
Thanks, I'll reproduce the issue and fix it.
Thanks for your detailed explanation.
Makes sense. In a hash join, an index scan or index-only scan is often not used on the probe node.
Exactly, and using a lock to solve the problem may even lead to worse performance.
The runtime filter has been pushed down to the SeqScan on table t3, but 'explain analyze' doesn't print it out.
Force-pushed: 76a003a to 98dac6d
Thanks for your test case. Based on it, I rewrote the code to ensure that the debug info is always displayed, even when the number of filtered rows is zero. I also added the test case to gp_runtime_filter.sql.
Thanks for your test case. I fixed it in 98dac6d and added the test case to gp_runtime_filter.sql too.
Hi @zhangyue-hashdata, asking mostly out of curiosity; I see there are quite a few reviewers here already :)
Basically, you're correct: our goal is to filter out as much data as possible right at the point where the data is generated. However, doing that precisely would require very complex evaluation, so we only make a simple estimate based on the row count and work memory when creating the Bloom filter.
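To make the "simple estimate based on rows and work memory" concrete, here is a minimal sketch of how such a sizing step could look. It is not the code from this PR: the function name, the `BloomSizing` struct, the 16 bits-per-row target and the 64-bit floor are all assumptions chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical result type; not a structure from the PR. */
typedef struct BloomSizing
{
    uint64_t nbits;     /* bit-array size, always a power of two */
    int      nhashes;   /* hash functions applied per key */
} BloomSizing;

/*
 * Size a Bloom filter from the estimated number of build-side rows,
 * capped by a memory budget given in kilobytes (work_mem style).
 */
static BloomSizing
estimate_bloom_sizing(uint64_t est_rows, uint64_t work_mem_kb)
{
    const uint64_t bits_per_row = 16;   /* assumed target density */
    const uint64_t min_bits = 64;       /* assumed floor: one 64-bit word */
    uint64_t    budget_bits = work_mem_kb * 1024 * 8;
    uint64_t    want_bits;
    uint64_t    nbits = 1;
    uint64_t    k;
    BloomSizing result;

    if (est_rows == 0)
        est_rows = 1;

    /* Cap by the budget; comparing first avoids overflow in the multiply. */
    if (est_rows > budget_bits / bits_per_row)
        want_bits = budget_bits;
    else
        want_bits = est_rows * bits_per_row;

    /* The floor may exceed a tiny budget; this sketch accepts that. */
    if (want_bits < min_bits)
        want_bits = min_bits;

    /* Round down to a power of two so probes can mask instead of mod. */
    while (nbits * 2 <= want_bits)
        nbits *= 2;

    /* k is roughly ln(2) * nbits / rows, clamped to [1, 10]. */
    k = (nbits * 69) / est_rows / 100;
    if (k < 1)
        k = 1;
    if (k > 10)
        k = 10;

    result.nbits = nbits;
    result.nhashes = (int) k;
    return result;
}
```

With 1000 estimated rows and a 1024 kB budget the row count decides the size; with ten million rows and a 64 kB budget the memory cap does, and the filter degrades to a single hash function.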
Force-pushed: 8bad97a to 15feff6
Force-pushed: 2ea19a0 to e607900
Force-pushed: e607900 to 642db2e
Force-pushed: 642db2e to 39232f0
@zhangyue-hashdata have you tried benchmarks on the recent builds? The last time I ran TPC-H against cloudberry, 12/22 queries used the legacy optimizer. I guess for TPC-DS it's even worse. Btw, can you guys share the TPC-DS and TPC-H toolkits you're using to benchmark cloudberry?
Not recently. Here is the toolkit for TPC-DS; hope it's helpful to you :-)
Force-pushed: 874f754 to c80d847
Force-pushed: c80d847 to 4f72968
Force-pushed: 4f72968 to 3828a4d
LGTM
LGTM
@zhangyue-hashdata squash the commits into one commit.
Got it.
+----------+  AttrFilter   +------+  ScanKey   +---------+
| HashJoin | ------------> | Hash | ---------> | SeqScan |
+----------+               +------+            +---------+

If "gp_enable_runtime_filter_pushdown" is on, three steps will be run:

Step 1. In ExecInitHashJoin(), try to find the mapper between the var in
        hashclauses and the var in SeqScan. If found we will save the mapper in
        AttrFilter and push them to Hash node;

Step 2. We will create the range/bloom filters in AttrFilter during building
        hash table, and these filters will be converted to the list of ScanKey
        and pushed down to Seqscan when the building finishes;

Step 3. ScanKeys will be used to filter slot in Seqscan.

TODO:
1. support singlenode or utility mode;
2. support null value filter;
3. support Motion, SharedScan as target node;
4. support join qual like: t1.c1 = t2.c1+5;
Force-pushed: 3828a4d to af9355b
Push the runtime filter from HashJoin down to SeqScan.
+----------+  AttrFilter   +------+  ScanKey   +---------+
| HashJoin | ------------> | Hash | ---------> | SeqScan |
+----------+               +------+            +---------+
If "gp_enable_runtime_filter_pushdown" is on, three steps are run:
Step 1. In ExecInitHashJoin(), try to find the mapping between the Var in
hashclauses and the Var in SeqScan. If one is found, save the mapping in
AttrFilter and push it to the Hash node;
Step 2. Create the range/bloom filters in AttrFilter while building the
hash table; when the build finishes, convert these filters to a list of
ScanKeys and push them down to SeqScan;
Step 3. SeqScan uses the ScanKeys to filter slots.
TODO:
1. support singlenode or utility mode;
2. support null value filter;
3. support Motion, SharedScan as target node;
4. support join qual like: t1.c1 = t2.c1+5;
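The three steps above can be illustrated for the range filter alone with a small standalone sketch. It assumes a single int64 join key and uses hypothetical stand-ins (`RangeFilter`, `SimpleScanKey`, `filter_scan`) rather than the real AttrFilter and ScanKey structures; the Var mapping from Step 1 is taken as already resolved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the range part of an AttrFilter. */
typedef struct RangeFilter
{
    bool    has_value;  /* false until the first build-side key is seen */
    int64_t min;
    int64_t max;
} RangeFilter;

/* Hypothetical stand-in for a ScanKey: "column <op> bound". */
typedef enum { KEY_GE, KEY_LE } KeyOp;

typedef struct SimpleScanKey
{
    KeyOp   op;
    int64_t bound;
} SimpleScanKey;

/* Step 2, during the hash build: widen the range for every inserted key. */
static void
range_filter_add(RangeFilter *f, int64_t key)
{
    if (!f->has_value)
    {
        f->min = f->max = key;
        f->has_value = true;
        return;
    }
    if (key < f->min)
        f->min = key;
    if (key > f->max)
        f->max = key;
}

/*
 * Step 2, after the build: turn the range into "col >= min" and
 * "col <= max". Returns the number of keys written; 0 (no filtering)
 * when the build side was empty.
 */
static int
range_filter_to_keys(const RangeFilter *f, SimpleScanKey keys[2])
{
    if (!f->has_value)
        return 0;
    keys[0].op = KEY_GE;
    keys[0].bound = f->min;
    keys[1].op = KEY_LE;
    keys[1].bound = f->max;
    return 2;
}

/* Step 3: a scanned value survives only if it satisfies every key. */
static bool
scan_value_passes(const SimpleScanKey *keys, int nkeys, int64_t v)
{
    for (int i = 0; i < nkeys; i++)
    {
        if (keys[i].op == KEY_GE && v < keys[i].bound)
            return false;
        if (keys[i].op == KEY_LE && v > keys[i].bound)
            return false;
    }
    return true;
}

/* Compact the scanned values in place; returns how many were kept. */
static size_t
filter_scan(const SimpleScanKey *keys, int nkeys, int64_t *values, size_t n)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        if (scan_value_passes(keys, nkeys, values[i]))
            values[kept++] = values[i];
    return kept;
}
```

A range filter only bounds the keys, so a value inside [min, max] that has no build-side match still passes and is rejected later by the join itself; narrowing that gap is what the Bloom filter is for.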
perf:
CPU E5-2680 v2 10 cores, memory 32GB, 3 segments
tpcds 10s details
tpcds 100s details
Test Plan
make installcheck
make -C src/test installcheck-cbdb-parallel