-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Benchmark: Add micro-benchmark for Nested Loop Join operator #16819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice start, I was thinking to add a benchmark that runs identical queries using the execution operators for different join algorithms. So this lets us compare NestedLoopJoin
performance to others like HJ or SMJ
Recorded existence join work at #16820 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@2010YOUY01 Thanks for providing such a comprehensive set of benchmark cases. It would be even better if it could also output the memory consumption for each SQL query, just like in this script.
benchmarks/src/nlj.rs
Outdated
let query_index = match query_name { | ||
"q1" => 0, | ||
"q2" => 1, | ||
"q3" => 2, | ||
"q4" => 3, | ||
"q5" => 4, | ||
"q6" => 5, | ||
"q7" => 6, | ||
"q8" => 7, | ||
"q9" => 8, | ||
"q10" => 9, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can rewrite it as follows to avoid this match
and available_queries
.
datafusion/benchmarks/src/imdb/run.rs
Lines 286 to 292 in 9ae41b1
let query_range = match self.query { | |
Some(query_id) => query_id..=query_id, | |
None => IMDB_QUERY_START_ID..=IMDB_QUERY_END_ID, | |
}; | |
let mut benchmark_run = BenchmarkRun::new(); | |
for query_id in query_range { |
let query_range = match self.query {
Some(query_id) => query_id..=query_id,
None => 1..=NLJ_QUERIES.len(),
};
for query_id in query_range {
// ...
let sql = NLJ_QUERIES[query_id-1];
// ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in a3f5d05
benchmarks/src/nlj.rs
Outdated
Err(e) => { | ||
eprintln!("Query {query_name} failed: {e}"); | ||
benchmark_run.write_iter(std::time::Duration::from_secs(0), 0); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we return Err(e)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in a3f5d05
let physical_plan = df.create_physical_plan().await?; | ||
let plan_string = format!("{physical_plan:#?}"); | ||
|
||
if !plan_string.contains("NestedLoopJoinExec") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍👍
benchmarks/src/nlj.rs
Outdated
let start = Instant::now(); | ||
let df = ctx.sql(sql).await?; | ||
let batches = df.collect().await?; | ||
let elapsed = start.elapsed(); //.as_secs_f64() * 1000.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's meaning of //.as_secs_f64() * 1000.0;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in a3f5d05 to avoid confusion.
I tried to do internal memory profiling with rust, it's a bit tricky. Perhaps integrating a external script is easier. @ding-young is currently working on it. |
Thank you for the review @UBarney @jonathanc-n |
benchmarks/src/nlj.rs
Outdated
// return Err(exec_datafusion_err!( | ||
// "Query {} not found. Available queries: 1 to {}", | ||
// query_id, | ||
// NLJ_QUERIES.len() | ||
// )); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can remove this ?
Which issue does this PR close?
Rationale for this change
Now, NLJ operator still has some room to improve performance and efficiency (less memory consumption), and it has attracted interest from the community (cc @jonathanc-n ) recently.
Inspired by the benchmarks used by @UBarney in #16443 (comment), this PR added a similar micro-benchmark for NLJ into the DF benchmark suite.
What changes are included in this PR?
A new micro-benchmark for NLJ in the benchmark suite (
./bench.sh ...
)The queries and the varied query characteristics can be found in the src.
The special (semi/anti/mark) joins are not included, I'm not sure what's the typical workload for those joins.
The bench runner has a validation step to ensure the queries are using NLJ in physical plan.
Also, the optimizer currently does not reorder joins, so the execution order follows the join order in the SQL string. (I wish there were an option to explicitly enforce this behavior.)
Are these changes tested?
I tested it locally:
Bench Run
Are there any user-facing changes?