-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Open
Labels
enhancementNew feature or requestNew feature or request
Description
Is your feature request related to a problem or challenge?
In queries that contain sum and/or avg (on the same columns) and/or count(), like TPCH query 1 we can convert AVG(col) to SUM(col) / COUNT() to avoid redundant count calculations for higher efficiency / reduced memory usage.
Describe the solution you'd like
Zooming in on the avg:
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
Those accumulators will all (redundantly) compute the count.
We should be able to rewrite it to use a shared count (and thus faster accumulators / less memory):
sum(l_quantity) / count_order as avg_qty,
sum(l_extendedprice) / count_order as avg_price,
sum(l_discount) / count_order as avg_disc,
count_order
FROM (...sub expr...)
Next, sum(l_quantity) and sum(l_extendedprice) are already computed, so the CSE ruke can simplify it to only create/calculate them once!
Describe alternatives you've considered
No response
Additional context
No response
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request