Skip to content

Convert AVG(col) to SUM(x) / COUNT(*) #19637

@Dandandan

Description

@Dandandan

Is your feature request related to a problem or challenge?

In queries that contain sum and/or avg (on the same columns) and/or count(), like TPCH query 1 we can convert AVG(col) to SUM(col) / COUNT() to avoid redundant count calculations for higher efficiency / reduced memory usage.

Describe the solution you'd like

Zooming in on the avg:

    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order

Those accumulators will all (redundantly) compute the count.

We should be able to rewrite it to use a shared count (and thus faster accumulators / less memory):

    sum(l_quantity) / count_order as avg_qty,
    sum(l_extendedprice) / count_order as avg_price,
    sum(l_discount) / count_order as avg_disc,
    count_order
FROM (...sub expr...)

Next, sum(l_quantity) and sum(l_extendedprice) are already computed, so the CSE ruke can simplify it to only create/calculate them once!

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions