Manage group values and states by blocks in aggregation #11931

Open
@Rachelint

Is your feature request related to a problem or challenge?

Currently, the group values and the aggregation states are each stored in a single large vector that grows continuously.
This approach is simple to implement, but the CPU profile shows that it incurs extra CPU cost.
Maybe we should manage them in blocks, as DuckDB does.
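
To make the idea concrete, here is a minimal sketch of block-based storage. It assumes the extra cost comes from the single vector reallocating and copying its contents as it grows, which the issue does not state explicitly. `BlockedVec`, `block_size`, and all method names are hypothetical and are not part of DataFusion or DuckDB.

```rust
/// Hypothetical block-based container (illustration only, not a DataFusion type).
/// Values live in fixed-capacity blocks, so growing the container allocates a
/// new block instead of reallocating and copying the values already stored.
pub struct BlockedVec<T> {
    blocks: Vec<Vec<T>>,
    block_size: usize,
}

impl<T> BlockedVec<T> {
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        Self { blocks: Vec::new(), block_size }
    }

    /// Appends a value and returns its flat index (the "group index").
    pub fn push(&mut self, value: T) -> usize {
        // Start a new block when there is none yet or the last one is full.
        let need_new_block = self
            .blocks
            .last()
            .map_or(true, |block| block.len() == self.block_size);
        if need_new_block {
            self.blocks.push(Vec::with_capacity(self.block_size));
        }
        let block_id = self.blocks.len() - 1;
        let block = &mut self.blocks[block_id];
        block.push(value);
        block_id * self.block_size + block.len() - 1
    }

    /// Looks up a value by flat index: block id = index / block_size,
    /// offset inside the block = index % block_size.
    pub fn get(&self, index: usize) -> Option<&T> {
        self.blocks
            .get(index / self.block_size)?
            .get(index % self.block_size)
    }

    /// Total number of stored values.
    pub fn len(&self) -> usize {
        match self.blocks.last() {
            Some(last) => (self.blocks.len() - 1) * self.block_size + last.len(),
            None => 0,
        }
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }

    pub fn num_blocks(&self) -> usize {
        self.blocks.len()
    }
}
```

With a block size of 2, pushing three values yields flat indices 0, 1, and 2 spread over two blocks. Only the small outer vector of block handles ever reallocates.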

Describe the solution you'd like

This may be a large piece of work, so I plan to do it in the following steps:

  • Sketch the overall procedure.
  • Implement block-based group values management in GroupValuesRows.
  • Implement block-based group values management in the other GroupValues implementations.
  • Implement block-based state management in the different GroupAccumulator implementations.

The general design is similar to #7065, but it is applied to GroupValues as well, not only to GroupAccumulators.
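
On the accumulator side, the same layout could look roughly like the sketch below. It is an assumption about what block-based state management might mean for a SUM aggregate, not the design from #7065 and not DataFusion's actual accumulator API. `BlockedSumState` and its methods are hypothetical names.

```rust
/// Hypothetical blocked state for a SUM aggregate (illustration only).
/// Each group's running sum lives at (group / block_size, group % block_size).
pub struct BlockedSumState {
    blocks: Vec<Vec<i64>>,
    block_size: usize,
    num_groups: usize,
}

impl BlockedSumState {
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        Self { blocks: Vec::new(), block_size, num_groups: 0 }
    }

    /// Makes room for `total_num_groups` by appending zeroed blocks.
    /// Existing blocks are never moved or copied.
    fn ensure_groups(&mut self, total_num_groups: usize) {
        if total_num_groups <= self.num_groups {
            return;
        }
        let needed_blocks = (total_num_groups + self.block_size - 1) / self.block_size;
        while self.blocks.len() < needed_blocks {
            self.blocks.push(vec![0; self.block_size]);
        }
        self.num_groups = total_num_groups;
    }

    /// Adds `values[i]` to the sum of group `group_indices[i]`.
    pub fn update(&mut self, group_indices: &[usize], values: &[i64], total_num_groups: usize) {
        assert_eq!(group_indices.len(), values.len());
        self.ensure_groups(total_num_groups);
        for (&group, &value) in group_indices.iter().zip(values) {
            assert!(group < self.num_groups, "group index out of range");
            let block_id = group / self.block_size;
            let offset = group % self.block_size;
            self.blocks[block_id][offset] += value;
        }
    }

    /// Emits the state block by block; the last block is trimmed to the
    /// slots that are actually in use.
    pub fn into_blocks(mut self) -> Vec<Vec<i64>> {
        let full_blocks = self.blocks.len().saturating_sub(1);
        let used_in_last = self.num_groups - full_blocks * self.block_size;
        if let Some(last) = self.blocks.last_mut() {
            last.truncate(used_in_last);
        }
        self.blocks
    }
}
```

Because the output is already split into blocks, each block could be handed downstream as its own batch without slicing one large vector first.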

Describe alternatives you've considered

No response

Additional context

The CPU cost flamegraph:
https://github.com/Rachelint/drawio-store/blob/main/cpucosts0811.png

Labels: enhancement
