This repository was archived by the owner on May 9, 2024. It is now read-only.
[Perf][Bench] Join is slow on big tables. #574

@Devjiu

For tables of about 10^8 rows, join performance is much lower than DuckDB's.

A join on a single int column between two tables of 10^8 rows each takes about 12 s.

12888ms total duration for executeRelAlgQuery
        12886ms start(1ms) executeWorkUnit RelAlgExecutor.cpp:1389
          569ms start(1ms) compileWorkUnit NativeCodegen.cpp:1403
          5394ms start(571ms) fetchChunks Execute.cpp:3090
          67ms start(5966ms) executePlan Execute.cpp:3462
          9ms start(6487ms) collectAllDeviceResults Execute.cpp:2550
          70ms start(6497ms) compileWorkUnit NativeCodegen.cpp:1403
          3071ms start(6568ms) fetchChunks Execute.cpp:3090
          527ms start(9640ms) executePlan Execute.cpp:3462

Most of the time is spent in fetchChunks, which runs twice. The query is executed as 2 subqueries; the first is a COUNT(*) that collects metadata.
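To make the breakdown easier to check, here is a rough sketch that sums the timer lines from the log above per stage. The parsing regex and the helper name `sum_by_stage` are my own assumptions for illustration and are not part of the project; the durations are copied verbatim from the log.

```python
import re
from collections import defaultdict

# Timer lines copied from the log above (children of executeWorkUnit).
LOG = """\
569ms start(1ms) compileWorkUnit NativeCodegen.cpp:1403
5394ms start(571ms) fetchChunks Execute.cpp:3090
67ms start(5966ms) executePlan Execute.cpp:3462
9ms start(6487ms) collectAllDeviceResults Execute.cpp:2550
70ms start(6497ms) compileWorkUnit NativeCodegen.cpp:1403
3071ms start(6568ms) fetchChunks Execute.cpp:3090
527ms start(9640ms) executePlan Execute.cpp:3462
"""

# Total duration reported for executeRelAlgQuery in the same log.
TOTAL_MS = 12888

# Assumed line shape: "<duration>ms start(<offset>ms) <stage> <file>:<line>"
LINE_RE = re.compile(r"^\s*(\d+)ms\s+start\((\d+)ms\)\s+(\w+)\s+\S+$")


def sum_by_stage(log):
    """Return {stage name: summed duration in ms} for all matching lines."""
    totals = defaultdict(int)
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            totals[match.group(3)] += int(match.group(1))
    return dict(totals)


totals = sum_by_stage(LOG)

# Print stages from slowest to fastest with their share of the total.
for stage, ms in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{stage:<26}{ms:>6} ms  {100 * ms / TOTAL_MS:5.1f}%")
```

With these numbers the two fetchChunks calls add up to 8465 ms, roughly two thirds of the 12888 ms total, while compileWorkUnit and executePlan together stay a little above 1.2 s.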
