Merging this PR will degrade performance by 87.02%
|
I think you really want any conversion of Vortex arrays to other arrays to be an execution |
|
WDYM by execution? |
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
f876920 to 88cc0e1
|
Implementing an Executable trait or something similar, so that we can customise it depending on the context. For example, CUDA arrow export is different from host arrow export |
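A rough sketch of what a context-dependent export trait could look like. Everything here (`ExportTarget`, `ExportCtx`, `ExecuteExport`, `RunLength`, `Exported`) is a hypothetical name made up for illustration, not an existing Vortex or arrow-rs API, and the CUDA branch is only simulated on the host:

```rust
// Hypothetical sketch: none of these names come from Vortex or arrow-rs.

/// Where the exported buffers should live.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExportTarget {
    Host,
    Cuda,
}

/// Stand-in for an exported Arrow array; a real one would carry buffers.
#[derive(Debug, PartialEq, Eq)]
pub struct Exported {
    pub target: ExportTarget,
    pub values: Vec<i32>,
}

/// Context handed to every export call, so behaviour can vary per target.
pub struct ExportCtx {
    pub target: ExportTarget,
}

/// An "executable" conversion: the array decides how to export itself
/// given the context it is executed in.
pub trait ExecuteExport {
    fn execute(&self, ctx: &ExportCtx) -> Result<Exported, String>;
}

/// Toy run-length encoded array: (value, run length) pairs.
pub struct RunLength(pub Vec<(i32, usize)>);

impl ExecuteExport for RunLength {
    fn execute(&self, ctx: &ExportCtx) -> Result<Exported, String> {
        // Decoded on the host for both targets in this sketch; a real CUDA
        // path would presumably launch a kernel and return device buffers.
        let values = self
            .0
            .iter()
            .flat_map(|&(v, n)| std::iter::repeat(v).take(n))
            .collect();
        Ok(Exported { target: ctx.target, values })
    }
}
```

The point of threading `ExportCtx` through is that the same array can be exported differently per target without the caller knowing which encoding it holds.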
|
I think there has to be some mix; you must shift some of this into the extensions/encodings themselves |
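One way to picture that mix, as a sketch only: a central exporter asks the encoding or extension first and falls back to a generic path if it declines. `TargetType`, `ArrowExportHook`, `PlainStrings`, `VariantLike`, `export_type` and the `"example.variant"` extension name are all hypothetical and do not come from this PR:

```rust
// Hypothetical sketch: names are illustrative, not Vortex APIs.

/// Stand-in for the Arrow type a caller may ask for.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TargetType {
    Utf8,
    Extension(String),
}

/// Hook that an encoding or extension type may implement.
pub trait ArrowExportHook {
    /// Return `Some` to take over the export, or `None` to defer to the
    /// generic path. The default implementation always defers.
    fn export(&self, requested: Option<&TargetType>) -> Option<TargetType> {
        let _ = requested;
        None
    }
}

/// An encoding with no opinion about its Arrow representation.
pub struct PlainStrings;
impl ArrowExportHook for PlainStrings {}

/// An encoding that prefers a (made-up) extension type.
pub struct VariantLike;
impl ArrowExportHook for VariantLike {
    fn export(&self, requested: Option<&TargetType>) -> Option<TargetType> {
        match requested {
            // Defer when the caller explicitly asks for plain strings.
            Some(TargetType::Utf8) => None,
            _ => Some(TargetType::Extension("example.variant".to_string())),
        }
    }
}

/// Central exporter: encoding preference, then the requested type,
/// then a generic default.
pub fn export_type(
    array: &dyn ArrowExportHook,
    requested: Option<&TargetType>,
) -> TargetType {
    array
        .export(requested)
        .or_else(|| requested.cloned())
        .unwrap_or(TargetType::Utf8)
}
```

Whether the encoding should be allowed to override an explicit request, as `VariantLike` does here for anything other than `Utf8`, is a design choice this sketch does not settle.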
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Benchmarks: PolarSignals Profiling
Vortex (geomean): 0.984x ➖
datafusion / vortex-file-compressed (0.984x ➖, 3↑ 3↓)
|
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.996x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.986x ➖, 1↑ 0↓)
datafusion / parquet (0.968x ➖, 3↑ 0↓)
datafusion / arrow (0.992x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.999x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.989x ➖, 0↑ 0↓)
duckdb / parquet (0.973x ➖, 4↑ 2↓)
duckdb / duckdb (0.975x ➖, 1↑ 0↓)
|
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.990x ➖, 0↑ 1↓)
datafusion / vortex-compact (0.972x ➖, 1↑ 0↓)
datafusion / parquet (0.997x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.007x ➖, 1↑ 1↓)
duckdb / vortex-compact (1.000x ➖, 0↑ 0↓)
duckdb / parquet (1.002x ➖, 0↑ 0↓)
|
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.018x ➖, 0↑ 4↓)
datafusion / vortex-compact (1.031x ➖, 0↑ 3↓)
datafusion / parquet (1.011x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.060x ➖, 0↑ 22↓)
duckdb / vortex-compact (1.030x ➖, 1↑ 3↓)
duckdb / parquet (1.035x ➖, 0↑ 7↓)
duckdb / duckdb (0.999x ➖, 1↑ 1↓)
|
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.996x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.976x ➖, 0↑ 0↓)
datafusion / parquet (0.995x ➖, 1↑ 0↓)
datafusion / arrow (0.980x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.922x ➖, 5↑ 0↓)
duckdb / vortex-compact (0.958x ➖, 0↑ 0↓)
duckdb / parquet (0.963x ➖, 1↑ 0↓)
duckdb / duckdb (0.971x ➖, 0↑ 0↓)
|
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.960x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.980x ➖, 0↑ 0↓)
duckdb / parquet (0.978x ➖, 0↑ 0↓)
|
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.891x ➖, 4↑ 0↓)
datafusion / vortex-compact (1.098x ➖, 0↑ 1↓)
datafusion / parquet (0.985x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.909x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.935x ➖, 1↑ 0↓)
duckdb / parquet (0.900x ➖, 0↑ 0↓)
|
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.982x ➖, 5↑ 2↓)
datafusion / parquet (0.967x ➖, 4↑ 0↓)
duckdb / vortex-file-compressed (0.972x ➖, 5↑ 4↓)
duckdb / parquet (0.988x ➖, 1↑ 0↓)
duckdb / duckdb (1.046x ➖, 0↑ 5↓)
|
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
|
The plan was to just have the vortex-arrow crate depend on all the code for the builtin arrow types, not sure there's anything else to change really? |
|
Summary
This is an experiment in making arrow export more extensible for Vortex extension types, adding support for arrow's extension types, and allowing encodings to define their own preferred way to export to arrow (given some inputs).
My main motivation here is wanting to be able to export the upcoming parquet-variant encoding into a canonical arrow extension type.
API Changes
todo!()
Testing
todo!()