This repository was archived by the owner on May 9, 2024. It is now read-only.
Joins fail in heterogeneous mode #577
Open
Description
To reproduce the join failure with enable_heterogeneous=True, run the following Python script:
import pyhdk
config = pyhdk.buildConfig(enable_heterogeneous=True,
force_heterogeneous_distribution=False)
pyhdk.initLogger(log_severity="DEBUG2")
storage = pyhdk.storage.ArrowStorage(1)
data_mgr = pyhdk.storage.DataMgr(config)
data_mgr.registerDataProvider(storage)
calcite = pyhdk.sql.Calcite(storage, config)
executor = pyhdk.Executor(data_mgr, config)
table_1_name = "taxi"
table_2_name = "numbers"
# Assuming you are in hdk/examples/
storage.importCsvFile("../omniscidb/Tests/ArrowStorageDataFiles/taxi_sample_header.csv", table_1_name, pyhdk.storage.TableOptions(5))
storage.importCsvFile("../omniscidb/Tests/ArrowStorageDataFiles/numbers_header.csv", table_2_name, pyhdk.storage.TableOptions(2))
# Perfect hash table OneToOne
sql = f"SELECT * FROM {table_1_name} a JOIN {table_1_name} b ON a.trip_id = b.trip_id"
# sql = f"SELECT * FROM {table_1_name} a JOIN {table_2_name} b ON a.trip_id = b.col1"
ra = calcite.process(sql)
rel_alg_executor = pyhdk.sql.RelAlgExecutor(executor, storage, data_mgr, ra)
res = rel_alg_executor.execute().to_arrow()
A simple hash join on the primary key, done via a perfect hash table, crashes. The gdb output is not very informative about the location, but the crash seems to be nullptr-related:
Thread 1 "python3" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352709952) at ./nptl/pthread_kill.c:44
...
#8 0x00007fff5ce5f0ae in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) ()
from .../anaconda3/envs/omnisci-dev/lib/jvm/lib/server/libjvm.so
#9 0x00007fff5cd06d69 in JVM_handle_linux_signal ()
from .../anaconda3/envs/omnisci-dev/lib/jvm/lib/server/libjvm.so
#10 <signal handler called>
#11 0x00007ffff7e3e354 in ?? ()
#12 0x00007ffff7e3e2a0 in ?? ()
#13 0x0000555557cd67d0 in ?? ()
#14 0x0000555557be5060 in ?? ()
#15 0x0000000000000002 in ?? ()
#16 0x0000000000000001 in ?? ()
#17 0x0000000000000000 in ?? ()
The last lines of the DEBUG2 logs show (shortened):
2023-07-12T08:08:41.525587 1 1573099 0 0 NvidiaKernel.cpp:154 Generated GPU binary code size: 469152 bytes
2023-07-12T08:08:41.526317 1 1573099 0 0 Execute.cpp:2881 Launching 1 kernels for query on:
2023-07-12T08:08:41.526336 1 1573099 0 0 Execute.cpp:2883 0 &CPU.
2023-07-12T08:08:41.526623 2 1573099 0 0 Execute.cpp:3476 bool(ra_exe_unit.union_all)=false ra_exe_unit.input_descs=(InputDescriptor(table_id(1),nest_level(0)) InputDescriptor(table_id(1),nest_level(1))) ra_exe_unit.input_col_descs=(InputColDescriptor(table_id=1, nest_level=0, col_id=1000) InputColDescriptor(table_id=1, nest_level=0, col_id=1001) ... MANY COL IDS ...
ra_exe_unit.scan_limit=0 num_rows=((20 20)) frag_offsets=((0 0)) query_exe_context->query_buffers_->num_rows_=-1 query_exe_context->query_mem_desc_.getEntryCount()=1 device_id=0 outer_table_id=-1 scan_limit=-1 start_rowid=0 num_tables=2
The same error happens when joining with a different table (use the commented-out sql statement in the script).
Interestingly, in an IPython notebook the kernel sometimes crashes with the following final log lines instead:
2023-07-12T08:20:16.399125 W 1577588 0 0 Backend.cpp:833 Failed to generate PTX: NVVM IR ParseError: generatePTX: invalid redefinition of function 'pi'
declare double @pi();
^
. Switching to CPU execution target.
2023-07-12T08:20:16.399529 F 1577588 0 0 RelAlgExecutor.cpp:433 Check failed: co.device_type == ExecutorDeviceType::GPU
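Because the failure aborts the whole Python process (and takes the notebook kernel with it), it may help to run the reproduction script in a child process and inspect how it exited. The following is a minimal sketch using only the Python standard library. The helper name run_isolated is hypothetical and not part of pyhdk, and the abort in the usage example only stands in for the real script.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0) -> dict:
    """Run `code` in a fresh Python interpreter and report how it exited.

    Hypothetical helper for illustration; it is not part of pyhdk.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    result = {
        "returncode": proc.returncode,
        "signal": None,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
    # On POSIX, a negative return code -N means the child was killed by signal N.
    if proc.returncode < 0:
        result["signal"] = signal.Signals(-proc.returncode).name
    return result


if __name__ == "__main__":
    # Stand-in for the reproduction script: the child aborts itself,
    # so the parent process (or notebook kernel) stays alive.
    outcome = run_isolated("import os; os.abort()")
    print(outcome["returncode"], outcome["signal"])
```

To use it with the script above, pass the script's source text as the code argument. The parent interpreter keeps running, and the captured stderr holds whatever the child wrote before it died.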