Description
To reproduce
When designing a schema for several very similar tables, there appear to be two main options in QuestDB.
Either create a separate table for each and name them accordingly:
table_1
- timestamp
- state
table_2
- timestamp
- state
Or create one table with an id symbol:
table
- id
- timestamp
- state
It was expected that, when gathering all the data, the single table with the id symbol would usually produce faster execution times, but this does not seem to be the case. When the SAMPLE BY interval is small, the ASOF JOIN across all the separate tables seems to perform significantly better. Moreover, the disparity in execution time grows as the number of tables and the size of the dataset grow. Below is a minimal reproducible set that begins to show the difference; adding more tables and data makes the difference worse. Also attached are images of the different queries run through the REST API to show the execution time differences. The timings were collected and printed via the /timings option in the REST API.
CREATE TABLE table_1 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_1
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_2 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_2
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_3 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_3
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_4 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_4
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_5 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_5
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_6 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_6
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_7 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_7
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_8 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_8
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_9 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_9
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE table_10 (
timestamp TIMESTAMP,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO table_10
SELECT
timestamp_sequence('2024-01-01T00:00:00', 1000000L) timestamp,
rnd_long() state
FROM long_sequence(86400) x;
CREATE TABLE symbol_table (
timestamp TIMESTAMP,
id SYMBOL capacity 256 CACHE,
state LONG
) timestamp (timestamp) PARTITION BY DAY WAL;
INSERT INTO symbol_table
SELECT timestamp, CAST(1 AS SYMBOL), state FROM table_1;
INSERT INTO symbol_table
SELECT timestamp, CAST(2 AS SYMBOL), state FROM table_2;
INSERT INTO symbol_table
SELECT timestamp, CAST(3 AS SYMBOL), state FROM table_3;
INSERT INTO symbol_table
SELECT timestamp, CAST(4 AS SYMBOL), state FROM table_4;
INSERT INTO symbol_table
SELECT timestamp, CAST(5 AS SYMBOL), state FROM table_5;
INSERT INTO symbol_table
SELECT timestamp, CAST(6 AS SYMBOL), state FROM table_6;
INSERT INTO symbol_table
SELECT timestamp, CAST(7 AS SYMBOL), state FROM table_7;
INSERT INTO symbol_table
SELECT timestamp, CAST(8 AS SYMBOL), state FROM table_8;
INSERT INTO symbol_table
SELECT timestamp, CAST(9 AS SYMBOL), state FROM table_9;
INSERT INTO symbol_table
SELECT timestamp, CAST(10 AS SYMBOL), state FROM table_10;
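With the tables above populated, the two approaches being compared can be sketched roughly as follows. These are hypothetical query shapes for illustration only, not the exact queries from the attached images: the 1s interval, the column aliases, and the choice of three tables are all assumptions.

```sql
-- Hypothetical query shapes; the exact queries behind the reported timings
-- are in the attached images and may differ from these.

-- Option 1: separate tables, aligned with ASOF JOIN on the designated timestamp.
-- Only three tables are shown; the same pattern would extend to table_10.
SELECT
    t1.timestamp,
    t1.state AS state_1,
    t2.state AS state_2,
    t3.state AS state_3
FROM table_1 t1
ASOF JOIN table_2 t2
ASOF JOIN table_3 t3;

-- Option 2: single table keyed by the id symbol, aggregated with SAMPLE BY.
-- The 1s interval is an assumed "small time frame".
SELECT
    timestamp,
    id,
    last(state) AS state
FROM symbol_table
SAMPLE BY 1s;
```

Running each statement through the REST API with timings enabled, as described above, should give execution times that can be compared as more tables and rows are added.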
QuestDB version:
8.1.0
OS, in case of Docker specify Docker and the Host OS:
Ubuntu 22.04.4 LTS Docker
File System, in case of Docker specify Host File System:
ext4
Full Name:
Nick Guerra
Affiliation:
Kronus Engineering
Have you followed Linux, MacOs kernel configuration steps to increase Maximum open files and Maximum virtual memory areas limit?
- Yes, I have
Additional context


