Skip to content

Introduce filters for MemCS (C API) #5311

@TarantoolBot

Description

@TarantoolBot

Product: Tarantool EE
Since: 3.5

Now one can pre-calculate aggregates for MemCS primary index and use
them for data-skipping when reading data with Arrow Stream. The only
supported aggregate is minmax - data-skipping based on minimal and
maximal values of each block.

Size of each data-skipping block is 8192 rows. Insertions into the middle
of index are not yet implemented if then index has aggregates. On
deletion from the middle, the row is simply removed from the block, no
rebalancing happens; the block will be deleted when it becomes empty.

Note that aggregates themselves don't support deletions, so when a row is
deleted or updated, it will be still accounted in aggregates.

One can declare aggregates when creating an index:

local s = box.schema.create_space('test', {
    engine = 'memcs', field_count = 3,
    format = {{'a', 'uint64'}, {'b', 'uint64'}, {'c', 'uint64'}},
})
s:create_index('pk', {aggregates = {
    {type = 'minmax', field = 2, name = 'minmax_2'},
    {type = 'minmax', field = 3, name = 'minmax_3'},
}})

Note that declaration of aggregates for non-empty space is not supported.
Option aggregates is an array of maps, each map must have keys type
determining which type of aggregate it is and key name - a unique
aggregate name.

The only supported aggregate type is minmax and it requires one
additional key field - a field to aggregate minimal and maxumal values
over.

Then, one can use the new option filter of box_index_arrow_stream:

/* Create arrow stream options. */
box_arrow_options_t *options = box_arrow_options_new();

/*
 * Set filter `[2] > 42` so some rows with `[2] <= 42` can be skipped.
 */
box_filter_t filter;
filter->type = FILTER_TYPE_GE;
filter->field_no = 1; /* 0-indexation. */

char buf[16];
mp_encode_uint(buf, 42);
filter->value = buf;

box_arrow_options_set_filter(options, &filter);

/* Create stream. */
struct ArrowArrayStream stream;
int rc = box_index_arrow_stream(space_id, index_id, field_count, fields,
				key, key + key_size, options, &stream);

This arrow stream will read only those blocks which have at least one row
with row[2] > 42 values. On other words, we will try to skip all rows
with row[2] <= 42. The filtration will be based on minmax.

If there wouldn't be minmax over the second column, the filter would be
simply ignored - no error would be raised and no data would be skipped.

Note that data-skipping only can skip some blocks for the sake of query
optimization, so rows that don't satisfy the filter can be returned
(and, more likely, they will be returned) - user still need to filter
them manually.

Regarding filter lifetime, it's not copied under the hood, hence, it
shouldn't be deallocated until the stream is released.
Requested by @drewdzzz in https://github.com/tarantool/tarantool-ee/commit/7d2c5f1c6ea13b503273f63e5d576dcbf08c088f.

Metadata

Metadata

Assignees

No one assigned

    Labels

    3.5EEEE functionalityc api[Area] Related to the Tarantool C API referencememcs

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions