Skip to content

Shrinked QEB#241

Draft
gshigin wants to merge 121 commits into
ppfrom
shrinked_encoding_bimap
Draft

Shrinked QEB#241
gshigin wants to merge 121 commits into
ppfrom
shrinked_encoding_bimap

Conversation

@gshigin
Copy link
Copy Markdown
Collaborator

@gshigin gshigin commented Feb 19, 2026

No description provided.

@gshigin gshigin requested a review from cherep58 February 19, 2026 15:38
@gshigin gshigin self-assigned this Feb 19, 2026
Comment thread pp/bare_bones/bitset.h Outdated
Comment thread pp/bare_bones/bitset.h Outdated
Comment thread pp/bare_bones/bitset.h Outdated
Comment thread pp/bare_bones/bitset.h
Comment thread pp/bare_bones/tests/bitset_tests.cpp Outdated
Comment thread pp/bare_bones/tests/bitset_tests.cpp
Comment thread pp/bare_bones/snug_composite.h Outdated
Comment thread pp/bare_bones/snug_composite.h Outdated
Comment thread pp/bare_bones/snug_composite.h Outdated
Comment thread pp/wal/hashdex/go_head.h Outdated
Comment thread pp/series_index/queryable_encoding_bimap.h Outdated
Comment thread pp/series_index/queryable_encoding_bimap.h Outdated
Comment thread pp/series_index/queryable_encoding_bimap.h Outdated
Comment thread pp/series_index/queryable_encoding_bimap.h Outdated
Comment thread pp/series_index/queryable_encoding_bimap.h Outdated
Comment thread pp/series_index/sorting_index.h Outdated
const auto from_find = lss_.find(ls3);

// Assert
EXPECT_GE(new_id, shrink_boundary);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of EXPECT/ASSERT_GE/LT use EXPECT/ASSERT with concrete value

Lss loaded;
load_lss_with_single_series(ls, loaded, target_id);
const bool was_active_before = target_id < loaded.added_series().size() && loaded.added_series()[target_id];
EXPECT_FALSE(was_active_before);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXPECT in arrange section.

And what this test testing?

EXPECT_EQ(2U, lss_copy.series_count());
}

TEST_F(BimapCopierFixture, CopyFindsCopiedSeries) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CopyFindsCopiedSeries and CopyKeepsSeriesCount can be merged into single test

copier2.copy_added_series_and_build_indexes();

// Act
[[maybe_unused]] const auto ls_id = lss_copy_of_copy.find_or_emplace(label_set3);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[[maybe_unused]] const auto ls_id = lss_copy_of_copy.find_or_emplace(label_set3);
std::ignore = lss_copy_of_copy.find_or_emplace(label_set3);

EXPECT_EQ(ls3_, lss_[2]);
}

TEST_F(BimapShrinkFixture, ShrunkStateSeriesCountMatchesStorage) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can be merged with FinalizeShrinkMapsSeriesInOrder

EXPECT_EQ(3U, lss_.max_item_index());
}

TEST_F(BimapShrinkFixture, IndexWriteContextDedupesSymbolsAfterFullShrink) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IndexWriteContextDedupesSymbolsAfterFullShrink and IndexWriteContextResolvesRefsAfterFullShrink are test for index writer

Comment thread pp/series_index/prometheus/tsdb/index/index_write_context.h Outdated
public:
#pragma pack(push, 1)
struct ExportSymbolId {
SymbolSource source{SymbolSource::kCurrent};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most likely, name_id will not be a big value and you can use the highest bit for source


template <class Callback>
void for_each_symbol(Callback&& callback) const {
for (uint32_t symbol_ref = 0; symbol_ref < symbols_.size(); ++symbol_ref) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that loop with pointer to symbols_ item or loop on iterators will be more efficient


void collect_current_symbols_from_shrunk_series(std::vector<SymbolIdWithView>& symbol_ids) const {
for (uint32_t ls_id = lss_.shrink_state().shift; ls_id < lss_.max_item_index(); ++ls_id) {
if (lss_.symbol_source_for_series(ls_id) != SymbolSource::kCurrent) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any added item after shrink will be with the type SymbolSource::kCurrent. It's redundant if


using SymbolReference = PromPP::Prometheus::tsdb::index::SymbolReference;
using SymbolReferencesMap = phmap::flat_hash_map<ExportSymbolId, SymbolReference, ExportSymbolIdHasher>;
using SymbolIdWithView = std::pair<std::string_view, ExportSymbolId>;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::string_view add for each symbol 16 bytes overhead. It may be expensive and should be benchmarked

symbols_.clear();
symbol_refs_.clear();

std::vector<SymbolIdWithView> symbol_ids;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

symbol_ids and methods collect_empty_symbol, collect_current_symbols, collect_snapshot_symbols can be extracted into separate class SymbolIdsCollector

QueryableEncodingBimap lss_;
SymbolReferencesMap symbol_references_;
LabelIndicesWriter<QueryableEncodingBimap, decltype(stream_)> label_indices_writer{lss_, symbol_references_, stream_writer_};
std::optional<series_index::prometheus::tsdb::index::IndexWriteContext<QueryableEncodingBimap>> index_write_context_;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::optional not good pattern for unit tests. Instead of std::optional create variables in the place where they are used: TEST_P(LabelIndicesWriterFixture, Test).
Or you can manually call index_write_context_.rebuild() after lss filling

@@ -49,16 +54,18 @@ class LabelIndicesWriterFixture : public testing::TestWithParam<LabelIndicesWrit

std::ostringstream stream;
StreamWriter<decltype(stream_)> stream_writer{&stream};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use stream_writer_

QueryableEncodingBimap lss_;
SymbolReferencesMap symbol_references_;
SeriesReferencesMap series_references_;
std::optional<series_index::prometheus::tsdb::index::IndexWriteContext<QueryableEncodingBimap>> index_write_context_;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use index_write_context_.rebuild instead of std::optional

QueryableEncodingBimap lss_;
SymbolReferencesMap symbol_references_;
SeriesReferencesMap series_references_;
std::optional<series_index::prometheus::tsdb::index::IndexWriteContext<QueryableEncodingBimap>> index_write_context_;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use index_write_context_.rebuild instead of std::optional

shrunk_lss_.finalize_copy_and_shrink(*snapshot_copy_, dst_src_ids_mapping);
}

static std::string write_index(const Lss& lss) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You remove IndexWriter::write method, but implement it here. Better restore IndexWriter::write and use it with std::stringstream

}
}

std::shared_ptr<Lss> load_lss_from_file() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need std::unique_ptr?

return {};
}

void assert_added_series_suffix_marked(const Lss& lss, uint32_t begin_id) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's benchmark, not unit test. We should not validate data

Comment thread pp/wal/decoder.h
decoder_.process_segment(processor);
decoder_.process_segment([this, &processor](Primitives::LabelSetID ls_id, Primitives::Timestamp timestamp, double value) PROMPP_LAMBDA_INLINE {
if constexpr (requires(LSS& label_set, uint32_t id) { label_set.mark_active(id); }) {
label_set_.mark_active(ls_id);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we should do this in decoder?

Comment thread pp/wal/decoder.h
decoder_.process_segment([&last_ls_id, &samples, &container](uint32_t ls_id, int64_t ts, double v) PROMPP_LAMBDA_INLINE {
decoder_.process_segment([this, &last_ls_id, &samples, &container](uint32_t ls_id, int64_t ts, double v) PROMPP_LAMBDA_INLINE {
if constexpr (requires(LSS& label_set, uint32_t id) { label_set.mark_active(id); }) {
label_set_.mark_active(ls_id);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we should do this in GenericDecoder?

Comment thread pp/wal/wal.h Outdated
@@ -0,0 +1,120 @@
#include <benchmark/benchmark.h>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments also apply to the queryable_encoding_bimap_resolve_benchmark.cpp

Comment thread pp/primitives/snug_composites_filaments.h Outdated
Comment thread pp/primitives/snug_composites_filaments.h Outdated
Comment thread pp/entrypoint/head/lss.h
class ShrinkAwareSnapshotLSS : public SnapshotLSS {
public:
explicit ShrinkAwareSnapshotLSS(const QueryableEncodingBimap& lss)
: SnapshotLSS(lss), shrink_state_(lss.shrink_state().clone_for_snapshot()), added_series_(lss.added_series()) {}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inside ShrinkState we copy BareBones::Vector<uint32_t>. Use BareBones::SharedVector

Comment thread pp/entrypoint/head/lss.h Outdated
Comment thread pp/entrypoint/head_wal.cpp Outdated
Comment thread pp/go/cppbridge/primitives_lss_test.go
Comment thread pp/go/cppbridge/primitives_lss_test.go Outdated
Comment thread pp/go/cppbridge/primitives_lss_test.go Outdated
Comment thread pp/go/cppbridge/primitives_lss_test.go Outdated
}

// makeRotatedLSS creates a rotated LSS from the original LSS.
func (s *RotateLSSSuite) makeRotatedLSS(lName string) *rotatedLSS {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify method. Fill fields of rotatedLSS manually, not by code

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but filling in the structure is a constructor.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but don't use loops, slice.Sort, if i%2 == 0 and other logic blocks. Only manually filling

u-veles-a and others added 6 commits May 5, 2026 14:26
Signed-off-by: Alexandr Yudin <57181751+u-veles-a@users.noreply.github.com>
Signed-off-by: Alexandr Yudin <57181751+u-veles-a@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants