Parquet: add row range constraints cache #7478
siddarth2810 wants to merge 8 commits into cortexproject:master
Implement a parquet-common RowRangesForConstraintsCache backed by Cortex cache backends. Encode row ranges into a compact binary format and hash cache keys so they are safe for memcached and other shared cache backends.
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
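The two ideas in this commit message, a compact binary encoding for row ranges and hashed cache keys, can be illustrated with a minimal sketch. This is not the PR's code: the `RowRange` struct, the varint layout, the `prr` prefix, and the choice of SHA-256 are all assumptions made for illustration. Hashing matters because memcached keys are limited to 250 bytes and may not contain whitespace or control characters, while a raw key built from label matchers can violate both.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// RowRange is a hypothetical stand-in for the row range type used by
// parquet-common: a start row and a number of rows.
type RowRange struct {
	From  int64
	Count int64
}

// encodeRowRanges writes the number of ranges, then each (From, Count)
// pair, as unsigned varints. Small values take one byte each.
func encodeRowRanges(rs []RowRange) []byte {
	buf := make([]byte, 0, binary.MaxVarintLen64*(1+2*len(rs)))
	buf = binary.AppendUvarint(buf, uint64(len(rs)))
	for _, r := range rs {
		buf = binary.AppendUvarint(buf, uint64(r.From))
		buf = binary.AppendUvarint(buf, uint64(r.Count))
	}
	return buf
}

// decodeRowRanges reverses encodeRowRanges and rejects truncated input.
func decodeRowRanges(b []byte) ([]RowRange, error) {
	n, read := binary.Uvarint(b)
	if read <= 0 {
		return nil, errors.New("row ranges: bad length prefix")
	}
	b = b[read:]
	// Each range needs at least two bytes, so a larger count means the
	// payload is corrupt; this also bounds the allocation below.
	if n > uint64(len(b))/2 {
		return nil, errors.New("row ranges: length prefix exceeds payload")
	}
	out := make([]RowRange, 0, n)
	for i := uint64(0); i < n; i++ {
		from, r1 := binary.Uvarint(b)
		if r1 <= 0 {
			return nil, errors.New("row ranges: truncated From")
		}
		b = b[r1:]
		count, r2 := binary.Uvarint(b)
		if r2 <= 0 {
			return nil, errors.New("row ranges: truncated Count")
		}
		b = b[r2:]
		out = append(out, RowRange{From: int64(from), Count: int64(count)})
	}
	return out, nil
}

// hashKey maps an arbitrary raw key to a fixed-length, hex-only key.
// The prefix and SHA-256 are assumptions, not the PR's actual scheme.
func hashKey(prefix, rawKey string) string {
	sum := sha256.Sum256([]byte(rawKey))
	return prefix + ":" + hex.EncodeToString(sum[:])
}

func main() {
	in := []RowRange{{From: 0, Count: 10}, {From: 100, Count: 5}}
	enc := encodeRowRanges(in)
	out, err := decodeRowRanges(enc)
	fmt.Println(len(enc), out, err)
	fmt.Println(hashKey("prr", `block 01/shard 0/{job="a b"}`))
}
```

A fixed-width encoding would be simpler to decode, but varints keep typical entries small, which matters when many short range lists are stored in a shared cache.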
Add bucket-store config for the parquet row ranges cache and create the backend cache for the parquet queryable.
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
friedrichg
left a comment
Direction looks good to me. Clean implementation following existing cache patterns. Only minor nit: the cfg.MultiLevel.BackFillTTL = cfg.TTL line in RegisterFlagsWithPrefix is dead code since CreateParquetRowRangesCache re-sets it.
I see, will remove that dead code. Thank you so much :)
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
friedrichg
left a comment
Great work.
If I can ask for anything, I would ask for an integration test, maybe by modifying TestQuerierWithBlocksStorageRunningInSingleBinaryMode.
… querier
Assert that the cache misses on the first query and hits when querying the same stored series again.
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
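The behavior this test asserts, a miss on the first query and a hit on the repeat, can be sketched with a small in-memory cache that counts lookups. This is only an illustration: `countingCache` and `getOrCompute` are hypothetical names, and the real integration test checks Cortex metrics rather than struct fields.

```go
package main

import "fmt"

// countingCache is a hypothetical in-memory cache that records how many
// lookups were served from the cache and how many had to be computed.
type countingCache struct {
	data   map[string][]byte
	hits   int
	misses int
}

func newCountingCache() *countingCache {
	return &countingCache{data: make(map[string][]byte)}
}

// getOrCompute returns the cached value for key, or calls compute,
// stores the result, and counts the lookup as a miss.
func (c *countingCache) getOrCompute(key string, compute func() []byte) []byte {
	if v, ok := c.data[key]; ok {
		c.hits++
		return v
	}
	c.misses++
	v := compute()
	c.data[key] = v
	return v
}

func main() {
	c := newCountingCache()
	// Stand-in for the expensive row-range filtering of a parquet block.
	compute := func() []byte { return []byte{0x01} }

	c.getOrCompute("same-query", compute) // first query: miss
	c.getOrCompute("same-query", compute) // repeated query: hit
	fmt.Println("misses:", c.misses, "hits:", c.hits)
}
```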
Done! Thanks a lot :)
Weird, the test passed when I tested locally. Looking into it.
With WaitMissingMetrics, the e2e helper keeps retrying on missing cache hits/misses until timeout.
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Apologies. It was a mistake in my testing workflow.
friedrichg
left a comment
Thanks for the integration test
What this PR does:
When Cortex filters parquet blocks during queries, it computes which row ranges match the query constraints. This PR adds a cache for those computed row ranges, so a repeated query with the same constraints can reuse the cached result.
The cache is wired into both Querier and Store Gateway parquet query paths.
Which issue(s) this PR fixes:
Fixes #7139
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
- docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags