Interval tree for managing segment metadata in memory#19138
Interval tree for managing segment metadata in memory#19138pirvtech wants to merge 20 commits intoapache:masterfrom
Conversation
…g a given interval. Using it for finding segments loaded from cache.
…n finding lower and higher entries
|
Hi – thanks. Do you have benchmarks for this? |
I will add the benchmarks. |
kfaraz
left a comment
There was a problem hiding this comment.
Thanks for putting this together, @pirvtech!
I have often felt the need for such a datastructure too, even outside the VIT (VersionedIntervalTimeline).
Thoughts on perf
- I doubt if a single query is really going to benefit from this change since doing a
contains()or anoverlaps()check on say 25k intervals (which is a fairly large number of intervals for a typical Druid cluster) would not be very compute intensive.- You can think of 25k intervals as roughly 3 years worth of HOUR-granularity data.
- But in high concurrency, this would still be beneficial since the VIT does all of its computations inside a giant lock. The shorter we hold the lock, the better.
- Either way, we should add benchmarks as @jtuglu1 suggested for queries as well as VIT itself.
Notes on the implementation
I have left some inline comments.
Along with that, the approach would be much cleaner if you do something like this instead:
- Add an
IntervalNavigableMap<T>interface which extendsNavigableMap<Interval, T>. This interface should have the following new methods:entriesContaining(Interval interval)entriesOverlapping(Interval interval)entriesMatching(Interval interval)(Is this really needed?)- Alternatively, instead of methods that return a sub map, you could have methods that return a matching entry set.
- Instead of
HashMapandNavigableMap, the VIT class should use this new interface. - Add a class
IntervalTreeMap<T> implements IntervalNavigableMap<T> extends TreeMap<Interval, T>and provides default implementations for the new methods.- For example, for the
entriesContaining()method, we return the whole map.
- For example, for the
- Add a new class
FastSearchIntervalMap<T>which performs the optimised search. - Based on value of the
fastSearchflag passed to constructor of VIT, choose the implementation of the map. - This would ensure that there are minimal changes to
VITand we can easily swap out different map implementations.
| } | ||
| }; | ||
|
|
||
| private static final Comparator<Interval> INTERVAL_BY_START = new Comparator<>() |
There was a problem hiding this comment.
Can't we just reuse the existing comparators INTERVAL_BY_START_THEN_END and INTERVAL_BY_END_THEN_START?
| private static IntervalTreeMatchMode intervalTreeMatchMode = IntervalTreeMatchMode.NONE; | ||
|
|
||
| static { | ||
| String mode = System.getProperty("experimental.timeline.intervalTreeMatchMode"); |
There was a problem hiding this comment.
- Please don't use system properties here directly. Instead pass a flag into the constructor of
VersionedIntervalTimelineindicating whether the improved datastructures should be used or not. - Property name should be more like
druid.segment.timeline.fastIntervalSearchwith possible valuestrueandfalse. (I don't think we need 3 modes). - The class creating the timeline should pass in the correct value for the flag.
| } | ||
| } | ||
|
|
||
| { |
There was a problem hiding this comment.
Please move this field initialization inside the constructor.
| private enum IntervalTreeMatchMode | ||
| { | ||
| NONE, | ||
| ENTRIES_ONLY(Capability.ENTRIES), |
There was a problem hiding this comment.
Is there a specific case when we would want to support improved search on entries only and not queries?
I think just 2 modes should suffice: ALL or NONE
Segment metadata stored in memory of the Historicals, is used when looking up segments that match an interval for query and segment loading purposes. Currently this is a serial scan that goes through all segments metadata in ascending start time order to find the right matching segments.
This changes introduces an Interval Tree as a more efficient way to store segment metadata in memory, to speed up searches for segments, that can potentially cut down search times from O(n) to O(logn).
Core changes to file processing/src/main/java/org/apache/druid/timeline/VersionedIntervalTimeline.java
Interval tree implementation in file processing/src/main/java/org/apache/druid/timeline/IntervalTree.java
Documentation comments have been included in important files and sections of code. Unit tests added.
This has been reviewed internally and is in a production Druid cluster.