|
| 1 | +// Copyright 2024 The LevelDB-Go and Pebble Authors. All rights reserved. Use |
| 2 | +// of this source code is governed by a BSD-style license that can be found in |
| 3 | +// the LICENSE file. |
| 4 | + |
| 5 | +// Package base defines fundamental types used across Pebble, including keys, |
| 6 | +// iterators, etc. |
| 7 | +// |
| 8 | +// # Iterators |
| 9 | +// |
| 10 | +// The [InternalIterator] interface defines the iterator interface implemented |
| 11 | +// by all iterators over point keys. Internal iterators are composed to form an |
| 12 | +// "iterator stack," resulting in a single internal iterator (see mergingIter in |
| 13 | +// the pebble package) that yields a merged view of the LSM. |
| 14 | +// |
| 15 | +// The SeekGE and SeekPrefixGE positioning methods take a set of flags |
| 16 | +// [SeekGEFlags] allowing the caller to provide additional context to iterator |
| 17 | +// implementations. The TrySeekUsingNext flag is set when the caller has |
| 18 | +// knowledge that no action has been performed to move this iterator beyond the |
| 19 | +// first key that would be found if this iterator were to honestly do the |
| 20 | +// intended seek. This allows a class of optimizations where an internal |
| 21 | +// iterator may avoid a full naive repositioning if the iterator is already |
| 22 | +// at a proximate position. This also means every caller (including intermediary |
| 23 | +// internal iterators within the iterator stack) must preserve this |
| 24 | +// relationship. |
| 25 | +// |
| 26 | +// For example, if a range deletion deletes the remainder of a prefix, the |
| 27 | +// merging iterator may be able to elide a SeekPrefixGE on level iterators |
| 28 | +// beneath the range deletion. However, in doing so, a TrySeekUsingNext flag |
| 29 | +// passed by the merging iterator's client no longer transitively holds for |
| 30 | +// subsequent seeks of child level iterators in all cases. The merging iterator |
| 31 | +// assumes responsibility for ensuring that TrySeekUsingNext is propagated to its |
| 32 | +// constituent iterators only when valid. |
| 33 | +// |
| 34 | +// Description of TrySeekUsingNext mechanics across the iterator stack: |
| 35 | +// |
| 36 | +// As the top-level entry point of user seeks, the [pebble.Iterator] is |
| 37 | +// responsible for detecting when consecutive seeks move monotonically forward. |
| 38 | +// It saves seek keys and compares consecutive seek keys to decide whether to |
| 39 | +// propagate the TrySeekUsingNext flag to its [InternalIterator]. |
| 40 | +// |
| 41 | +// The [pebble.Iterator] also has its own TrySeekUsingNext optimization in |
| 42 | +// SeekGE: Above the [InternalIterator] interface, the [pebble.Iterator]'s |
| 43 | +// SeekGE method detects consecutive seeks to monotonically increasing keys and |
| 44 | +// examines the current key. If the iterator is already positioned appropriately |
| 45 | +// (at a key ≥ the seek key), it elides the entire seek of the internal |
| 46 | +// iterator. |
| 47 | +// |
| 48 | +// The pebble mergingIter does not perform any TrySeekUsingNext optimization |
| 49 | +// itself, but it must preserve the TrySeekUsingNext contract in its calls to |
| 50 | +// its child iterators because it passes the TrySeekUsingNext flag as-is to its |
| 51 | +// child iterators. It can do this because it always translates calls to its |
| 52 | +// SeekGE and SeekPrefixGE methods as equivalent calls to every child iterator. |
| 53 | +// However, there are a few subtleties: |
| 54 | +// |
| 55 | +// - In some cases the calls made to child iterators may only be equivalent |
| 56 | +// within the context of the iterator's visible sequence number. For example, |
| 57 | +// if a range deletion tombstone is present on a level, seek keys propagated |
| 58 | +// to lower-levelled child iterators may be adjusted without violating the |
| 59 | +// transitivity of the TrySeekUsingNext flag and its invariants so long as |
| 60 | +// the mergingIter is always reading state at the same visible sequence |
| 61 | +// number. |
| 62 | +// - The mergingIter takes care to avoid ever advancing a child iterator that's |
| 63 | +// already positioned beyond the current iteration prefix. |
| 64 | +// - When propagating TrySeekUsingNext to its child iterators, the mergingIter |
| 65 | +// must propagate it to all child iterators or none. This is required because |
| 66 | +// of the mergingIter's handling of range deletions. Unequal application of |
| 67 | +// TrySeekUsingNext may cause range deletions that have already been skipped |
| 68 | +// over in a level to go unseen, despite being relevant to other levels that |
| 69 | +// do not use TrySeekUsingNext. |
| 70 | +// |
| 71 | +// The pebble levelIter makes use of the TrySeekUsingNext flag to avoid a naive |
| 72 | +// seek within the level's B-Tree of files. When TrySeekUsingNext is passed by |
| 73 | +// the caller, the relevant key must fall within the current file or a later |
| 74 | +// file. The search space is reduced from (-∞,+∞) to (current file, +∞). If the |
| 75 | +// current file's bounds overlap the key, the levelIter propagates the |
| 76 | +// TrySeekUsingNext to the current sstable iterator. If the levelIter must |
| 77 | +// advance to a new file, it drops the flag because the new file's sstable |
| 78 | +// iterator is still unpositioned. |
| 79 | +// |
| 80 | +// In-memory iterators arenaskl.Iterator and batchskl.Iterator make use of the |
| 81 | +// TrySeekUsingNext flag, attempting a fixed number of Nexts before falling back |
| 82 | +// to performing a seek using skiplist structures. |
| 83 | +// |
| 84 | +// The sstable iterators use the TrySeekUsingNext flag to avoid naive seeks |
| 85 | +// through a table's index structures. See the long comment in |
| 86 | +// sstable/reader_iter.go for more details: |
| 87 | +// - If an iterator is already exhausted, either because there are no |
| 88 | +// subsequent point keys or because the upper bound has been reached, the |
| 89 | +// iterator uses TrySeekUsingNext to avoid any repositioning at all. |
| 90 | +// - Otherwise, a TrySeekUsingNext flag causes the sstable Iterator to Next |
| 91 | +// forward a capped number of times, stopping as soon as a key ≥ the seek key |
| 92 | +// is discovered. |
| 93 | +// - The sstable iterator does not always position itself in response to a |
| 94 | +// SeekPrefixGE even when TrySeekUsingNext()=false, because bloom filters may |
| 95 | +// indicate the prefix does not exist within the file. The sstable iterator |
| 96 | +// takes care to remember when it didn't position itself, so that a |
| 97 | +// subsequent seek using TrySeekUsingNext does NOT try to reuse the current |
| 98 | +// iterator position. |
| 99 | +package base |
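
The package comment describes in-memory and sstable iterators stepping forward a bounded number of times when TrySeekUsingNext is set, and only then falling back to a full seek. A minimal sketch of that idea over a sorted slice follows. Everything in it is hypothetical and for illustration only: `sliceIter`, `maxNexts`, the `fullSeeks` counter, and the boolean parameter (standing in for [SeekGEFlags]) are assumptions, not Pebble's types or its actual limits.

```go
package main

import "sort"

// maxNexts caps how many forward steps a flagged seek may take before it
// falls back to a full search. The value is arbitrary (an assumption).
const maxNexts = 4

// sliceIter is a hypothetical forward iterator over sorted keys. It stands
// in for an in-memory iterator and is not Pebble's implementation.
type sliceIter struct {
	keys      []string // sorted ascending
	pos       int      // current index; len(keys) means exhausted
	fullSeeks int      // counts fallbacks to a full search (demo only)
}

// SeekGE positions the iterator at the first key >= key and reports whether
// such a key exists. When trySeekUsingNext is true, the caller promises the
// iterator is not positioned beyond the first key >= key.
func (it *sliceIter) SeekGE(key string, trySeekUsingNext bool) (string, bool) {
	if trySeekUsingNext {
		for steps := 0; ; steps++ {
			if it.pos >= len(it.keys) {
				// Already exhausted: under the caller's promise, no key
				// >= key remains, so no repositioning is needed.
				return "", false
			}
			if it.keys[it.pos] >= key {
				return it.keys[it.pos], true
			}
			if steps == maxNexts {
				break // step budget spent; fall back to a full search
			}
			it.pos++
		}
	}
	// Full seek: binary search over the whole key space.
	it.fullSeeks++
	it.pos = sort.SearchStrings(it.keys, key)
	if it.pos >= len(it.keys) {
		return "", false
	}
	return it.keys[it.pos], true
}
```

With keys "a" through "j", an unflagged seek to "b" followed by a flagged seek to "d" is served by stepping forward twice, whereas a flagged seek from "d" to "j" exhausts the step budget and falls back to the binary search. The cap keeps the worst case of a flagged seek close to that of an unflagged one.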