Contributor

@gf2121 gf2121 commented May 26, 2025

This PR encodes ScoreDoc#score and ScoreDoc#doc into a single comparable long and uses a LongHeap instead of HitQueue. This seems to help noticeably when I increase topN to 1000 (mikemccand/luceneutil#357).
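To make the idea concrete, here is a minimal sketch of packing a score and a doc ID into one long whose natural ordering matches hit competitiveness. The class and method names are illustrative assumptions, not the code in this PR. It assumes scores are non-negative and not NaN (so their raw IEEE 754 bits sort like the floats themselves) and that, on equal scores, the lower doc ID is the more competitive hit.

```java
/** Hypothetical helper, for illustration only; not the class added by this PR. */
final class ScoreDocCodec {
  private ScoreDocCodec() {}

  /** Packs (doc, score) so that a larger long means a more competitive hit. */
  static long encode(int doc, float score) {
    // Assumption: score >= 0 and not NaN, so the raw bits compare like the floats.
    long scoreBits = Float.floatToRawIntBits(score);
    // Invert the doc ID so that, on equal scores, the smaller doc ID gives the larger code.
    long docBits = Integer.MAX_VALUE - doc;
    return (scoreBits << 32) | docBits;
  }

  /** Recovers the score from the high 32 bits. */
  static float score(long code) {
    return Float.intBitsToFloat((int) (code >>> 32));
  }

  /** Recovers the doc ID from the low 32 bits. */
  static int doc(long code) {
    return Integer.MAX_VALUE - (int) code;
  }
}
```

Because the whole hit is one primitive long, a heap of such codes can compare entries with a single `<` instead of comparing two fields on heap-allocated objects.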

Luceneutil (Baseline contains #14709.)

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                       CountTerm    14783.92      (5.0%)    14552.83      (5.3%)   -1.6% ( -11% -    9%) 0.340
             CountFilteredPhrase       22.48      (2.7%)       22.22      (2.2%)   -1.1% (  -5% -    3%) 0.146
                  CountOrHighMed      157.96      (2.5%)      156.91      (2.3%)   -0.7% (  -5% -    4%) 0.375
             CountFilteredOrMany       10.64      (3.8%)       10.60      (3.9%)   -0.4% (  -7% -    7%) 0.737
                        SpanNear        5.93      (4.7%)        5.92      (4.4%)   -0.3% (  -8% -    9%) 0.853
                         Respell       80.44      (2.5%)       80.29      (2.7%)   -0.2% (  -5% -    5%) 0.822
          CountFilteredOrHighMed       49.06      (3.7%)       48.98      (4.4%)   -0.2% (  -7% -    8%) 0.893
                  FilteredPhrase       21.91      (2.2%)       21.88      (2.3%)   -0.1% (  -4% -    4%) 0.847
             CountFilteredIntNRQ       27.50      (2.0%)       27.48      (2.4%)   -0.1% (  -4% -    4%) 0.934
                   TermMonthSort     1768.20      (1.9%)     1768.68      (2.4%)    0.0% (  -4% -    4%) 0.968
                IntervalsOrdered        5.78      (2.7%)        5.78      (2.5%)    0.0% (  -5% -    5%) 0.968
                     CountOrMany       11.93      (4.8%)       11.93      (3.3%)    0.0% (  -7% -    8%) 0.976
                 CountOrHighHigh       88.76      (2.9%)       88.80      (1.4%)    0.0% (  -4% -    4%) 0.955
                 CountAndHighMed      122.70      (3.1%)      122.75      (2.8%)    0.0% (  -5% -    6%) 0.962
                     CountPhrase        5.99      (2.1%)        5.99      (2.2%)    0.1% (  -4% -    4%) 0.911
                CountAndHighHigh       83.78      (2.7%)       83.85      (2.4%)    0.1% (  -4% -    5%) 0.914
         CountFilteredOrHighHigh       39.81      (3.2%)       39.85      (3.8%)    0.1% (  -6% -    7%) 0.927
                AndMedOrHighHigh       29.61      (3.5%)       29.65      (4.9%)    0.1% (  -7% -    8%) 0.913
                          IntNRQ       46.08      (0.5%)       46.17      (0.4%)    0.2% (   0% -    1%) 0.187
                  FilteredIntNRQ       45.99      (0.6%)       46.13      (0.4%)    0.3% (   0% -    1%) 0.066
                  FilteredOrMany        5.15      (1.1%)        5.17      (1.1%)    0.3% (  -1% -    2%) 0.339
                     OrStopWords        7.79      (4.3%)        7.83      (4.1%)    0.6% (  -7% -    9%) 0.669
                          Phrase       10.74      (2.3%)       10.81      (2.8%)    0.6% (  -4% -    5%) 0.478
                      TermDTSort      126.34      (2.3%)      127.25      (2.9%)    0.7% (  -4% -    6%) 0.383
     FilteredAnd2Terms2StopWords      125.37      (2.8%)      126.34      (2.6%)    0.8% (  -4% -    6%) 0.369
                   TermTitleSort       40.28      (3.3%)       40.60      (2.7%)    0.8% (  -4% -    6%) 0.402
                        Wildcard      117.87      (1.6%)      118.94      (2.4%)    0.9% (  -3% -    4%) 0.158
                 FilteredPrefix3      168.60      (1.4%)      170.29      (1.7%)    1.0% (  -2% -    4%) 0.043
             FilteredOrStopWords       17.73      (2.0%)       17.91      (2.4%)    1.0% (  -3% -    5%) 0.137
                         Prefix3      182.05      (1.4%)      184.02      (1.8%)    1.1% (  -2% -    4%) 0.032
             CombinedAndHighHigh        8.58      (1.9%)        8.67      (1.6%)    1.1% (  -2% -    4%) 0.050
                 AndHighOrMedMed       25.76      (1.3%)       26.05      (1.5%)    1.1% (  -1% -    3%) 0.011
            FilteredAndStopWords       19.85      (4.1%)       20.08      (3.3%)    1.2% (  -5% -    8%) 0.318
              FilteredOrHighHigh       26.08      (2.3%)       26.42      (2.2%)    1.3% (  -3% -    5%) 0.062
             FilteredAndHighHigh       26.33      (3.1%)       26.68      (2.9%)    1.3% (  -4% -    7%) 0.161
                      OrHighHigh       20.34      (3.2%)       20.61      (3.1%)    1.3% (  -4% -    7%) 0.183
                          OrMany        6.65      (4.1%)        6.74      (2.5%)    1.4% (  -4% -    8%) 0.188
               FilteredAnd3Terms      172.63      (2.8%)      175.23      (3.0%)    1.5% (  -4% -    7%) 0.101
               TermDayOfYearSort      141.72      (2.5%)      143.88      (2.9%)    1.5% (  -3% -    7%) 0.076
                    SloppyPhrase        2.29      (4.3%)        2.33      (3.3%)    1.5% (  -5% -    9%) 0.208
              CombinedOrHighHigh        9.11      (2.2%)        9.25      (1.6%)    1.6% (  -2% -    5%) 0.010
                    AndStopWords        8.17      (5.6%)        8.30      (4.8%)    1.6% (  -8% -   12%) 0.324
                DismaxOrHighHigh       29.65      (2.9%)       30.16      (2.7%)    1.7% (  -3% -    7%) 0.054
                FilteredOr3Terms       41.95      (1.9%)       42.73      (1.6%)    1.9% (  -1% -    5%) 0.001
              CombinedAndHighMed       37.06      (1.9%)       37.82      (1.9%)    2.0% (  -1% -    5%) 0.001
                     AndHighHigh       17.39      (3.3%)       17.77      (3.1%)    2.2% (  -4% -    8%) 0.033
              FilteredAndHighMed       72.46      (3.1%)       74.05      (3.0%)    2.2% (  -3% -    8%) 0.021
             And2Terms2StopWords       94.17      (3.1%)       96.65      (3.0%)    2.6% (  -3% -    9%) 0.006
      FilteredOr2Terms2StopWords       54.83      (2.3%)       56.39      (1.9%)    2.8% (  -1% -    7%) 0.000
                       And3Terms      114.17      (3.6%)      117.80      (3.0%)    3.2% (  -3% -   10%) 0.002
                      AndHighMed       58.30      (2.8%)       60.19      (2.7%)    3.2% (  -2% -    9%) 0.000
               FilteredOrHighMed       70.82      (2.4%)       73.18      (2.3%)    3.3% (  -1% -    8%) 0.000
               CombinedOrHighMed       38.14      (2.1%)       39.47      (1.7%)    3.5% (   0% -    7%) 0.000
                    FilteredTerm       60.98      (2.1%)       63.12      (2.0%)    3.5% (   0% -    7%) 0.000
                          Fuzzy1       88.06      (1.7%)       91.43      (1.9%)    3.8% (   0% -    7%) 0.000
                 DismaxOrHighMed       59.63      (2.5%)       61.92      (2.1%)    3.8% (   0% -    8%) 0.000
                          Fuzzy2       87.49      (1.7%)       90.93      (1.8%)    3.9% (   0% -    7%) 0.000
                      OrHighRare       66.60      (8.3%)       69.40     (10.5%)    4.2% ( -13% -   24%) 0.159
                          IntSet      610.93      (3.4%)      638.41      (2.7%)    4.5% (  -1% -   11%) 0.000
              Or2Terms2StopWords       74.73      (2.5%)       78.61      (2.6%)    5.2% (   0% -   10%) 0.000
                       OrHighMed       71.79      (2.8%)       75.57      (2.7%)    5.3% (   0% -   11%) 0.000
                        Or3Terms       85.34      (2.6%)       90.24      (2.6%)    5.8% (   0% -   11%) 0.000
                    CombinedTerm       20.60      (2.4%)       21.90      (1.7%)    6.3% (   2% -   10%) 0.000
                      DismaxTerm      329.03      (6.5%)      413.15      (3.5%)   25.6% (  14% -   38%) 0.000
                            Term      305.57      (6.9%)      387.75      (5.6%)   26.9% (  13% -   42%) 0.000

This branch was checked out from #14709, so there is some unrelated diff. I'll keep this as a draft until #14709 is merged.

Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

Contributor

@jpountz jpountz left a comment

The idea makes sense to me. I wouldn't have expected such a speedup when collecting the top 1000 hits; it's great. It's a bit ugly to pass null as a HitQueue in the constructor of TopScoreDocCollector. Can we only keep method signatures on TopDocsCollector and move the current impls to some other class?

Contributor Author

gf2121 commented May 27, 2025

Thanks for the suggestion!

> It's a bit ugly to pass null as a HitQueue in the constructor of TopScoreDocCollector. Can we only keep method signatures on TopDocsCollector and move the current impls to some other class?

FWIW, passing a null PQ is mentioned in TopDocsCollector's Javadoc:

* Extending classes can override any of the methods to provide their own implementation, as well as
* avoid the use of the priority queue entirely by passing null to {@link
* #TopDocsCollector(PriorityQueue)}. In that case however, you might want to consider overriding
* all methods, in order to avoid a NullPointerException.

I agree it is ugly to copy the large topDocs(int start, int howMany), so I was looking at extracting the PQ logic into a protected method so that the override can be minimal. But I'm not sure we should touch this public API class (public method signatures would not change, but extending classes could be affected), since passing null does not seem to break the original intention of the design. In case you had not noticed the Javadoc, I'd like to ask for your suggestion again :)

Contributor

jpountz commented May 27, 2025

I wasn't aware of this indeed. OK for passing null then; I agree that there may be subclasses that rely on this API in the wild.

@gf2121 gf2121 marked this pull request as ready for review May 28, 2025 06:52
@github-actions github-actions bot added this to the 10.3.0 milestone May 28, 2025
* Score is a non-negative float, so we use floatToRawIntBits instead of {@link
* NumericUtils#floatToSortableInt}. We do not assert score >= 0 here, to allow passing a negative
* float to indicate a totally non-competitive hit, e.g. {@link #LEAST_COMPETITIVE_CODE}.
*/
Contributor

This is a bit too subtle for my taste. Could we either not deal with negative scores at all, or use NumericUtils#floatToSortableInt? FWIW, I believe that LEAST_COMPETITIVE_CODE could use a score of 0 since Integer.MAX_VALUE is not an allowed doc ID?

Contributor Author

> FWIW, I believe that LEAST_COMPETITIVE_CODE could use a score of 0 since Integer.MAX_VALUE is not an allowed doc ID?

The issue with using encode(Integer.MAX_VALUE, 0f) is that topScore will be decoded as 0:

topScore = DocScoreEncoder.toIntScore(topCode);

Then a score of 0 will not be competitive, as we are comparing scores only.

I agree this contract is a bit tricky; maybe we should just use NumericUtils#floatToSortableInt.
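The difference between the two conversions discussed here can be sketched in plain Java. The class below is a hypothetical stand-in written for illustration and does not call Lucene. It applies the usual sign-aware bit flip that makes float bits sortable as signed ints; whether NumericUtils#floatToSortableInt is implemented exactly this way is an assumption.

```java
/** Hypothetical illustration of a sortable-float transform; not Lucene code. */
final class SortableFloatBits {
  private SortableFloatBits() {}

  /** Maps a float to an int whose signed ordering matches the float ordering. */
  static int toSortableInt(float value) {
    int bits = Float.floatToIntBits(value);
    // For negative floats (sign bit set), flip the 31 magnitude bits so that
    // more negative values map to smaller ints; non-negative floats are unchanged.
    return bits ^ ((bits >> 31) & 0x7fffffff);
  }

  /** Inverse of toSortableInt; the transform is its own inverse. */
  static float fromSortableInt(int sortable) {
    return Float.intBitsToFloat(sortable ^ ((sortable >> 31) & 0x7fffffff));
  }
}
```

With raw bits, every negative float still sorts below every non-negative float (the sign bit makes the int negative), which is why a negative sentinel score can work with floatToRawIntBits. However, two negative floats compare in reverse order, which is the subtlety being discussed. With the sortable transform, a sentinel such as Float.NEGATIVE_INFINITY sorts below every real score without relying on that special case.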

  final int docBase = context.docBase;
  final ScoreDoc after = this.after;
- final float afterScore;
+ final int afterScore;
Contributor

Does it actually help to track scores as sortable ints rather than floats? I had assumed we'd only encode them if they're between the after score and the top score?

Contributor Author

I made them ints because all scores in the collector are only used for comparisons, and the only computation is Math.nextUp, which needs to convert the float to an int anyway :)

I'll revert these changes, as this does not seem to make sense to you.
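The remark about Math.nextUp can be illustrated with a short sketch. For a float that is non-negative, finite, and not -0.0f, adding 1 to its raw int bits yields the bits of the next representable float, so a score tracked as int bits could be bumped with a plain increment. The class and method names below are hypothetical.

```java
/** Hypothetical illustration; not code from this PR. */
final class NextUpViaBits {
  private NextUpViaBits() {}

  /**
   * Returns the next float above score by incrementing its raw bits.
   * Assumption: score is >= +0.0f, finite, and not -0.0f.
   */
  static float nextUp(float score) {
    return Float.intBitsToFloat(Float.floatToRawIntBits(score) + 1);
  }
}
```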

Contributor

@jpountz jpountz left a comment

I like it this way, it looks simpler.

For reference, I was just looking at profiles of the Tantivy benchmark (https://tantivy-search.github.io/bench/) and it tends to run queries with fewer hits in total compared to Lucene's nightly benchmarks, so the overhead of reordering the heap is higher (since a higher percentage of hits trigger a reordering of the heap). So I expect this change to help, and not only for term queries.

- super(new HitQueue(numHits, true));
+ super(null);
+ this.heap = new LongHeap(numHits);
+ IntStream.range(0, numHits).forEach(_ -> heap.push(DocScoreEncoder.LEAST_COMPETITIVE_CODE));
Contributor

Nit: a for loop would be a bit more readable?

Contributor

Maybe in a follow-up we can add a new ctor parameter to LongHeap so that it accepts an initial value.

Contributor Author

The old ctor used to create an empty heap so I added a new ctor.
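As a rough sketch of what a pre-filled heap could look like, the class below is a minimal fixed-size min-heap of longs whose constructor takes an initial value. It is a hypothetical stand-in, not Lucene's LongHeap, and the constructor signature and method names are assumptions made for illustration. It assumes size >= 1.

```java
import java.util.Arrays;

/** Hypothetical fixed-size min-heap of longs; not Lucene's LongHeap. */
final class TopKLongHeap {
  // heap[0] always holds the smallest (least competitive) code.
  private final long[] heap;

  /** Creates a full heap in which every slot holds initialValue. Assumes size >= 1. */
  TopKLongHeap(int size, long initialValue) {
    heap = new long[size];
    // An array of identical values already satisfies the heap property,
    // so no sifting is needed after filling.
    Arrays.fill(heap, initialValue);
  }

  long top() {
    return heap[0];
  }

  /** Replaces the smallest code with value and sifts it down to restore heap order. */
  long updateTop(long value) {
    heap[0] = value;
    int i = 0;
    while (true) {
      int left = 2 * i + 1;
      int right = left + 1;
      int smallest = i;
      if (left < heap.length && heap[left] < heap[smallest]) smallest = left;
      if (right < heap.length && heap[right] < heap[smallest]) smallest = right;
      if (smallest == i) break;
      long tmp = heap[i];
      heap[i] = heap[smallest];
      heap[smallest] = tmp;
      i = smallest;
    }
    return heap[0];
  }

  /** Keeps code only if it beats the current minimum; returns whether it was kept. */
  boolean offer(long code) {
    if (code <= heap[0]) return false;
    updateTop(code);
    return true;
  }
}
```

With a constructor like this, the collector would not need to push the sentinel numHits times; every collected hit becomes a single comparison against top() followed, when competitive, by one updateTop call.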

Contributor

jpountz commented Jun 4, 2025

Since the top-k heap appears to be a bottleneck for some queries, we could look into whether a radix heap would perform better than a binary heap in a follow-up.

Contributor Author

gf2121 commented Jun 5, 2025

> Since the top-k heap appears to be a bottleneck for some queries, we could look into whether a radix heap would perform better than a binary heap in a follow-up.

+1, that would be an interesting exploration!

I rechecked the benchmark against the newest main (topN=1000); I plan to merge soon.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                    AndStopWords       11.62      (8.3%)       11.35     (11.4%)   -2.4% ( -20% -   19%) 0.582
             FilteredAndHighHigh       20.84      (5.6%)       20.50      (6.6%)   -1.6% ( -13% -   11%) 0.532
            FilteredAndStopWords       17.33      (5.1%)       17.07      (6.5%)   -1.5% ( -12% -   10%) 0.549
                       CountTerm    14997.16      (3.8%)    14843.44      (2.4%)   -1.0% (  -6% -    5%) 0.447
          CountFilteredOrHighMed       49.85      (4.3%)       49.49      (4.4%)   -0.7% (  -9% -    8%) 0.696
                     OrStopWords       11.79      (7.9%)       11.71     (12.6%)   -0.7% ( -19% -   21%) 0.881
              FilteredAndHighMed       64.40      (4.8%)       64.00      (5.8%)   -0.6% ( -10% -   10%) 0.784
         CountFilteredOrHighHigh       40.31      (3.5%)       40.10      (3.7%)   -0.5% (  -7% -    6%) 0.728
                 CountAndHighMed      123.05      (2.3%)      122.52      (3.3%)   -0.4% (  -5% -    5%) 0.722
             CountFilteredPhrase       22.36      (1.7%)       22.27      (1.7%)   -0.4% (  -3% -    3%) 0.553
             CountFilteredOrMany       10.61      (3.9%)       10.57      (4.7%)   -0.4% (  -8% -    8%) 0.816
                CountAndHighHigh       84.09      (2.8%)       83.81      (2.8%)   -0.3% (  -5% -    5%) 0.779
                         Respell       80.92      (2.7%)       80.67      (3.0%)   -0.3% (  -5% -    5%) 0.804
                  CountOrHighMed      157.52      (1.2%)      157.12      (2.3%)   -0.3% (  -3% -    3%) 0.750
                 CountOrHighHigh       88.38      (2.7%)       88.25      (3.4%)   -0.1% (  -6% -    6%) 0.911
                  FilteredOrMany        5.16      (1.6%)        5.16      (1.1%)   -0.0% (  -2% -    2%) 0.961
     FilteredAnd2Terms2StopWords      129.21      (2.8%)      129.21      (4.2%)   -0.0% (  -6% -    7%) 1.000
                   TermTitleSort       42.03      (2.0%)       42.07      (2.0%)    0.1% (  -3% -    4%) 0.922
                       And3Terms      123.28      (5.1%)      123.64      (7.0%)    0.3% ( -11% -   13%) 0.911
                  FilteredPhrase       21.66      (1.8%)       21.76      (1.8%)    0.5% (  -3% -    4%) 0.532
                     AndHighHigh       30.49      (3.0%)       30.66      (2.9%)    0.6% (  -5% -    6%) 0.653
               TermDayOfYearSort      143.65      (2.3%)      144.57      (2.5%)    0.6% (  -4% -    5%) 0.535
                      TermDTSort      126.92      (2.3%)      128.15      (1.5%)    1.0% (  -2% -    4%) 0.249
                          OrMany        6.74      (8.5%)        6.81     (10.1%)    1.0% ( -16% -   21%) 0.806
                          Phrase       10.82      (1.5%)       10.92      (3.1%)    1.0% (  -3% -    5%) 0.346
                        Wildcard      117.72      (1.8%)      118.92      (1.7%)    1.0% (  -2% -    4%) 0.176
             CombinedAndHighHigh        9.55      (2.3%)        9.68      (1.3%)    1.4% (  -2% -    5%) 0.087
                FilteredOr3Terms       40.65      (2.9%)       41.23      (2.1%)    1.4% (  -3% -    6%) 0.181
               FilteredAnd3Terms      180.65      (3.2%)      183.26      (2.8%)    1.4% (  -4% -    7%) 0.259
              FilteredOrHighHigh       25.16      (1.8%)       25.54      (1.4%)    1.5% (  -1% -    4%) 0.024
             And2Terms2StopWords      113.61      (4.1%)      115.35      (6.9%)    1.5% (  -9% -   13%) 0.527
                     CountOrMany       11.61      (5.8%)       11.79      (5.8%)    1.5% (  -9% -   14%) 0.534
                          Fuzzy2       82.73      (3.8%)       84.06      (3.9%)    1.6% (  -5% -    9%) 0.328
                   TermMonthSort     1752.96      (2.5%)     1781.58      (2.2%)    1.6% (  -2% -    6%) 0.104
                 AndHighOrMedMed       28.81      (2.5%)       29.31      (2.8%)    1.7% (  -3% -    7%) 0.130
              CombinedOrHighHigh        9.75      (2.5%)        9.94      (1.9%)    1.9% (  -2% -    6%) 0.043
                    CombinedTerm       20.68      (2.6%)       21.09      (1.6%)    2.0% (  -2% -    6%) 0.029
                      AndHighMed       81.85      (3.4%)       83.52      (3.7%)    2.0% (  -4% -    9%) 0.178
                AndMedOrHighHigh       35.62      (2.0%)       36.35      (2.0%)    2.0% (  -1% -    6%) 0.017
             FilteredOrStopWords       17.06      (3.1%)       17.41      (2.3%)    2.1% (  -3% -    7%) 0.076
                          Fuzzy1       83.47      (4.2%)       85.27      (4.3%)    2.2% (  -6% -   11%) 0.236
                 FilteredPrefix3      166.77      (1.5%)      170.92      (1.4%)    2.5% (   0% -    5%) 0.000
                      OrHighHigh       32.88      (2.3%)       33.75      (4.0%)    2.6% (  -3% -    9%) 0.057
              CombinedAndHighMed       42.14      (1.8%)       43.29      (1.2%)    2.7% (   0% -    5%) 0.000
                         Prefix3      179.69      (1.6%)      184.81      (1.6%)    2.8% (   0% -    6%) 0.000
                    FilteredTerm       61.22      (1.9%)       63.11      (2.5%)    3.1% (  -1% -    7%) 0.001
      FilteredOr2Terms2StopWords       53.51      (2.5%)       55.30      (1.6%)    3.3% (   0% -    7%) 0.000
               FilteredOrHighMed       68.63      (1.8%)       70.99      (1.4%)    3.4% (   0% -    6%) 0.000
                DismaxOrHighHigh       40.56      (1.8%)       42.14      (2.7%)    3.9% (   0% -    8%) 0.000
               CombinedOrHighMed       41.70      (2.3%)       43.46      (1.9%)    4.2% (   0% -    8%) 0.000
                        Or3Terms       88.16      (6.0%)       93.47      (9.5%)    6.0% (  -8% -   22%) 0.076
                 DismaxOrHighMed       71.65      (1.5%)       76.25      (2.3%)    6.4% (   2% -   10%) 0.000
              Or2Terms2StopWords       85.14      (4.8%)       91.00      (8.2%)    6.9% (  -5% -   20%) 0.016
                       OrHighMed      105.19      (1.9%)      115.17      (3.2%)    9.5% (   4% -   14%) 0.000
                      OrHighRare       67.59     (10.4%)       76.68      (7.8%)   13.4% (  -4% -   35%) 0.001
                      DismaxTerm      325.96      (6.5%)      407.04      (5.6%)   24.9% (  12% -   39%) 0.000
                            Term      297.83      (6.8%)      373.92      (7.9%)   25.5% (  10% -   43%) 0.000

@gf2121 gf2121 merged commit a309bd6 into apache:main Jun 6, 2025
7 checks passed
Contributor

jpountz commented Jun 7, 2025

There seem to be some good speedups with topN=100 already: https://benchmarks.mikemccandless.com/Term.html.
