Skip to content

Conversation

thecoop
Copy link
Contributor

@thecoop thecoop commented May 23, 2025

Related to #11338 (comment), this replaces some simple uses of PriorityQueue subclasses with ones using a Comparator. This reduces the boilerplate required, the number of classes defined, and moves the definition of the PriorityQueue close to where it is used, so the relevant code is more self-contained and easier to understand.

entries.writeVInt(numFiles);
// first put files in ascending size order so small files fit more likely into one page
SizedFileQueue pq = new SizedFileQueue(numFiles);
List<SizedFile> files = new ArrayList<>(numFiles);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one doesn't need to use a PriorityQueue at all

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting should also be faster here. The Lucene PQ is only needed when items in queue should fall out at botton when its full. In other cases the ln-overhaed is larger than a simple sorting at end.

Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

1 similar comment
Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

Missed a comparator
@thecoop thecoop force-pushed the priority-queue-comparators branch from 1f07b03 to f53ae10 Compare May 23, 2025 16:43
Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

@thecoop
Copy link
Contributor Author

thecoop commented May 23, 2025

Next stage is to run some perf tests to check this doesn't introduce a slowdown due to changes to the virtual method calls in key hotspots.

Copy link
Contributor

@uschindler uschindler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Why not make PQ final and pass a comparator or maybe a simpler functional interface directly in ctor. This would reduce the number of subclasses more and we only have a single PQ which is final on top.

@uschindler
Copy link
Contributor

This reduces the boilerplate required, the number of classes defined,...

But it does not reduce the number of loaded classes. The lambdas are still hidden classes.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, it looks more like idiomatic Java. Can you check if OrdinalMapBenchmark reports any slowdown? This is the only benchmark that could be impacted by this change that I can think of.

It's slightly awkward that a PriorityQueue can be defined either using a comparator or overriding lessThan, let's only support providing a comparator as Uwe suggested?

Copy link
Contributor

github-actions bot commented Jun 3, 2025

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

Copy link
Contributor

@dweiss dweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I feel tempted to make PriorityQueue final and then move the comparator to a final field. Why bother having both ways of defining it?

@thecoop
Copy link
Contributor Author

thecoop commented Jun 3, 2025

This is only the first pass, covering single-line comparators. There are several non-trivial implementations that will require more detailed refactoring (if that is indeed desirable) - using a Comparator requires differentiating equality, which lessThan does not require. There's also some that are part of the public API. I have a subsequent PR covering multi-line comparators, and then several to follow for non-trivial implementations.

@dweiss
Copy link
Contributor

dweiss commented Jun 3, 2025

Good point, thanks.

@dweiss
Copy link
Contributor

dweiss commented Jun 3, 2025

We could also declare a custom interface with just lessThan and then provide an adapter in usingComparator... this would dodge the need for implementing full equality checks across the board and you could still use jdk's Comparator utilities, where things are simple.

@thecoop
Copy link
Contributor Author

thecoop commented Jun 3, 2025

That's a good idea - I'll look at that when I'm moving onto the more complex implementations

@uschindler
Copy link
Contributor

We should really only have one final PQ implementation to make sure most of the code can be inlined.

@uschindler
Copy link
Contributor

Basically the lessThan interface should be marked @FunctionalInterface and wherever possible the lessTahn check should be a lambda or method ref.

@thecoop
Copy link
Contributor Author

thecoop commented Jun 3, 2025

That's the desired end state, yes. There's still plenty of implementations to get through before we can make it final.

@thecoop thecoop requested a review from dweiss June 5, 2025 10:07
Copy link
Contributor

@dweiss dweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to see what the benchmarks show after this change. These pq's are sort of crucial to performance. I wonder if we can observe any change after subclasses are replaced by delegation.

In general, I'm fine to making baby steps although it could also be developed into a larger patch that adds this additional "smaller-than" interface and makes the PQ final. I wonder what others think.

@thecoop
Copy link
Contributor Author

thecoop commented Jun 5, 2025

I wasn't able to run OrdinalMapBenchmark, but a few lucene-bench runs show no large changes - several percent +/- either way

@jpountz
Copy link
Contributor

jpountz commented Jun 5, 2025

For reference, there is an interesting PR at #14714 that replaces PriorityQueue with a LongHeap when computing top-k hits by score. Results suggest that the PriorityQueue is not a bottleneck at all when k=100, but is a bottleneck for our fastest queries when k=1000. This PR doesn't change computing top-k hits by field, so tasks TermMonthSort, TermTitleSort, TermDayOfYearSort or TermDTSort would still use a PriorityQueue to track top hits.

@jpountz
Copy link
Contributor

jpountz commented Jun 5, 2025

I wasn't able to run OrdinalMapBenchmark

I ran it on my machine. Got the following results on main (2 runs):

id: 651.43124 msec
name: 955.36439 msec
country_code: 0.37753 msec
time_zone: 1.03004 msec
id: 605.82902 msec
name: 923.61305 msec
country_code: 0.30129 msec
time_zone: 0.87078 msec

And the following results on your branch (2 runs as well):

id: 641.80320 msec
name: 945.66611 msec
country_code: 0.31938 msec
time_zone: 0.89374 msec
id: 631.90609 msec
name: 970.12461 msec
country_code: 0.32877 msec
time_zone: 0.91463 msec

So it looks fine to me as far as this benchmark is concerned.

@jpountz
Copy link
Contributor

jpountz commented Jun 5, 2025

In general, I'm fine to making baby steps although it could also be developed into a larger patch that adds this additional "smaller-than" interface and makes the PQ final. I wonder what others think.

My intuition was that this "smaller-than" interface idea you brought up would be relatively easy to introduce to all our PriorityQueue impls, so it would be nice if we could skip the intermediate state where there are two ways to define how heap entries are ordered. But maybe I'm underestimating the effort.

@thecoop
Copy link
Contributor Author

thecoop commented Jun 5, 2025

There are some non-trivial subclasses (TopDocs.MergeSortQueue, MultiTermsEnum.TermMergeQueue, NearSpansUnordered.SpanTotalLengthEndPositionWindow), some queues that themselves are subclassed (FieldValueHitQueue, TopOrdAndNumberQueue), and some part of the public API (HitQueue, FieldValueHitQueue, SuggestWordQueue). This PR handles the trivial cases, I was going to create subsequent PRs to cover the more complex refactorings so they can be considered separately, especially those that change the public API.

Copy link
Contributor

@dweiss dweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with merging, given the explanation. Please add a note to changes and migration? Seems like something worth mentioning.

@jpountz
Copy link
Contributor

jpountz commented Jun 5, 2025

I'm fine with merging too.

@github-actions github-actions bot added this to the 11.0.0 milestone Jun 5, 2025
@dweiss dweiss merged commit 9afcfdb into apache:main Jun 5, 2025
7 checks passed
@dweiss
Copy link
Contributor

dweiss commented Jun 5, 2025

Merged. Thank you, @thecoop

@thecoop thecoop deleted the priority-queue-comparators branch June 5, 2025 15:35
@uschindler
Copy link
Contributor

Thanks! Cool first step!

@jpountz
Copy link
Contributor

jpountz commented Jun 27, 2025

FYI it looks like this change may be responsible for the slowdown on the CountOrMany and TermTitleSort tasks: https://benchmarks.mikemccandless.com/2025.06.05.18.05.16.html.

If you're curious what classes these tasks relate too, it's BooleanScorer for CountOrMany and TermOrdValComparator for TermTitleSort, which both rely on PriorityQueues.

@dweiss
Copy link
Contributor

dweiss commented Jun 28, 2025

When I look at it over time it doesn't seem like there is a consistent slowdown though?
https://benchmarks.mikemccandless.com/TermTitleSort.html

@jpountz
Copy link
Contributor

jpountz commented Jun 28, 2025

Capture d’écran 2025-06-28 à 08 49 55

It looks consistent to me? The slowdown happened between annotations IN and IO, after this change was merged. Performance then improved with IP, but this was an unrelated change, it looks like this change introduced extra overhead and we'd get a speedup if we fix it.

Maybe it's just about finishing the change and always requiring a comparator so that we don't chain 3 virtual calls (LessThan#lessThan, Comparator#compare, Function#apply) for every comparison like today.

@dweiss
Copy link
Contributor

dweiss commented Jun 28, 2025

Ah, ok... Yes, could be, could be. It'll be tricky to optimize for whatever c2 is going to decide to do. I agree the Comparator abstraction - while very neat from source code point of view - may result in different optimizing decisions at runtime. I wonder if we implemented new lessThan interface in some places (instead of the comparator chain) it'd help.

@uschindler
Copy link
Contributor

I would definitely go away from Comparator and use the LessThan interface as the semantics are a bit different. We could supply a wrapper, but where it is simple to do (and does not need the Comparator static magic) I'd prefer a native interface.

@jpountz
Copy link
Contributor

jpountz commented Jun 28, 2025

That would work for me.

@thecoop
Copy link
Contributor Author

thecoop commented Jun 30, 2025

The last set of classes needed to do that are #14817 (comment) - I've raised #14872 for discussion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants