`ef_search` is a query-time parameter that determines how many edges of the HNSW graph are traversed to find approximate nearest neighbors. Setting this hyperparameter lets HNSW trade recall for speed at query time.

The HNSWlib implementation had some bad design decisions. `ef_search` was a property of each index, changed by calling `set_ef` on the index. Besides being cumbersome, this creates a data race in concurrent execution when multiple threads want to run queries with different `ef_search` values.

Additionally, it was not written out with the index, and when the index was loaded again it was set to `10`, creating an unnecessary footgun.

This PR fixes all of the above. It:
- changes the index parameter name to `ef_search_default_` to make its purpose clear, and renames the functions that set it accordingly.
- adds a shared mutex that is locked exclusively when writing but allows parallel reads.
- stores the default when the index is written out, and reads it back when the index is loaded.
- adds an argument to the query path that lets each query set it independently. It is passed by value, so it lives on the stack of the function call rather than on the heap. If it is not passed, the index default is read.
- updates all tests and examples.
Making this work required us to move to C++17 and raise the minimum macOS target to 10.12. Both are ancient, and we compile this for our users anyway.