Skip to content

Weighted RRF #84

@Shanorino

Description

@Shanorino

In MongoDBAtlasHybridSearchRetriever, currently it's not possible to add weights separately to 2 retrievers (in other words, it's always 50-50).

A workaround is to change vector_penalty and fulltext_penalty. However, tuning the RRF parameter k changes the shape of the reciprocal rank function, while multiplying by a weight is a simple linear scaling of the entire curve. These two approaches are not mathematically equivalent and lead to different effects on how each rank contributes to the final score.

One thing to mention is that EnsembleRetriever in Langchain also implemented the weights (instead of changing the penalty constant k).

Thus, I'd propose implementing weights to the class MongoDBAtlasHybridSearchRetriever by adding 2 more stages to the Mongo pipeline, so that it can be instantiated this way:

retriever = MongoDBAtlasHybridSearchRetriever(
vectorstore = vector_store,
search_index_name = "search_index",
top_k = 5,
fulltext_weight=0.3,
vector_weight=0.7,
fulltext_penalty = 50,
vector_penalty = 50
)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions