@kalaspuffar kalaspuffar commented Aug 18, 2025

This is a PR as per the suggestion from danny-avila/LibreChat#9102

This adds a /rerank endpoint that uses open-source models to rerank documents. The endpoint takes a query and the documents (docs) to rank against it. The request can also specify how many results to return, k, and a configuration that sets the model and keys used for the operation.

All available configuration options are documented at https://github.com/AnswerDotAI/rerankers, which this endpoint is a thin wrapper around.
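To make the request contract concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code in this PR: `rerank_documents`, `score_fn`, and `word_overlap` are hypothetical names, and the toy word-overlap scorer merely stands in for a real model from the rerankers library.

```python
from typing import Callable, Dict, List, Union


def rerank_documents(
    query: str,
    docs: List[str],
    k: int,
    score_fn: Callable[[str, str], float],
) -> List[Dict[str, Union[str, float]]]:
    """Score every document against the query and return the best k.

    score_fn is a hypothetical stand-in for the real reranker model.
    """
    scored = [{"text": doc, "score": score_fn(query, doc)} for doc in docs]
    # Highest score first.
    scored.sort(key=lambda item: item["score"], reverse=True)
    # Slicing is safe even when k exceeds the number of documents.
    return scored[:k]


def word_overlap(query: str, doc: str) -> float:
    """Toy scorer: number of lowercase words shared by query and doc."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


results = rerank_documents(
    "I love you", ["I hate you", "I really like you"], k=5, score_fn=word_overlap
)
```

Because the result is sliced after sorting, asking for k=5 with only two documents simply returns both of them.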

Test call

curl -s http://localhost:8000/rerank \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_JWT_TOKEN' \
  -d '{
    "query": "I love you",
    "docs": ["I hate you", "I really like you"],
    "k": 5
  }'

Expected response:

[{"text":"I really like you","score":-1.537894606590271},{"text":"I hate you","score":-4.30911111831665}]
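For anyone who prefers to test from Python, the same call can be sketched with the standard library alone. The URL, headers, and payload mirror the curl command above; the helper names `build_rerank_request` and `rerank` are made up for this example and are not part of the PR.

```python
import json
import urllib.request
from typing import List

RERANK_URL = "http://localhost:8000/rerank"  # same URL as the curl call


def build_rerank_request(
    query: str, docs: List[str], k: int, token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /rerank endpoint."""
    payload = json.dumps({"query": query, "docs": docs, "k": k}).encode("utf-8")
    return urllib.request.Request(
        RERANK_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def rerank(query: str, docs: List[str], k: int, token: str) -> list:
    """Send the request and decode the JSON list of {"text", "score"} items."""
    request = build_rerank_request(query, docs, k, token)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; calling rerank(...) does.
request = build_rerank_request(
    "I love you", ["I hate you", "I really like you"], 5, "YOUR_JWT_TOKEN"
)
```

With the server running locally, `rerank("I love you", ["I hate you", "I really like you"], 5, token)` should return a list shaped like the expected response above.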

I realized that sending the model configuration with each call is not the right approach: the model needs to be loaded once to improve performance. You can now configure it in the environment for the rag_api repository.

SIMPLE_RERANKER_MODEL_NAME = "mixedbread-ai/mxbai-rerank-large-v1"
SIMPLE_RERANKER_MODEL_TYPE = "cross-encoder"
#SIMPLE_RERANKER_MODEL_NAME = "ms-marco-MiniLM-L-12-v2"
#SIMPLE_RERANKER_MODEL_NAME = "flashrank"
#SIMPLE_RERANKER_MODEL_TYPE = "colbert"
SIMPLE_RERANKER_LANG = ""
SIMPLE_RERANKER_API_PROVIDER = ""
SIMPLE_RERANKER_API_KEY = ""
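The load-once idea can be sketched roughly as follows. This is an assumption about how such settings might be consumed, not the implementation in this PR: `read_reranker_settings` and `get_reranker` are hypothetical names, and only `model_type`, `lang`, and `api_key` are forwarded to `Reranker` here. How SIMPLE_RERANKER_API_PROVIDER is passed on is not shown in this thread, so it is read but left unused.

```python
import os
from functools import lru_cache
from typing import Dict, Mapping


def read_reranker_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the SIMPLE_RERANKER_* variables, defaulting to empty strings."""
    names = ("MODEL_NAME", "MODEL_TYPE", "LANG", "API_PROVIDER", "API_KEY")
    return {name.lower(): env.get(f"SIMPLE_RERANKER_{name}", "") for name in names}


@lru_cache(maxsize=1)
def get_reranker():
    """Load the model on first use and reuse the same instance afterwards."""
    # Imported lazily so this module can be imported without the dependency.
    from rerankers import Reranker

    settings = read_reranker_settings()
    # Assumption: empty optional settings are left out so the library's
    # own defaults apply. The PR may handle them differently.
    kwargs = {
        key: settings[key]
        for key in ("model_type", "lang", "api_key")
        if settings[key]
    }
    return Reranker(settings["model_name"], **kwargs)


# Example with an explicit mapping instead of the real environment.
example = read_reranker_settings(
    {
        "SIMPLE_RERANKER_MODEL_NAME": "mixedbread-ai/mxbai-rerank-large-v1",
        "SIMPLE_RERANKER_MODEL_TYPE": "cross-encoder",
    }
)
```

Caching the loader with `lru_cache(maxsize=1)` means the first request pays the model-loading cost and every later request reuses the same object, which is the performance point made above.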

@kalaspuffar
Author

The force push was due to Black linting.

All done! ✨ 🍰 ✨
1 file reformatted, 1 file left unchanged.
