[Enhancement] Add similarity threshold filter #111

Open
wants to merge 2 commits into
base: main

Conversation

dvejsada

LibreChat performs file search by looping over every document available to the respective endpoint. With many documents (e.g. a larger agent knowledge base), it returns a large set of results even when most of them are not relevant at all. This inflates the number of input tokens sent to the LLM and drives up API costs, especially with more advanced models.

This PR introduces an option to set a similarity threshold. Any result whose score is over the threshold is filtered out and not returned to LibreChat.

If the similarity threshold is not set, a default value of 1 is used, which means nothing is filtered out. Existing deployments are therefore unaffected, and this is not a breaking change.
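For illustration only, here is a minimal sketch of the filtering behaviour described above. It is not the code from this PR: the setting name `SIMILARITY_THRESHOLD`, the helper names, and the `(document, score)` result shape are all assumptions, as is the reading that scores fall in the range 0 to 1 so that a default of 1 keeps everything.

```python
import os
from typing import List, Mapping, Tuple

# Hypothetical names: the PR text does not show the real setting or helpers.
DEFAULT_THRESHOLD = 1.0  # default of 1 means nothing is filtered out

Result = Tuple[str, float]  # (document text, score); assumed shape


def load_threshold(env: Mapping[str, str] = os.environ) -> float:
    """Read the threshold from the environment, falling back to the default."""
    return float(env.get("SIMILARITY_THRESHOLD", DEFAULT_THRESHOLD))


def filter_by_threshold(results: List[Result], threshold: float) -> List[Result]:
    """Drop every result whose score is over the threshold."""
    return [(doc, score) for doc, score in results if score <= threshold]
```

With the threshold unset, `filter_by_threshold(results, load_threshold())` returns the input unchanged under these assumptions, which matches the "no breaking change" claim.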

@dvejsada
Author

Closes #109

@dvejsada
Author

Also mitigates this

@thoj

thoj commented Jan 23, 2025

I think we may also need a MAX_RESULT config option to limit the number of results returned. My use case is probably a little weird, but I want to search several hundred files. Currently file search returns up to 4 results per file, which fills my context window instantly. In my case, the best approach is probably to first filter by relevance as in #111, then return only the MAX_RESULT most relevant results.

@dvejsada
Author

I think we may also need a MAX_RESULT config option to limit the number of results returned. My use case is probably a little weird, but I want to search several hundred files. Currently file search returns up to 4 results per file, which fills my context window instantly. In my case, the best approach is probably to first filter by relevance as in #111, then return only the MAX_RESULT most relevant results.

I believe this would have to be implemented at the LibreChat (client) level. As I understand it, LibreChat sends a separate query to the RAG API for each attached file, so there is no way to limit the total output across all files at the RAG API level.
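To make the idea concrete, a caller that collects the per-file responses could apply the threshold and then keep only the best MAX_RESULT hits overall. The sketch below is hypothetical and written in Python purely for illustration (the LibreChat client is not Python): `merge_top_results`, the `per_file` mapping, and the assumption that a lower score means a more relevant chunk are not taken from either code base.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (chunk text, score); lower score = more relevant (assumption)


def merge_top_results(
    per_file: Dict[str, List[Hit]],
    threshold: float,
    max_results: int,
) -> List[Tuple[str, str, float]]:
    """Combine per-file hits, drop those over the threshold, keep the best few.

    Returns (file_id, chunk text, score) tuples ordered from most to least relevant.
    """
    candidates = [
        (score, file_id, text)
        for file_id, hits in per_file.items()
        for text, score in hits
        if score <= threshold
    ]
    # nsmallest sorts by score first because it is the leading tuple element.
    best = heapq.nsmallest(max_results, candidates)
    return [(file_id, text, score) for score, file_id, text in best]
```

Because the cap is applied after all per-file queries return, it limits the total regardless of how many files are attached, which a per-request limit on the RAG API side could not do.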

2 participants