Does it support any commercial Vector DB? #3

Open
amitkayal opened this issue Jul 4, 2023 · 2 comments
Comments

@amitkayal

Hi, I would like to use a commercial vector DB such as Pinecone to store the vectors, and I would also like to choose the model used for embedding generation. Does the project allow that flexibility? I would also like to know whether there is a file-level duplicate check, so that the same file does not get processed multiple times. Thanks

@Immortalise
Owner

Hello,

At the moment, we do not provide support for commercial Vector DBs. However, you can modify the db.py file to customize it according to your requirements.

Currently, our platform only supports the all-mpnet and CLIP models, as they are state-of-the-art embedding models. We are actively working towards integrating more diverse and flexible models.

We do have a rudimentary duplicate file detection mechanism in place. You can refer to the implementation in anything.py, specifically line 88, for more information.
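For readers who want a stricter file-level check, a common technique is to hash each file's contents and skip any file whose digest has already been seen. The following is a minimal sketch of that idea and is not the code in anything.py; `file_digest` and `SeenFiles` are hypothetical names introduced for illustration.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory."""
    sha = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


class SeenFiles:
    """Tracks content digests so identical files are processed only once."""

    def __init__(self):
        self._digests = set()

    def is_new(self, path):
        # True the first time this content is seen, False for later copies,
        # even when the copy has a different file name.
        digest = file_digest(path)
        if digest in self._digests:
            return False
        self._digests.add(digest)
        return True
```

Because the check is based on content rather than file name or path, renamed copies are caught as duplicates. To survive restarts, the digest set would need to be persisted, for example alongside the vector metadata.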

Contributions to augment our model collection / database would be greatly appreciated!

@davychxn
Contributor

@amitkayal Any updates? Have you tried it with Pinecone? What's your use case?
