arroy for vectors #2074

Open: wants to merge 27 commits into base: master

@ricopinazo commented May 2, 2025

What changes were proposed in this pull request?

Why are the changes needed?

The old approach of defining custom document search queries in Python had two problems:

  1. It used the Python host as the runtime, which could lead to deadlocks and similar issues
  2. It loaded all the vectorised graphs in the server, which didn't scale

Does this PR introduce any user-facing change? If yes is this documented?

How was this patch tested?

Are there any further changes required?

  • remove heed from deps
  • try to remove all the duplication for translating between u32 and entity id
  • double-check that having the max LMDB size set to 1TB doesn't cause the disk usage to grow for small graphs
  • have a separate module for the db abstraction
  • keep adding benchmarks, use a different query for each call in the current one
  • stop loading all graphs for the global plugins
  • Reuse EntityRef for DocumentEntity
  • re-enable save_embeddings function
  • move GqlDocument under graph folder
  • add window parameters to vector selection and vectorised graph functions in GraphQL
  • review whether I'm actually using the async_stream crate and the others
  • double-check Ctrl+C still works
  • decide proper size for the heed env
  • test the miss rate in a real-world scenario: 0.02% misses in a 100k dataset
  • remove all unwraps from cache.rs
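
For the u32 / entity id item above, one possible way to remove the duplication is to keep the translation in a single small type. This is only a minimal sketch, not the code in this PR: `EntityId`, `IdTranslator`, and the method names are hypothetical, and it assumes ids are assigned densely in insertion order.

```rust
use std::collections::HashMap;

/// Hypothetical entity identifier; the real type in this PR may differ.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum EntityId {
    Node(String),
    Edge(String, String),
}

/// Hypothetical single place that owns the u32 <-> entity id mapping.
#[derive(Default)]
struct IdTranslator {
    // Position in this vector is the u32 id of the entity.
    to_entity: Vec<EntityId>,
    // Reverse lookup from entity to its u32 id.
    to_index: HashMap<EntityId, u32>,
}

impl IdTranslator {
    /// Returns the existing u32 for `entity`, or assigns the next free one.
    fn intern(&mut self, entity: EntityId) -> u32 {
        if let Some(&id) = self.to_index.get(&entity) {
            return id;
        }
        // Panics past u32::MAX entities; a real version would return an error.
        let id = u32::try_from(self.to_entity.len()).expect("too many entities for u32 ids");
        self.to_entity.push(entity.clone());
        self.to_index.insert(entity, id);
        id
    }

    /// Translates a u32 id back to its entity, if it was ever assigned.
    fn entity(&self, id: u32) -> Option<&EntityId> {
        self.to_entity.get(id as usize)
    }

    /// Looks up the u32 id of an entity without assigning a new one.
    fn index(&self, entity: &EntityId) -> Option<u32> {
        self.to_index.get(entity).copied()
    }
}
```

With something like this, callers would go through `intern`, `entity`, and `index` instead of repeating the conversion at each call site.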
