Right now we do not have a strong abstraction in place for the indices used by the Retrieve operator.
As a result, we require the user to do some heavy lifting by writing a search_func() which queries their index and returns results to the RetrieveOp.
To reduce this burden -- and to make it easier for us to program against a standardized class -- this issue aims to add a BaseIndex class, along with sub-classes for ChromaIndex and RagatouilleIndex.
These indices will expose:
- a __str__ function, which can replace the need for index_helpers.py
- a query function, which takes a query: str | list[str] and a results_per_query: int and returns a list | list[list] with the top results_per_query results for each query.
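A rough sketch of what the proposed interface could look like. Only the names BaseIndex, `__str__`, `query`, and `results_per_query` come from this issue. The `name` attribute, the `_search_one` hook, and the toy `KeywordIndex` subclass are assumptions made up for illustration. The real ChromaIndex and RagatouilleIndex would wrap their respective libraries and are not shown here.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class BaseIndex(ABC):
    """Hypothetical base class; subclasses implement a single-query search."""

    def __init__(self, name: str):
        # `name` is an assumed attribute, used only to build the string form.
        self.name = name

    def __str__(self) -> str:
        # A stable string form, so callers do not need separate helper functions.
        return f"{type(self).__name__}({self.name})"

    @abstractmethod
    def _search_one(self, query: str, results_per_query: int) -> list[str]:
        """Return up to `results_per_query` results for one query string."""

    def query(
        self, query: str | list[str], results_per_query: int = 1
    ) -> list[str] | list[list[str]]:
        # A single string yields a flat list; a list of strings yields one list per query.
        if isinstance(query, str):
            return self._search_one(query, results_per_query)
        return [self._search_one(q, results_per_query) for q in query]


class KeywordIndex(BaseIndex):
    """Toy in-memory stand-in that ranks documents by the number of shared words."""

    def __init__(self, name: str, docs: list[str]):
        super().__init__(name)
        self.docs = docs

    def _search_one(self, query: str, results_per_query: int) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(
            self.docs,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:results_per_query]
```

With this shape, `index.query("red apple", 1)` returns a flat list, while `index.query(["red apple", "green pear"], 2)` returns one sub-list per query.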
This issue will also standardize the semantics of the search_func. If the user's search function returns a list[str], then RetrieveOp will take the top-k elements from that list. If the user's search function returns a list[list[str]] then RetrieveOp will take the top-k elements from each sub-list.
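The top-k rule above could be sketched as follows. The helper name `take_top_k` is hypothetical, and the sketch assumes the search function returns results already ranked best-first; it is not taken from the RetrieveOp code.

```python
from __future__ import annotations


def take_top_k(results: list[str] | list[list[str]], k: int) -> list[str] | list[list[str]]:
    """Truncate search_func output to the top-k entries (assumed sorted best-first)."""
    # Nested output (list[list[str]]): keep the top-k of each sub-list.
    if results and isinstance(results[0], list):
        return [sub[:k] for sub in results]
    # Flat output (list[str]): keep the top-k of the list itself.
    return results[:k]
```

One consequence of this rule is that a flat list is always treated as the results of a single query, so a search function that handles several queries at once needs to return one sub-list per query.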
Based on conversation w/ @sivaprasadsudhir, we will rethink the interface for Retrieve for the longer-term -- but for now we have a short-term solution.
RAGPretrainedModel can only perform search w/string input (not embedding(s))
Chromadb.Collection may not have the correct model name if the user creates the index w/out specifying the embedding_function (which you technically don't need to do if you are only querying the index)