Merge branch 'main' into patch-1

HelgeSverre · web-flow · commit 762e0b8e171e · 2024-01-09T07:14:56.000+01:00
diff --git a/docs/embeddings/cohere.md b/docs/embeddings/cohere.md
@@ -21,6 +21,7 @@ Chroma also provides a convenient wrapper around Cohere's embedding API. This em
 This embedding function relies on the `cohere` python package, which you can install with `pip install cohere`.
 
 ```python
+import chromadb.utils.embedding_functions as embedding_functions
 cohere_ef  = embedding_functions.CohereEmbeddingFunction(api_key="YOUR_API_KEY",  model_name="large")
 cohere_ef(texts=["document1","document2"])
 ```
diff --git a/docs/embeddings/google-gemini.md b/docs/embeddings/google-gemini.md
@@ -24,8 +24,7 @@ This embedding function relies on the `google-generativeai` python package, whic
 
 ```python
 # import
-import chromadb
-from chromadb.utils import embedding_functions
+import chromadb.utils.embedding_functions as embedding_functions
 
 # use directly
 google_ef  = embedding_functions.GoogleGenerativeAiEmbeddingFunction(api_key="YOUR_API_KEY")
diff --git a/docs/embeddings/google-palm.md b/docs/embeddings/google-palm.md
@@ -8,6 +8,7 @@
 To use the PaLM embedding API, you must have `google.generativeai` Python package installed and have the API key. To use:
 
 ```python
+import chromadb.utils.embedding_functions as embedding_functions
 palm_embedding = embedding_functions.GooglePalmEmbeddingFunction(
     api_key=api_key, model=model_name)
 
diff --git a/docs/embeddings/hugging-face.md b/docs/embeddings/hugging-face.md
@@ -21,6 +21,7 @@ Chroma also provides a convenient wrapper around HuggingFace's embedding API. Th
 This embedding function relies on the `requests` python package, which you can install with `pip install requests`.
 
 ```python
+import chromadb.utils.embedding_functions as embedding_functions
 huggingface_ef = embedding_functions.HuggingFaceEmbeddingFunction(
     api_key="YOUR_API_KEY",
     model_name="sentence-transformers/all-MiniLM-L6-v2"
diff --git a/docs/embeddings/instructor.md b/docs/embeddings/instructor.md
@@ -9,10 +9,12 @@ There are three models available. The default is `hkunlp/instructor-base`, and f
 
 ```python
 #uses base model and cpu
+import chromadb.utils.embedding_functions as embedding_functions
 ef = embedding_functions.InstructorEmbeddingFunction() 
 ```
 or
 ```python
+import chromadb.utils.embedding_functions as embedding_functions
 ef = embedding_functions.InstructorEmbeddingFunction(
 model_name="hkunlp/instructor-xl", device="cuda")
 ```
diff --git a/docs/embeddings/jinaai.md b/docs/embeddings/jinaai.md
@@ -21,6 +21,7 @@ Chroma provides a convenient wrapper around JinaAI's embedding API. This embeddi
 This embedding function relies on the `requests` python package, which you can install with `pip install requests`.
 
 ```python
+import chromadb.utils.embedding_functions as embedding_functions
 jinaai_ef = embedding_functions.JinaEmbeddingFunction(
                 api_key="YOUR_API_KEY",
                 model_name="jina-embeddings-v2-base-en"
diff --git a/docs/embeddings/openai.md b/docs/embeddings/openai.md
@@ -21,7 +21,7 @@ Chroma provides a convenient wrapper around OpenAI's embedding API. This embeddi
 This embedding function relies on the `openai` python package, which you can install with `pip install openai`.
 
 ```python
-import chromadb.utils.embedding_functions
+import chromadb.utils.embedding_functions as embedding_functions
 openai_ef = embedding_functions.OpenAIEmbeddingFunction(
                 api_key="YOUR_API_KEY",
                 model_name="text-embedding-ada-002"
@@ -30,6 +30,7 @@ openai_ef = embedding_functions.OpenAIEmbeddingFunction(
 
 To use the OpenAI embedding models on other platforms such as Azure, you can use the `api_base` and `api_type` parameters: 
 ```python
+import chromadb.utils.embedding_functions as embedding_functions
 openai_ef = embedding_functions.OpenAIEmbeddingFunction(
                 api_key="YOUR_API_KEY",
                 api_base="YOUR_API_BASE_PATH",
diff --git a/docs/integrations/haystack.md b/docs/integrations/haystack.md
@@ -0,0 +1,81 @@
+---
+slug: /integrations/haystack
+title: 💙 Haystack
+---
+
+[Haystack](https://github.com/deepset-ai/haystack) is an open-source LLM framework in Python. It provides [embedders](https://docs.haystack.deepset.ai/v2.0/docs/embedders), [generators](https://docs.haystack.deepset.ai/v2.0/docs/generators) and [rankers](https://docs.haystack.deepset.ai/v2.0/docs/rankers) through a number of LLM providers, tooling for [preprocessing](https://docs.haystack.deepset.ai/v2.0/docs/preprocessors) and data preparation, and connectors to a number of vector databases, including Chroma. Haystack lets you build custom LLM applications from both its ready-made components and your own [custom components](https://docs.haystack.deepset.ai/v2.0/docs/custom-components). Common applications built with Haystack include retrieval-augmented generation (RAG) pipelines, question answering, and semantic search.
+
+<img src="https://img.shields.io/github/stars/deepset-ai/haystack.svg?style=social&label=Star&maxAge=2400"/>
+
+|[Docs](https://docs.haystack.deepset.ai/v2.0/docs) | [GitHub](https://github.com/deepset-ai/haystack) | [Haystack Integrations](https://haystack.deepset.ai/integrations) | [Tutorials](https://haystack.deepset.ai/tutorials) |
+
+You can use Chroma together with Haystack by installing the integration and using the `ChromaDocumentStore`.
+
+### Installation
+
+```bash
+pip install chroma-haystack
+```
+
+### Usage
+
+- The [Chroma Integration page](https://haystack.deepset.ai/integrations/chroma-documentstore)
+- [Chroma + Haystack Example](https://colab.research.google.com/drive/1YpDetI8BRbObPDEVdfqUcwhEX9UUXP-m?usp=sharing)
+
+#### Write documents into a ChromaDocumentStore
+
+```python
+import os
+from pathlib import Path
+
+from haystack import Pipeline
+from haystack.components.converters import TextFileToDocument
+from haystack.components.writers import DocumentWriter
+from chroma_haystack import ChromaDocumentStore
+
+file_paths = ["data" / Path(name) for name in os.listdir("data")]
+
+document_store = ChromaDocumentStore()
+
+indexing = Pipeline()
+indexing.add_component("converter", TextFileToDocument())
+indexing.add_component("writer", DocumentWriter(document_store))
+
+indexing.connect("converter", "writer")
+indexing.run({"converter": {"sources": file_paths}})
+```
+
+#### Build RAG on top of Chroma
+
+```python
+from chroma_haystack.retriever import ChromaQueryRetriever
+from haystack.components.generators import HuggingFaceTGIGenerator
+from haystack.components.builders import PromptBuilder
+
+prompt = """
+Answer the query based on the provided context.
+If the context does not contain the answer, say 'Answer not found'.
+Context: 
+{% for doc in documents %}
+  {{ doc.content }}
+{% endfor %}
+query: {{query}}
+Answer:
+"""
+prompt_builder = PromptBuilder(template=prompt)
+
+llm = HuggingFaceTGIGenerator(model="mistralai/Mixtral-8x7B-Instruct-v0.1", token='YOUR_HF_TOKEN')
+llm.warm_up()
+retriever = ChromaQueryRetriever(document_store)
+
+querying = Pipeline()
+querying.add_component("retriever", retriever)
+querying.add_component("prompt_builder", prompt_builder)
+querying.add_component("llm", llm)
+
+querying.connect("retriever.documents", "prompt_builder.documents")
+querying.connect("prompt_builder", "llm")
+query = "YOUR_QUERY"  # placeholder question (illustrative assumption); replace with your own
+results = querying.run({"retriever": {"queries": [query], "top_k": 3},
+                        "prompt_builder": {"query": query}})
+```
diff --git a/docs/integrations/index.md b/docs/integrations/index.md
@@ -19,6 +19,7 @@ We welcome pull requests to add new Integrations to the community.
 | [Braintrust](/integrations/braintrust) | ✅  | ✅ |
 | [🔭 OpenLLMetry](/integrations/openllmetry) | ✅     | :soon: |
 | [🎈 Streamlit](/integrations/streamlit) | ✅     | ➖ |
+| [💙 Haystack](/integrations/haystack) | ✅     | ➖ |
 
 *Coming soon* - integrations with LangSmith, JinaAI, and more.
 
diff --git a/docs/intro.md b/docs/intro.md
@@ -75,7 +75,8 @@ Continue with the full [getting started guide](./getting-started.md).
 | Rust | ➖ | ✅ [from @Anush008](https://crates.io/crates/chromadb) |
 | Elixir | ➖ | ✅ [from @3zcurdia](https://hex.pm/packages/chroma/) |
 | Dart | ➖ | ✅ [from @davidmigloz](https://pub.dev/packages/chromadb) |
-| PHP | ➖ | ✅ [from @HelgeSverre](https://github.com/helgeSverre/chromadb)                                                            |
+| PHP | ➖ | ✅ [from @CodeWithKyrian](https://github.com/CodeWithKyrian/chromadb-php) |
+| PHP (Laravel) | ➖ | ✅ [from @HelgeSverre](https://github.com/helgeSverre/chromadb)                                                            |
 | Other?       | ❓    | ❓            |
 
 <br/>
diff --git a/docs/usage-guide.md b/docs/usage-guide.md
@@ -802,8 +802,8 @@ Set the following environment variables:
 
 ```bash
 export CHROMA_SERVER_AUTH_CREDENTIALS_FILE="server.htpasswd"
-export CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER='chromadb.auth.providers.HtpasswdFileServerAuthCredentialsProvider'
-export CHROMA_SERVER_AUTH_PROVIDER='chromadb.auth.basic.BasicAuthServerProvider'
+export CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER="chromadb.auth.providers.HtpasswdFileServerAuthCredentialsProvider"
+export CHROMA_SERVER_AUTH_PROVIDER="chromadb.auth.basic.BasicAuthServerProvider"
 ```
 
 And run the server as normal:
diff --git a/sidebars.js b/sidebars.js
@@ -68,6 +68,7 @@ const sidebars = {
         'integrations/braintrust',
         'integrations/openllmetry',
         'integrations/streamlit',
+        'integrations/haystack',
       ],
     },
   ],