This repository was archived by the owner on May 10, 2024. It is now read-only.

Add docs for universal sentence encoder embedding function #219

Open
wants to merge 7 commits into
base: main
2 changes: 2 additions & 0 deletions docs/embeddings.md
@@ -21,6 +21,8 @@ Chroma provides lightweight wrappers around popular embedding providers, making
| [Hugging Face Embedding Server](/embeddings/hugging-face-embedding-server) | ✅ | ✅ |
| [Jina AI](/embeddings/jinaai) | ✅ | ✅ |
| [Roboflow](/embeddings/roboflow-api) | ✅ | ➖ |
| [Universal Sentence Encoder](/embeddings/universal-sentence-encoder) | ✅ | ➖ |


We welcome pull requests to add new Embedding Functions to the community.

44 changes: 44 additions & 0 deletions docs/embeddings/universal-sentence-encoder.md
@@ -0,0 +1,44 @@
---
---

# Universal Sentence Encoder

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

<div class="select-language">Select a language</div>

<Tabs queryString groupId="lang">
<TabItem value="py" label="Python"></TabItem>
<TabItem value="js" label="JavaScript"></TabItem>
</Tabs>


<Tabs queryString groupId="lang" className="hideTabSwitcher">
<TabItem value="py" label="Python">

Chroma also provides a convenient wrapper around the [Universal Sentence Encoder](https://research.google.com/pubs/archive/46808.pdf).

This embedding function uses models hosted on [TensorFlow Hub](https://tfhub.dev/).

It relies on the `tensorflow_hub` Python package, which you can install with `pip install tensorflow_hub`.

```python
import chromadb.utils.embedding_functions as embedding_functions

# Uses the default model unless `model_name` is given.
use_ef = embedding_functions.UniversalSentenceEncoderEmbeddingFunction()

# Returns one embedding per input sentence.
embeddings = use_ef([
    "The quick brown fox jumps over the lazy dog.",
    "I am a sentence for which I would like to get its embedding",
])
```


You can pass in an optional `model_name` argument to choose which model to use. By default, Chroma uses [Universal Sentence Encoder 4](https://tfhub.dev/google/universal-sentence-encoder/4), provided by TensorFlow Hub.
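
To make the role of `model_name` concrete, here is a minimal sketch of how a wrapper of this kind could be structured. It is not Chroma's actual implementation: the class name `SketchUSEEmbeddingFunction`, the `loader` parameter, and `fake_loader` are hypothetical names introduced only for illustration. The sketch assumes the model is loaded with `tensorflow_hub.load` and that the loaded model maps a batch of strings to one vector per string.

```python
from typing import Callable, List, Optional, Sequence

# Default model URL, as named in the text above.
DEFAULT_MODEL = "https://tfhub.dev/google/universal-sentence-encoder/4"


class SketchUSEEmbeddingFunction:
    """Hypothetical stand-in for illustration; not Chroma's own class."""

    def __init__(
        self,
        model_name: str = DEFAULT_MODEL,
        loader: Optional[Callable[[str], Callable]] = None,
    ) -> None:
        if loader is None:
            # Imported lazily so the class can be exercised without TensorFlow.
            import tensorflow_hub as hub

            loader = hub.load
        self.model_name = model_name
        self._model = loader(model_name)

    def __call__(self, texts: Sequence[str]) -> List[List[float]]:
        # The loaded model returns one vector per input string.
        vectors = self._model(list(texts))
        return [[float(x) for x in row] for row in vectors]


def fake_loader(model_name: str) -> Callable:
    # Assumption: a toy model that "embeds" text as (length, word count),
    # used here only so the sketch runs without downloading anything.
    return lambda texts: [[len(t), len(t.split())] for t in texts]


ef = SketchUSEEmbeddingFunction(loader=fake_loader)
print(ef(["hello world"]))  # [[11.0, 2.0]]
```

Omitting `loader` would load the real model from TensorFlow Hub, and passing a different `model_name` would select another hosted model in the same way.
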
</TabItem>
<TabItem value="js" label="JavaScript">

Support for the [Universal Sentence Encoder](https://research.google.com/pubs/archive/46808.pdf) embedding function is not implemented yet. Feel free to contribute by following the [Custom Embedding Functions](https://docs.trychroma.com/embeddings?lang=js) documentation.

</TabItem>
</Tabs>
2 changes: 1 addition & 1 deletion sidebars.js
@@ -93,8 +93,8 @@ const sidebars = {
'embeddings/hugging-face-embedding-server',
'embeddings/instructor',
'embeddings/roboflow-api',
'embeddings/hugging-face-embedding-server',
'embeddings/jinaai',
'embeddings/universal-sentence-encoder',
],
},
],