
[WIP] feat: add technical docs #543

Draft
wants to merge 11 commits into
base: develop
from
19 changes: 18 additions & 1 deletion README.md
Expand Up @@ -15,7 +15,13 @@ Options**

## Architecture

![Sequence diagram of Edge Runtime request flow](assets/edge-runtime-diagram.svg?raw=true)
<p align="center">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="/assets/edge-runtime-diagram-dark.svg">
<source media="(prefers-color-scheme: light)" srcset="/assets/edge-runtime-diagram.svg">
<img alt="Sequence diagram of Edge Runtime request flow" src="/assets/edge-runtime-diagram.svg" style="max-width: 100%;">
</picture>
</p>

The edge runtime can be divided into two runtimes with different purposes.

Expand All @@ -32,6 +38,17 @@ The edge runtime can be divided into two runtimes with different purposes.
- Limits such as memory and timeouts must be set.
- Has access to environment variables explicitly allowed by the main runtime.
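
A minimal sketch of the two constraints above: resource limits set per user worker, and an explicit allowlist for environment variables. The option names (`servicePath`, `memoryLimitMb`, `workerTimeoutMs`, `envVars`), the limit values, and both helper functions are assumptions for illustration, not a definitive description of the runtime's API.

```typescript
// Hypothetical shape of the options the main runtime passes when it creates
// a user worker. Field names and values are assumptions for illustration.
interface UserWorkerOptions {
  servicePath: string;
  memoryLimitMb: number;
  workerTimeoutMs: number;
  envVars: [string, string][];
}

// Keep only the variables whose names appear in the allowlist.
function pickAllowedEnv(
  env: Record<string, string>,
  allowlist: string[],
): [string, string][] {
  return allowlist
    .filter((name) => Object.prototype.hasOwnProperty.call(env, name))
    .map((name): [string, string] => [name, env[name]]);
}

// Build worker options with explicit limits and a filtered environment.
function buildUserWorkerOptions(
  servicePath: string,
  env: Record<string, string>,
  allowlist: string[],
): UserWorkerOptions {
  return {
    servicePath,
    memoryLimitMb: 150, // assumed memory limit
    workerTimeoutMs: 60 * 1000, // assumed timeout
    envVars: pickAllowedEnv(env, allowlist),
  };
}

const options = buildUserWorkerOptions(
  "./examples/hello",
  { API_URL: "https://example.com", MAIN_ONLY_SECRET: "do-not-forward" },
  ["API_URL"],
);
console.log(options.envVars);
```

Only `API_URL` reaches the user worker here; anything not named in the allowlist stays in the main runtime.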

### Edge Runtime in Depth

#### Conceptual

- [EdgeRuntime Base](/crates/base/README.md): Overview of how EdgeRuntime is based on Deno.

#### Extension Modules

- [AI](/ext/ai/README.md): Implements AI-related features.
- [NodeJs](/ext/node/README.md) & [NodeJs Polyfills](/ext/node/polyfills/README.md): Implements the NodeJs compatibility layer.

## Developers

To learn how to build / test Edge Runtime, visit [DEVELOPERS.md](DEVELOPERS.md)
4 changes: 4 additions & 0 deletions assets/docs/ai/onnx-backend-dark.svg
4 changes: 4 additions & 0 deletions assets/docs/ai/onnx-backend.svg
4 changes: 4 additions & 0 deletions assets/edge-runtime-diagram-dark.svg
25 changes: 4 additions & 21 deletions assets/edge-runtime-diagram.svg
Empty file added crates/base/README.md
Empty file.
105 changes: 105 additions & 0 deletions ext/ai/README.md
@@ -0,0 +1,105 @@
# Supabase AI module

This crate is part of the Supabase Edge Runtime stack and implements AI-related
features for the `Supabase.ai` namespace.

## Model Execution Engine

<p align="center">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="/assets/docs/ai/onnx-backend-dark.svg">
<source media="(prefers-color-scheme: light)" srcset="/assets/docs/ai/onnx-backend.svg">
<img alt="ONNX Backend illustration" src="/assets/docs/ai/onnx-backend.svg" width="350" style="max-width: 100%;">
</picture>
</p>

`Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as its internal model
execution engine, backed by the [ort (pyke)](https://ort.pyke.io/) Rust bindings.

The **onnxruntime** API is available from `globalThis` and follows a spec similar to [onnxruntime-common](https://github.com/microsoft/onnxruntime/tree/main/js/common).

The available items are:

- `Tensor`: Represents a basic tensor with the specified dimensions and data type ("the AI input/output").
- `InferenceSession`: Represents the inner model session ("the AI model itself").

<details>
<summary>Usage</summary>

The API can be accessed through the exported `globalThis[Symbol.for("onnxruntime")]`,
but manipulating it directly is not trivial. In the future, the [Inference API #501](https://github.com/supabase/edge-runtime/pull/501) may offer a more user-friendly API.

```typescript
const { InferenceSession, Tensor } = globalThis[Symbol.for("onnxruntime")];

// `create()` accepts either the model URL encoded as bytes or the model's binary data.
const modelUrlBuffer = new TextEncoder().encode("https://huggingface.co/Supabase/gte-small/resolve/main/onnx/model_quantized.onnx");
const session = await InferenceSession.create(modelUrlBuffer);

// Example only: in a real 'feature-extraction' task, tensors must be created from the tokenizer step.
// Zero-filled placeholder data sized for the [1, 384] shape used below.
const placeholder = new Array(384).fill(0);
const inputs = {
  input_ids: new Tensor('float32', placeholder, [1, 384]),
  attention_mask: new Tensor('float32', placeholder, [1, 384]),
  token_types_ids: new Tensor('float32', placeholder, [1, 384])
};

const { last_hidden_state } = await session.run(inputs);
console.log(last_hidden_state);
```
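
Each `Tensor` above pairs a flat data array with a dimensions array. As far as onnxruntime-common is concerned, the number of elements is expected to equal the product of the dimensions. The sketch below shows that relationship; `elementCount`, `zeros`, and `assertShape` are hypothetical helpers and not part of the runtime.

```typescript
// Number of elements a tensor with the given dimensions must hold.
function elementCount(dims: number[]): number {
  return dims.reduce((total, dim) => total * dim, 1);
}

// Build zero-filled float32 data sized for `dims` (hypothetical helper).
function zeros(dims: number[]): Float32Array {
  return new Float32Array(elementCount(dims));
}

// Throw if the flat data length does not match the requested shape.
function assertShape(data: ArrayLike<number>, dims: number[]): void {
  const expected = elementCount(dims);
  if (data.length !== expected) {
    throw new Error(
      `shape [${dims.join(", ")}] needs ${expected} elements, got ${data.length}`,
    );
  }
}

const dims = [1, 384];
const data = zeros(dims);
assertShape(data, dims);
console.log(data.length); // 384
```

Checking the shape before calling `new Tensor(...)` tends to give a clearer error than a failure deep inside `session.run()`.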

</details>

### Third party libs

Originally, this backend was created to integrate implicitly with [transformers.js](https://github.com/huggingface/transformers.js/). This way, users can keep using a high-level library while benefiting from all of Supabase's Model Execution Engine features, such as model optimization and caching.
For further information, see [PR #436](https://github.com/supabase/edge-runtime/pull/436) as well as the [tests folder](/crates/base/test_cases/ai-ort-rust-backend/transformers-js).

> [!WARNING]
> At the moment, users must explicitly set `device: 'auto'` to enable platform compatibility.

```typescript
import { env, pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/[email protected]';

// Browser cache is now supported for `onnx` models
env.useBrowserCache = true;
env.allowLocalModels = false;

const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'auto' });

const output = await pipe("This embed will be generated from rust land", {
  pooling: 'mean',
  normalize: true
});
```
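
To make the `pooling: 'mean'` and `normalize: true` options more concrete, here is a rough sketch of what they usually mean for a feature-extraction pipeline: average the per-token vectors that the attention mask keeps, then scale the result to unit length. The functions `meanPool` and `l2Normalize` are hypothetical stand-ins and not the actual transformers.js implementation.

```typescript
// Average the token vectors whose attention-mask entry is non-zero.
function meanPool(tokens: number[][], mask: number[]): number[] {
  const width = tokens[0].length;
  const sum = new Array<number>(width).fill(0);
  let kept = 0;
  tokens.forEach((token, i) => {
    if (mask[i] === 0) return; // skip padding tokens
    kept += 1;
    for (let j = 0; j < width; j++) sum[j] += token[j];
  });
  return sum.map((value) => value / Math.max(kept, 1));
}

// Scale a vector to unit L2 norm; a zero vector is returned unchanged.
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((acc, x) => acc + x * x, 0));
  return norm === 0 ? vector.slice() : vector.map((x) => x / norm);
}

// Three tokens with two-dimensional vectors; the last token is padding.
const tokens = [[3, 0], [0, 4], [9, 9]];
const mask = [1, 1, 0];
const embedding = l2Normalize(meanPool(tokens, mask));
console.log(embedding); // [ 0.6, 0.8 ]
```

Normalized embeddings have unit length, so cosine similarity between two of them reduces to a plain dot product.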

### Self-Hosting

**Caching filepath**:
The `EXT_AI_CACHE_DIR` environment variable can be used to set a custom cache path.

**Memory clean up**:
For self-hosting users, an extra method is available in the `main/index.ts` scope and should be used to clean up unused sessions. Consider adding it to your main entrypoint file:

```typescript
// Clean up unused sessions every 30s
setInterval(async () => {
  try {
    const cleanupCount = await EdgeRuntime.ai.tryCleanupUnusedSession();
    if (cleanupCount === 0) {
      return;
    }
    console.log('EdgeRuntime.ai.tryCleanupUnusedSession', cleanupCount);
  } catch (e) {
    console.error(e.toString());
  }
}, 30 * 1000);
```

## The `Session` class

Earlier versions [introduced](https://supabase.com/blog/ai-inference-now-available-in-supabase-edge-functions) the `Session` class as an alternative to `transformers.js` for the *gte-small* model. It was later used to provide an [LLM interface](https://supabase.com/docs/guides/functions/ai-models?queryGroups=platform&platform=ollama#using-large-language-models-llm) for Ollama and some other providers.

Since the **Model Execution Engine** was created, the `Session` class can focus on the LLM interface, while `Session('gte-small')` remains for compatibility purposes only.

> [!WARNING]
> Docs for the `Session` class end here. There is an open [PR #539](https://github.com/supabase/edge-runtime/pull/539) that may change a lot of things for it.
10 changes: 9 additions & 1 deletion ext/node/README.md
@@ -1,3 +1,11 @@
# deno_node
# Supabase Node module

This crate is part of the Supabase Edge Runtime stack and implements NodeJs
related features.

To see all compatible features, please check the
[NodeJs Polyfills](/ext/node/polyfills/README.md) section.

## deno_node

`require` and other node related functionality for Deno.
7 changes: 4 additions & 3 deletions ext/node/polyfills/README.md
@@ -1,6 +1,7 @@
# Deno Node.js compatibility
# Supabase Node.js compatibility module

This module is meant to have a compatibility layer for the
This crate is part of the Supabase Edge Runtime stack and implements a
compatibility layer for the
[Node.js standard library](https://nodejs.org/docs/latest/api/).

**Warning**: Any function of this module should not be referred anywhere in the
Expand Down Expand Up @@ -59,7 +60,7 @@ Deno standard library as it's a compatibility module.
- [x] worker_threads
- [ ] zlib

* [x] node globals _partly_
- [x] node globals _partly_

### Deprecated
