Commit 465137c

feat: vector distance functions (#2176)

* add: vector-distance-functions
* add: vector-distance-functions index
1 parent 257a60d commit 465137c

5 files changed, +104 -2 lines changed

docs/en/guides/51-ai-functions/index.md

Lines changed: 2 additions & 2 deletions
@@ -42,7 +42,7 @@ Databend provides built-in AI functions for various natural language processing
 
 - [ai_embedding_vector](/sql/sql-functions/ai-functions/ai-embedding-vector): Generates embeddings for text documents.
 - [ai_text_completion](/sql/sql-functions/ai-functions/ai-text-completion): Generates text completions based on a given prompt.
-- [cosine_distance](/sql/sql-functions/ai-functions/ai-cosine-distance): Calculates the cosine distance between two embeddings.
+- [cosine_distance](/sql/sql-functions/vector-distance-functions/vector-cosine-distance): Calculates the cosine distance between two embeddings.
 
 ## Generating Embeddings
 
@@ -67,7 +67,7 @@ VALUES
 
 ## Calculating Cosine Distance
 
-Now, let's find the documents that are most similar to a given query using the [cosine_distance](/sql/sql-functions/ai-functions/ai-cosine-distance) function:
+Now, let's find the documents that are most similar to a given query using the [cosine_distance](/sql/sql-functions/vector-distance-functions/vector-cosine-distance) function:
 ```sql
 SELECT
     id,
Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
---
title: 'L2_DISTANCE'
description: 'Measuring Euclidean distance between vectors in Databend'
---

Calculates the Euclidean (L2) distance between two vectors, measuring the straight-line distance between them in vector space.

## Syntax

```sql
L2_DISTANCE(vector1, vector2)
```

## Arguments

- `vector1`: First vector (ARRAY(FLOAT32 NOT NULL))
- `vector2`: Second vector (ARRAY(FLOAT32 NOT NULL))

## Returns

Returns a FLOAT value representing the Euclidean (L2) distance between the two vectors. The value is always non-negative:
- 0: Identical vectors
- Larger values: Vectors that are farther apart

## Description

The L2 distance, also known as Euclidean distance, measures the straight-line distance between two points in Euclidean space. It is one of the most common metrics used in vector similarity search and machine learning applications.

The function:

1. Verifies that both input vectors have the same length
2. Computes the sum of squared differences between corresponding elements
3. Returns the square root of this sum

The mathematical formula implemented is:

```
L2_distance(v1, v2) = √(Σ(v1ᵢ - v2ᵢ)²)
```

Where v1ᵢ and v2ᵢ are the elements of the input vectors.
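
The three steps and the formula above can be sketched in plain Python. This is only an illustration of the arithmetic, not Databend's implementation: the name `l2_distance` and the choice to raise `ValueError` on a length mismatch are assumptions, and Python floats are 64-bit, so the last digits may differ from a FLOAT32 computation.

```python
import math


def l2_distance(v1, v2):
    """Euclidean (L2) distance between two equal-length numeric sequences."""
    # Step 1: both vectors must have the same length.
    # (Raising ValueError is an assumption made for this sketch.)
    if len(v1) != len(v2):
        raise ValueError(f"vector lengths differ: {len(v1)} vs {len(v2)}")
    # Step 2: sum of squared differences between corresponding elements.
    squared_sum = sum((a - b) ** 2 for a, b in zip(v1, v2))
    # Step 3: square root of that sum.
    return math.sqrt(squared_sum)


# (4-1)^2 + (5-2)^2 + (6-3)^2 = 27, so the result is sqrt(27).
print(l2_distance([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Identical inputs give 0, and the value grows as the vectors move apart, matching the return-value description.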

:::info
- This function performs vector computations within Databend and does not rely on external APIs.
:::

## Examples

Create a table with vector data:

```sql
CREATE OR REPLACE TABLE vectors (
    id INT,
    vec ARRAY(FLOAT32 NOT NULL)
);

INSERT INTO vectors VALUES
    (1, [1.0000, 2.0000, 3.0000]),
    (2, [1.0000, 2.2000, 3.0000]),
    (3, [4.0000, 5.0000, 6.0000]);
```

Rank the vectors by their L2 distance to [1, 2, 3], closest first:

```sql
SELECT
    id,
    vec,
    L2_DISTANCE(vec, [1.0000, 2.0000, 3.0000]) AS distance
FROM
    vectors
ORDER BY
    distance ASC;
```

```
+----+------------------------+----------+
| id | vec                    | distance |
+----+------------------------+----------+
| 1  | [1.0000,2.0000,3.0000] | 0.0      |
| 2  | [1.0000,2.2000,3.0000] | 0.2      |
| 3  | [4.0000,5.0000,6.0000] | 5.196152 |
+----+------------------------+----------+
```
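
As a sanity check, the distances in the result table can be reproduced outside the database. The sketch below uses Python's standard `math.dist` (available since Python 3.8); the variable names `query`, `rows`, and `distances` are illustrative, and the computation runs in 64-bit floats rather than FLOAT32.

```python
import math

query = [1.0, 2.0, 3.0]
rows = {
    1: [1.0, 2.0, 3.0],
    2: [1.0, 2.2, 3.0],
    3: [4.0, 5.0, 6.0],
}

# math.dist returns the Euclidean distance between two points.
distances = {row_id: math.dist(vec, query) for row_id, vec in rows.items()}

# Sort by distance, mirroring ORDER BY distance ASC in the query above.
for row_id, d in sorted(distances.items(), key=lambda item: item[1]):
    print(row_id, round(d, 6))
```

Rounded to six decimal places, the three values agree with the `distance` column.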
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
{
  "label": "Vector Distance Functions"
}
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
---
title: 'Vector Distance Functions'
description: 'Vector distance functions in Databend for similarity measurement'
---


# Vector Distance Functions

Databend provides functions for measuring distance or similarity between vectors, essential for vector search and machine learning applications.

## Function Comparison
| Function | Description | Range | Best For | Use Cases |
|----------|-------------|-------|----------|-----------|
| [L2_DISTANCE](01-vector-l2-distance.md) | Euclidean (straight-line) distance | [0, ∞) | When magnitude matters | • Image similarity<br/>• Geographical data<br/>• Anomaly detection<br/>• Feature-based clustering |
| [COSINE_DISTANCE](00-vector-cosine-distance.md) | Angular distance between vectors | [0, 2] | When direction matters more than magnitude | • Document similarity<br/>• Semantic search<br/>• Recommendation systems<br/>• Text analysis |
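
The "magnitude" versus "direction" distinction in the table can be seen with two vectors that point the same way but differ in length. The sketch below is a plain-Python illustration, not Databend code; it assumes the usual definition of cosine distance as 1 minus cosine similarity, which this page does not spell out, and the helper names are hypothetical.

```python
import math


def l2_distance(v1, v2):
    # Square root of the sum of squared element-wise differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def cosine_distance(v1, v2):
    # Assumed definition: 1 - (v1 . v2) / (|v1| * |v2|)
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return 1.0 - dot / (norm1 * norm2)


base = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]  # same direction, twice the magnitude

print(l2_distance(base, scaled))      # sqrt(14): the length difference counts
print(cosine_distance(base, scaled))  # approximately 0: direction is identical
```

L2 distance reports the two vectors as clearly apart, while cosine distance treats them as essentially the same, which is why the latter suits text embeddings where vector length carries little meaning.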
