Commit a03c5d9

doc: add packages & imports in udf python (#2448)
1 parent 2539105 commit a03c5d9

1 file changed

+70
-3
lines changed

docs/en/guides/54-query/03-udf.md

Lines changed: 70 additions & 3 deletions
@@ -79,7 +79,7 @@ FROM persons;
7979

8080
## Embedded UDFs
8181

82-
Embedded UDFs allow you to write functions using full programming languages, giving you more flexibility and power than Lambda UDFs.
82+
Embedded UDFs, also known as Script UDFs, let you write functions in a full programming language, which gives you more flexibility and power than Lambda UDFs.
8383

8484
### Supported Languages
8585

@@ -93,7 +93,10 @@ Embedded UDFs allow you to write functions using full programming languages, giv
9393
```sql
9494
CREATE [OR REPLACE] FUNCTION <function_name>([<parameter_type>, ...])
9595
RETURNS <return_type>
96-
LANGUAGE <language_name> HANDLER = '<handler_name>'
96+
LANGUAGE <language_name>
97+
[IMPORTS = ('<import_path>', ...)]
98+
[PACKAGES = ('<package_path>', ...)]
99+
HANDLER = '<handler_name>'
97100
AS $$
98101
<function_code>
99102
$$;
@@ -107,6 +110,8 @@ $$;
107110
| `parameter_type` | Data type of each input parameter |
108111
| `return_type` | Data type of the function's return value |
109112
| `language_name` | Programming language (python or javascript) |
113+
| `imports` | List of stage files, such as `@s_udf/your_file.zip`. The files are downloaded from the stage into the directory given by `sys._xoptions['databend_import_directory']`, where your Python code can read and unzip them |
114+
| `packages` | List of packages to install from PyPI, such as `numpy` or `pandas` |
110115
| `handler_name` | Name of the function in the code that will be called |
111116
| `function_code` | The actual code implementing the function |
112117
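
As a rough illustration of the `imports` row, the sketch below resolves a staged file inside the import directory and reads one member of the archive without extracting it. It is only a sketch: the helper names `staged_file_path` and `read_zip_member`, the `import_dir` override, and the archive member `config.txt` are hypothetical and exist only so the code can run outside Databend. The one detail taken from the documentation is the `sys._xoptions['databend_import_directory']` lookup.

```python
import os
import sys
import tempfile
import zipfile


def staged_file_path(filename, import_dir=None):
    """Return the local path of a file listed in IMPORTS.

    Inside a Python UDF the directory is expected to come from
    sys._xoptions['databend_import_directory']. The import_dir argument is a
    hypothetical override so the helper can be tried elsewhere.
    """
    if import_dir is None:
        import_dir = sys._xoptions.get('databend_import_directory')
    if import_dir is None:
        raise RuntimeError('databend_import_directory is not set')
    return os.path.join(import_dir, filename)


def read_zip_member(zip_path, member):
    """Read one member of a zip archive as bytes, without extracting it."""
    with zipfile.ZipFile(zip_path, 'r') as archive:
        return archive.read(member)


if __name__ == '__main__':
    # Simulate the import directory with a temporary folder (assumption:
    # outside Databend there is no real import directory to read from).
    with tempfile.TemporaryDirectory() as fake_import_dir:
        archive_path = os.path.join(fake_import_dir, 'your_file.zip')
        with zipfile.ZipFile(archive_path, 'w') as archive:
            archive.writestr('config.txt', 'threshold=0.5')

        path = staged_file_path('your_file.zip', import_dir=fake_import_dir)
        print(read_zip_member(path, 'config.txt').decode())
```

Reading a member in place avoids writing to a shared directory, so no locking is needed. Extraction is still the better fit when the archive holds many files that other libraries must open by path.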

@@ -165,6 +170,68 @@ SELECT calculate_age_py('1990-05-15') AS age_result;
165170
-- +------------+
166171
```
167172

173+
#### Example: Using Imports and Packages in a Python UDF
174+
175+
```sql
176+
CREATE OR REPLACE FUNCTION package_udf()
177+
RETURNS FLOAT
178+
LANGUAGE PYTHON
179+
IMPORTS = ('@s1/a.zip')
180+
PACKAGES = ('scikit-learn')
181+
HANDLER = 'udf'
182+
AS
183+
$$
184+
from sklearn.datasets import load_iris
185+
from sklearn.model_selection import train_test_split
186+
from sklearn.ensemble import RandomForestClassifier
187+
188+
import fcntl
189+
import os
190+
import sys
191+
import threading
192+
import zipfile
193+
194+
# File lock class for synchronizing write access to /tmp.
195+
class FileLock:
196+
    def __enter__(self):
197+
        self._lock = threading.Lock()
198+
        self._lock.acquire()
199+
        self._fd = open('/tmp/lockfile.LOCK', 'w+')
200+
        fcntl.lockf(self._fd, fcntl.LOCK_EX)
201+
202+
    def __exit__(self, type, value, traceback):
203+
        self._fd.close()
204+
        self._lock.release()
205+
206+
import_dir = sys._xoptions['databend_import_directory']
207+
208+
zip_file_path = import_dir + "/a.zip"
209+
extracted = '/tmp'
210+
211+
# extract the zip to directory `/tmp/a`
212+
with FileLock():
213+
    if not os.path.isdir(extracted + '/a'):
214+
        with zipfile.ZipFile(zip_file_path, 'r') as myzip:
215+
            myzip.extractall(extracted)
216+
217+
def udf():
218+
    X, y = load_iris(return_X_y=True)
219+
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
220+
221+
    model = RandomForestClassifier()
222+
    model.fit(X_train, y_train)
223+
    return model.score(X_test, y_test)
224+
$$;
225+
226+
SELECT package_udf();
227+
228+
╭───────────────────╮
229+
│   package_udf()   │
230+
│ Nullable(Float32) │
231+
├───────────────────┤
232+
│                 1 │
233+
╰───────────────────╯
234+
```
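
The extract-once step in the example can also be sketched as a standalone helper. This is a variant under stated assumptions, not the commit's code: `ExtractLock`, `extract_once`, the `.extract.lock` file name, and the `marker` argument are hypothetical names. It differs from the example in two ways. The `threading.Lock` is created once at module level, so all threads in the process share it (POSIX record locks taken with `fcntl.lockf` are owned by the process and do not block sibling threads). The file lock is also released explicitly before the descriptor is closed. Like the example, it relies on `fcntl` and therefore runs on Unix-like systems only.

```python
import fcntl
import os
import tempfile
import threading
import zipfile

# One lock shared by every thread in this process.
_THREAD_LOCK = threading.Lock()


class ExtractLock:
    """Serialize a critical section across threads and processes (Unix only)."""

    def __init__(self, lock_path):
        self._lock_path = lock_path
        self._fd = None

    def __enter__(self):
        _THREAD_LOCK.acquire()
        try:
            self._fd = open(self._lock_path, 'w+')
            # Blocks until no other process holds the lock on this file.
            fcntl.lockf(self._fd, fcntl.LOCK_EX)
        except Exception:
            if self._fd is not None:
                self._fd.close()
            _THREAD_LOCK.release()
            raise
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            fcntl.lockf(self._fd, fcntl.LOCK_UN)
            self._fd.close()
        finally:
            _THREAD_LOCK.release()


def extract_once(zip_path, target_dir, marker):
    """Extract zip_path into target_dir unless target_dir/marker already exists.

    Returns True if this call performed the extraction, False otherwise.
    """
    with ExtractLock(os.path.join(target_dir, '.extract.lock')):
        if os.path.isdir(os.path.join(target_dir, marker)):
            return False
        with zipfile.ZipFile(zip_path, 'r') as archive:
            archive.extractall(target_dir)
        return True
```

With the values from the example, the call would look like `extract_once(import_dir + '/a.zip', '/tmp', 'a')`, assuming the archive unpacks into a top-level directory named `a`.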
168235

169236
### JavaScript
170237

@@ -182,7 +249,7 @@ JavaScript UDFs allow you to use modern JavaScript (ES6+) features within your S
182249
| VARCHAR | String |
183250
| BINARY | Uint8Array |
184251
| DATE/TIMESTAMP | Date |
185-
| LIST | Array |
252+
| ARRAY | Array |
186253
| MAP | Object |
187254
| STRUCT | Object |
188255
| JSON | Object/Array |
