feat: add user defined table function support #1113

Merged
merged 13 commits into from
May 18, 2025
4 changes: 2 additions & 2 deletions .github/workflows/test.yaml
@@ -91,9 +91,9 @@ jobs:

- name: FFI unit tests
run: |
- cd examples/ffi-table-provider
+ cd examples/datafusion-ffi-example
  uv run --no-project maturin develop --uv
- uv run --no-project pytest python/tests/_test_table_provider.py
+ uv run --no-project pytest python/tests/_test*.py

- name: Cache the generated dataset
id: cache-tpch-dataset
32 changes: 32 additions & 0 deletions docs/source/user-guide/common-operations/udf-and-udfa.rst
@@ -242,3 +242,35 @@ determine which evaluate functions are called.
})

df.select("a", exp_smooth(col("a")).alias("smooth_a")).show()

Table Functions
---------------

User Defined Table Functions differ slightly from the other functions
described here. They take any number of ``Expr`` arguments, but only
literal expressions are supported. A table function must return a Table
Provider as described in the :ref:`io_custom_table_provider` page.

Once you have a table function, you can register it with the session context
by using :py:func:`datafusion.context.SessionContext.register_udtf`.

The examples folder of the repository contains both Rust-backed and
Python-based table functions. If you have a Rust-backed table function
that you wish to expose via PyO3, you need to expose it as a ``PyCapsule``.

.. code-block:: rust

    #[pymethods]
    impl MyTableFunction {
        fn __datafusion_table_function__<'py>(
            &self,
            py: Python<'py>,
        ) -> PyResult<Bound<'py, PyCapsule>> {
            // Capsule name under which the table function is exported.
            let name = cr"datafusion_table_function".into();

            // Wrap a clone of this function in the FFI-stable struct.
            let func = self.clone();
            let provider = FFI_TableFunction::new(Arc::new(func), None);

            PyCapsule::new(py, provider, Some(name))
        }
    }
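
To make the capsule convention more concrete, the following is a minimal
standard-library sketch of what a named ``PyCapsule`` looks like from the
Python side. It is not how DataFusion consumes the capsule. The names
``FakeTableFunction`` and ``exported_capsule_name`` are hypothetical, the
payload is a placeholder integer rather than an ``FFI_TableFunction``, and the
sketch assumes CPython, since it calls the C API through ``ctypes.pythonapi``.

```python
import ctypes

# Name taken from the Rust snippet above. The capsule does not copy it,
# so this bytes object must outlive every capsule created with it.
CAPSULE_NAME = b"datafusion_table_function"

# ctypes.pythonapi exposes CPython's C API; declare the calls used here.
api = ctypes.pythonapi
api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_IsValid.restype = ctypes.c_int
api.PyCapsule_IsValid.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyCapsule_GetName.restype = ctypes.c_char_p
api.PyCapsule_GetName.argtypes = [ctypes.py_object]


class FakeTableFunction:
    """Hypothetical stand-in for a Rust-backed table function object."""

    def __init__(self):
        # Placeholder payload; a real capsule would hold an FFI_TableFunction.
        self._payload = ctypes.c_int(42)

    def __datafusion_table_function__(self):
        # Wrap the payload pointer in a capsule with the agreed name.
        return api.PyCapsule_New(
            ctypes.addressof(self._payload), CAPSULE_NAME, None
        )


def exported_capsule_name(obj):
    """Hypothetical consumer-side check: fetch the capsule, verify its name."""
    export = getattr(obj, "__datafusion_table_function__", None)
    if export is None:
        raise TypeError("object does not export a table function capsule")
    capsule = export()
    if not api.PyCapsule_IsValid(capsule, CAPSULE_NAME):
        raise ValueError("capsule has an unexpected name")
    return api.PyCapsule_GetName(capsule).decode()


print(exported_capsule_name(FakeTableFunction()))
```

The point of the sketch is the pairing of a dunder method with a capsule name:
a consumer can discover the export by attribute lookup and reject capsules
whose name does not match before touching the pointer inside.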