
Support TO RUN on SQLFlow #2161

Open
@brightcoder01

Description


SQLFlow describes an end-to-end machine learning pipeline. Data transformation is an important part of that pipeline.

Please check the following example SQL statement:

SELECT * FROM {source_table}
TO RUN {function_name}
WITH
    param_a = value_a,
    param_b = value_b
INTO {result_table}

{function_name} is the name of the data transformation function. It can be either a built-in SQLFlow function or a custom function provided by the user. As a first step, we will support built-in functions, and TSFresh is our first built-in function.

{source_table} is the name of the input table from which the transformation function reads its data.
{result_table} is the name of the output table into which the transformation function writes the processed result.
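
To make the roles of the three placeholders concrete, here is a minimal Python sketch that splits a TO RUN statement into its parts. It is an illustration only and not SQLFlow's parser: the regular expression, the `ToRunStatement` class, and the `parse_to_run` helper are all hypothetical names, and the sketch assumes parameter values contain no commas and no `INTO` keyword.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical pattern for illustration; it only mirrors the example
# statement shown above and is not how SQLFlow parses SQL.
TO_RUN_PATTERN = re.compile(
    r"^(?P<select>SELECT\b.+?)\s+TO\s+RUN\s+(?P<function>[^\s;]+)"
    r"(?:\s+WITH\s+(?P<params>.+?))?"
    r"(?:\s+INTO\s+(?P<into>[^\s;]+))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


@dataclass
class ToRunStatement:
    select: str              # standard SELECT that produces the input rows
    function: str            # {function_name}
    params: Dict[str, str]   # WITH attributes, kept as raw strings
    into: Optional[str]      # {result_table}, if present


def parse_to_run(sql: str) -> ToRunStatement:
    """Split a TO RUN statement into SELECT, function, params, and INTO."""
    match = TO_RUN_PATTERN.match(sql.strip())
    if match is None:
        raise ValueError("not a TO RUN statement")

    params: Dict[str, str] = {}
    raw_params = match.group("params")
    if raw_params:
        # Assumption: values contain no commas, so a plain split is enough.
        for item in raw_params.split(","):
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"malformed parameter: {item.strip()!r}")
            params[key.strip()] = value.strip()

    return ToRunStatement(
        select=match.group("select").strip(),
        function=match.group("function"),
        params=params,
        into=match.group("into"),
    )
```

Calling `parse_to_run` on the example statement, with the placeholders filled in, yields the SELECT prefix, the function name, a dictionary of the WITH parameters, and the result table name as separate fields.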

The design doc: link.

Task breakdown

  • Upgrade the parser to support the TO RUN statement.

  • Translate a TO RUN statement into a workflow.

  • Upgrade goalisa to submit PyODPS tasks. Enable submitting ODPS SQL and PyODPS tasks on the Dataworks deployment.

  • Implement the sqlflow.runner module.

  • Implement the TSFresh high-level API and build its Docker image.

  • Provide a sample repo for a TO RUN function.
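
As a rough illustration of what a runner for TO RUN might do, the Python sketch below looks up a transformation function by name, reads the rows selected from the source table, applies the function, and writes the result to the output table. Everything in it is an assumption made for illustration: the `BUILTIN_FUNCTIONS` registry, the `register` decorator, the toy `column_mean` transform, and the `run` signature are hypothetical and do not describe the actual sqlflow.runner design. It uses an in-memory SQLite database as a stand-in for the real data source.

```python
import sqlite3
from typing import Callable, Dict, List, Tuple

Row = Tuple
# A transform takes (column names, rows, params) and returns new columns and rows.
Transform = Callable[[List[str], List[Row], dict], Tuple[List[str], List[Row]]]

# Hypothetical registry of built-in functions, keyed by {function_name}.
BUILTIN_FUNCTIONS: Dict[str, Transform] = {}


def register(name: str):
    """Decorator that adds a transform to the built-in registry."""
    def decorator(fn: Transform) -> Transform:
        BUILTIN_FUNCTIONS[name] = fn
        return fn
    return decorator


@register("column_mean")
def column_mean(columns, rows, params):
    """Toy transform standing in for a real built-in: mean of one column."""
    idx = columns.index(params["column"])
    values = [row[idx] for row in rows]
    mean = sum(values) / len(values) if values else None
    return ["mean_" + params["column"]], [(mean,)]


def run(conn: sqlite3.Connection, select: str, function: str,
        params: dict, into: str) -> None:
    """Read rows with `select`, apply `function`, write the result to `into`."""
    try:
        transform = BUILTIN_FUNCTIONS[function]
    except KeyError:
        raise ValueError(f"unknown function: {function}") from None

    cursor = conn.execute(select)
    columns = [desc[0] for desc in cursor.description]
    rows = cursor.fetchall()

    out_columns, out_rows = transform(columns, rows, params)

    # Identifiers cannot be bound as SQL parameters; this sketch only quotes
    # them, so real code would need to validate table and column names.
    col_defs = ", ".join(f'"{c}"' for c in out_columns)
    conn.execute(f'CREATE TABLE "{into}" ({col_defs})')
    placeholders = ", ".join("?" for _ in out_columns)
    conn.executemany(f'INSERT INTO "{into}" VALUES ({placeholders})', out_rows)
    conn.commit()
```

Keeping the lookup in a registry means custom user functions could later be added through the same `register` call without changing `run`, which matches the plan to start with built-in functions first.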
