Refactor sql package structure #1585

Open
@typhoonzero

c.f. #1553 (comment)

Currently, the sql package contains almost all of the core code for parsing, generating Python code, and executing it. We need to split these features into separate packages so the code is easier to understand:

  1. The parser package has already been moved under the pkg folder.
  2. The feature_derivation package has already been moved under the pkg folder.
  3. Tools such as pipe and verifier have already been moved under the pkg folder.
  4. TODO: move ir to the pkg folder.
  5. TODO: move testdata to the pkg folder.

We currently have two job execution modes: workflow mode and local (repl) mode.

  • Workflow mode: we generate a Couler Python program, execute that program to generate a YAML file, then call functions under pkg/argo to submit the YAML and monitor the job's status. Each step in the workflow is a repl command.
  • Repl mode: we generate the actual Python training/prediction code and execute it to get the result.
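To make the difference between the two modes concrete, here is a minimal Go sketch that lists the stages each mode goes through. The `Mode` type, the constants, and the `Stages` function are hypothetical names used only for illustration; the stage strings are descriptive labels taken from the description above, not real function names in the code base.

```go
package main

import "fmt"

// Mode names one of the two job execution modes described in this issue.
// The type and its constants are hypothetical, for illustration only.
type Mode string

const (
	WorkflowMode Mode = "workflow"
	ReplMode     Mode = "repl"
)

// Stages returns the ordered stages a SQL program passes through in the
// given mode. The strings are descriptive labels, not real function names.
func Stages(m Mode) ([]string, error) {
	switch m {
	case WorkflowMode:
		return []string{
			"generate a Couler Python program",
			"execute the program to produce a YAML file",
			"submit the YAML via pkg/argo",
			"monitor the job's status",
		}, nil
	case ReplMode:
		return []string{
			"generate Python training/prediction code",
			"execute the Python code and collect the result",
		}, nil
	default:
		return nil, fmt.Errorf("unknown mode %q", m)
	}
}

func main() {
	for _, m := range []Mode{WorkflowMode, ReplMode} {
		stages, err := Stages(m)
		if err != nil {
			panic(err)
		}
		fmt.Println(m)
		for i, s := range stages {
			fmt.Printf("  %d. %s\n", i+1, s)
		}
	}
}
```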

We may want to use a command named step instead of repl, since that name is more meaningful. For that, we'll have:

  1. cmd/step calls pkg/step to run a step or, in the future, to generate the Python code for a step.
  2. pkg/step contains:
    1. Running the current step and getting its output (this will only be used by repl if we generate Python code for each step instead of using the step command to run a single SQL statement).
    2. pkg/step/codegen, which generates the Python code for a step.
  3. pkg/workflow contains:
    1. Submitting a workflow and monitoring its status.
    2. pkg/workflow/codegen, which generates Couler/Fluid Python code.
    3. pkg/workflow/argo, which submits workflows and gets status and logs for Argo.
    4. pkg/workflow/tekton, which submits workflows and gets status and logs for Tekton.
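One possible way to keep pkg/workflow independent of the engine is a shared interface that pkg/workflow/argo and pkg/workflow/tekton would each implement. The sketch below is only an assumption about how that could look: the `Backend` interface, its method signatures, `fakeBackend`, and `Run` are all hypothetical, and the fake backend stores jobs in memory instead of talking to Argo or Tekton.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical interface covering the three operations listed
// for pkg/workflow/argo and pkg/workflow/tekton: submit, get status, get logs.
type Backend interface {
	Submit(yaml string) (jobID string, err error)
	Status(jobID string) (string, error)
	Logs(jobID string) ([]string, error)
}

// fakeBackend is an in-memory stand-in for illustration only. A real
// implementation would call the Argo or Tekton tooling instead.
type fakeBackend struct {
	name string
	jobs map[string][]string // job ID -> log lines
}

func newFakeBackend(name string) *fakeBackend {
	return &fakeBackend{name: name, jobs: map[string][]string{}}
}

func (b *fakeBackend) Submit(yaml string) (string, error) {
	if yaml == "" {
		return "", errors.New("empty workflow definition")
	}
	id := fmt.Sprintf("%s-%d", b.name, len(b.jobs)+1)
	b.jobs[id] = []string{"submitted " + id}
	return id, nil
}

func (b *fakeBackend) Status(jobID string) (string, error) {
	if _, ok := b.jobs[jobID]; !ok {
		return "", fmt.Errorf("unknown job %q", jobID)
	}
	// The fake always reports success; a real backend would query the engine.
	return "Succeeded", nil
}

func (b *fakeBackend) Logs(jobID string) ([]string, error) {
	logs, ok := b.jobs[jobID]
	if !ok {
		return nil, fmt.Errorf("unknown job %q", jobID)
	}
	return logs, nil
}

// Run submits a workflow definition to any Backend and returns its status,
// the way pkg/workflow could do without knowing which engine is in use.
func Run(b Backend, yaml string) (string, error) {
	id, err := b.Submit(yaml)
	if err != nil {
		return "", err
	}
	return b.Status(id)
}

func main() {
	backends := []Backend{newFakeBackend("argo"), newFakeBackend("tekton")}
	for _, b := range backends {
		status, err := Run(b, "kind: Workflow")
		if err != nil {
			panic(err)
		}
		fmt.Println(status)
	}
}
```

With this shape, the code that submits and monitors a workflow depends only on `Backend`, so adding another engine would not require touching the callers.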
