Description
SQLFlow extends the SQL syntax to describe the end-to-end machine learning pipeline.
The end-to-end solution includes model serving, and the data transformation logic must stay consistent between the training and serving stages. The PyTorch preprocessing design is concluded in #2276.
TorchServe
TorchServe is the official serving framework from the PyTorch repository.
The key component representing the serving logic is the TorchServe handler. All handlers are Python classes; a handler class contains the methods `preprocess`, `inference`, and `postprocess`. TorchServe provides some built-in handlers, and we can customize and contribute handlers of our own.
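Below is a minimal sketch of a custom handler, assuming a model that accepts a JSON list of numeric features; only the `BaseHandler` methods (`preprocess`, `inference`, `postprocess`) come from TorchServe, while the class name `TabularHandler` and the `features` field are illustrative.

```python
import json

import torch
from ts.torch_handler.base_handler import BaseHandler


class TabularHandler(BaseHandler):
    """Custom handler: JSON feature vectors in, class indices out."""

    def preprocess(self, data):
        # TorchServe passes a list of requests; each item carries the
        # payload under "body" or "data".
        rows = []
        for req in data:
            payload = req.get("body") or req.get("data")
            if isinstance(payload, (bytes, bytearray)):
                payload = json.loads(payload)
            rows.append(payload["features"])
        return torch.tensor(rows, dtype=torch.float32)

    def inference(self, inputs):
        # self.model is loaded by BaseHandler.initialize().
        with torch.no_grad():
            return self.model(inputs)

    def postprocess(self, outputs):
        # Return one prediction per request in the batch.
        return outputs.argmax(dim=1).tolist()
```

Such a handler file is what `torch-model-archiver --handler` packages into the model archive served by TorchServe.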
Questions:
- Is there any performance issue, since all the handlers are written in Python? This needs performance testing/profiling.
LibTorch
Convert the PyTorch model to TorchScript using `torch.jit.trace` or `torch.jit.ScriptModule`, then load the TorchScript module with LibTorch (C++) and build our own ModelServer on top of it. Besides the model inference done by LibTorch, we need to add preprocessing, postprocessing, an RPC service, model auto-update, model instance management, and so on.
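A minimal sketch of the export step, assuming a toy `Net` module and the illustrative file name `net.pt`:

```python
import torch


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


net = Net().eval()

# Trace with a representative example input ...
traced = torch.jit.trace(net, torch.rand(1, 4))
# ... or compile the module directly, which also captures control flow.
scripted = torch.jit.script(net)

# The saved archive can be loaded from C++ with torch::jit::load("net.pt").
traced.save("net.pt")
```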
Question:
- The TorchScript only contains the main model structure; it doesn't contain the preprocessing and postprocessing logic.
- Do we need to build our model server with LibTorch from scratch? The server needs to contain preprocessing, postprocessing, an RPC server, and the other features mentioned above.
ONNX
Convert the PyTorch model to ONNX format using the API `torch.onnx.export(torch_model, ...)`, and then serve the ONNX model using ONNXRuntime.
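A minimal sketch of the export-and-serve path, assuming a toy linear model; the names `input`, `output`, and `net.onnx` are illustrative:

```python
import numpy as np
import onnxruntime as ort
import torch

net = torch.nn.Linear(4, 2).eval()
dummy_input = torch.rand(1, 4)

# Export the PyTorch model to an ONNX graph with a dynamic batch dimension.
torch.onnx.export(
    net,
    dummy_input,
    "net.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Serving side: run the exported graph with ONNXRuntime.
session = ort.InferenceSession("net.onnx")
outputs = session.run(None, {"input": np.random.rand(3, 4).astype(np.float32)})
print(outputs[0].shape)  # (3, 2)
```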
Preprocessing: ONNXRuntime provides `featurizers_ops`, which we can use at the serving stage. Can we also leverage them at the training stage?
- Proposal: We can use `featurizers_ops` as a separate package to process the data before feeding it into the model. At the stage of exporting the model for serving, we can convert the pipeline to ONNX format (`featurizers_ops` + ONNX ops in the model). An issue should be filed to confirm with the ONNX team about using `featurizers_ops` as a separate package.
Question:
- ONNX implements the commonly used operators but doesn't guarantee coverage of all the ops in PyTorch.