🚀 The feature, motivation and pitch
I've been dabbling with integrating local models into desktop apps, and one thing I found quite handy for development was being able to fetch prebuilt binaries for llama.cpp from their releases page. These can easily be dropped into a desktop application during development and spawned from the main process of whatever framework is in use (e.g. Electron, Wails, Tauri). Running the llama-server binary then allows application developers to utilise local LMs with simple API calls.
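For reference, here's roughly what that workflow looks like today from an Electron main process. This is only a sketch: the binary path, model file, and port are placeholders, and the `/v1/chat/completions` call relies on llama-server's OpenAI-compatible API.

```ts
import { spawn } from "node:child_process";
import path from "node:path";

// Spawn the prebuilt llama-server binary shipped alongside the app.
// The resources/bin layout and model filename are placeholders for this sketch.
const serverPath = path.join(process.resourcesPath, "bin", "llama-server");
const server = spawn(serverPath, ["-m", "models/llama.gguf", "--port", "8080"]);

server.stderr.on("data", (chunk) => console.log(`[llama-server] ${chunk}`));

// Once the server is up, the app talks to it over plain HTTP via
// llama-server's OpenAI-compatible chat completions endpoint.
async function complete(prompt: string): Promise<string> {
  const res = await fetch("http://127.0.0.1:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data: any = await res.json();
  return data.choices[0].message.content;
}

// Kill the child process on quit so no orphaned server is left behind.
process.on("exit", () => server.kill());
```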
In a similar vein, I was wondering whether it would be possible to do two things:
- Design an "ET server" binary which, when run with any `.pte` file, essentially serves the model for inference using the ET runtime (perhaps it already exists and I may have not found it). The server could be spawned using something like `et-server --model my_model.pte --port 8080` (a rough usage sketch follows this list).
- Distribute `et-server` as a precompiled binary for different targets (Windows, macOS, Linux, etc.) via GH Releases or another channel. I think it could make it easier for developers to bundle and distribute et-server in their desktop applications without needing to build from source.
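To make the first point more concrete, here's a purely hypothetical sketch of how a desktop app might consume such a binary. `et-server` doesn't exist today; the flags just mirror the command above, and the `/generate` endpoint and request shape are made up for illustration.

```ts
import { spawn, type ChildProcess } from "node:child_process";

// Hypothetical: spawn the bundled et-server binary with a .pte model.
// Neither the binary nor its HTTP API exists yet; the flags mirror the
// proposal above (--model, --port) and everything else is illustrative.
function startEtServer(modelPath: string, port: number): ChildProcess {
  return spawn("et-server", ["--model", modelPath, "--port", String(port)]);
}

// Hypothetical request shape: a generic "inputs" payload, since .pte models
// are not all text-in/text-out (which is exactly the hard part noted below).
async function runInference(port: number, inputs: unknown): Promise<unknown> {
  const res = await fetch(`http://127.0.0.1:${port}/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ inputs }),
  });
  return res.json();
}

const server = startEtServer("my_model.pte", 8080);
process.on("exit", () => server.kill());
```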
I can see how the first point might be particularly challenging, especially since the nature of inputs can vary depending on the model and its task. Perhaps it would be better suited for torchchat.
Either way, just some thoughts and would love to know if there are better ways to handle this pain point of bringing the ET runtime closer to app dev! Thanks for reading :)
cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel @jackzhxng