Develop ET Server Binary + Distribute Prebuilt Releases via GH? #10888

Open
@dillondesilva

🚀 The feature, motivation and pitch

I've been dabbling with integrating local models into desktop apps, and one thing I found quite handy during development was being able to fetch prebuilt llama.cpp binaries from their releases page. These can easily be dropped into a desktop application and spawned from the main process of whatever framework is in use (e.g. Electron, Wails, Tauri). Running the llama-server binary then allows application developers to utilise local LMs with simple API calls.
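To make the workflow above concrete, here is a rough sketch of how a host application might spawn a bundled server binary and wait for it to come up. It is written in Python for brevity; an Electron, Wails or Tauri app would do the equivalent from its own main process. The helper names (`build_server_command`, `wait_for_port`, `spawn_server`) are made up for illustration, and the `--model`, `--host` and `--port` flags are assumed to match the llama-server build in use.

```python
import socket
import subprocess
import time


def build_server_command(binary, model_path, port, host="127.0.0.1"):
    """Return the argv list for launching a llama-server style binary."""
    # Flag names are an assumption; check them against the bundled binary.
    return [str(binary), "--model", str(model_path), "--host", host, "--port", str(port)]


def wait_for_port(host, port, timeout=30.0, interval=0.2):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


def spawn_server(binary, model_path, port, host="127.0.0.1"):
    """Start the server as a child process and wait until its port accepts connections."""
    proc = subprocess.Popen(build_server_command(binary, model_path, port, host))
    if not wait_for_port(host, port):
        proc.terminate()
        raise RuntimeError("server did not start listening in time")
    return proc
```

An open port does not necessarily mean the model has finished loading, so a real app would probably also poll whatever readiness endpoint the server exposes before sending requests, and terminate the child process on shutdown.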

In a similar vein, I was wondering whether it would be possible to do two things:

  1. Design an "ET server" binary which, when run with any .pte file, serves the model for inference using the ET runtime (perhaps this already exists and I just haven't found it). The server could be spawned using something like et-server --model my_model.pte --port 8080.

  2. Distribute et-server as a precompiled binary for different targets (Windows, macOS, Linux, etc.) via GH Releases or another channel. I think this would make it easier for developers to bundle and distribute et-server in their desktop applications without needing to build from source.

I can see how the first point might be particularly challenging, especially since the nature of the inputs can vary depending on the model and its task. Perhaps it could be better suited for torchchat.
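For the sake of discussion, here is a very rough Python sketch of the shape such a server could take. Everything in it is an assumption rather than an existing ExecuTorch API: the `/infer` endpoint, the JSON `inputs` field, and especially `load_model`, which is a stub that echoes its inputs where a real server would load the .pte file with the ET runtime. A generic `inputs` field only papers over the varying-inputs problem, since the server would still need to know how to turn it into the tensors each model expects.

```python
import argparse
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def load_model(model_path):
    """Placeholder loader (assumption): a real server would load the .pte file
    with the ExecuTorch runtime here. This stub just echoes its inputs."""
    def run(inputs):
        return {"model": model_path, "outputs": inputs}
    return run


def make_server(run_inference, port, host="127.0.0.1"):
    """Build an HTTP server exposing POST /infer backed by run_inference."""
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/infer":
                self.send_error(404, "unknown endpoint")
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length))
                inputs = payload["inputs"]
            except (ValueError, KeyError, TypeError):
                self.send_error(400, "expected JSON body with an 'inputs' field")
                return
            body = json.dumps(run_inference(inputs)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ThreadingHTTPServer((host, port), Handler)


def serve_in_background(server):
    """Run the server on a daemon thread so the caller can keep working."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread


def post_json(url, payload):
    """Tiny client helper: POST a JSON payload and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any system proxy, since the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


def main(argv=None):
    """Hypothetical CLI entry point: et-server --model my_model.pte --port 8080."""
    parser = argparse.ArgumentParser(prog="et-server")
    parser.add_argument("--model", required=True)
    parser.add_argument("--port", type=int, default=8080)
    args = parser.parse_args(argv)
    make_server(load_model(args.model), args.port).serve_forever()
```

From the app side, usage would then be something like `post_json("http://127.0.0.1:8080/infer", {"inputs": [1, 2, 3]})` once the server is running. A distributable binary would more likely be built on the C++ runtime, so treat this purely as an illustration of the interface.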

Either way, just some thoughts and would love to know if there are better ways to handle this pain point of bringing the ET runtime closer to app dev! Thanks for reading :)

cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel @jackzhxng

Metadata

Labels: module: examples (Issues related to demos under examples/)
Status: To triage