Description
A feature request: implement the WASI-NN interface for CUDA, so that any WASM32-WASI (preview 1, preview 2, and a future preview 3) runtime that wants a CUDA-backed neural-network host can depend on a highly efficient Rust implementation instead of downloading a Docker image with gigabytes of unrelated code.
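To make the request concrete, here is a minimal sketch of the shape such a host could take. The trait below only mirrors the function names defined by the WASI-NN specification (`load`, `init_execution_context`, `set_input`, `compute`, `get_output`); `CudaBackend`, its fields, and the stubbed bodies are hypothetical placeholders marking where the actual CUDA work (weight upload, host/device copies, kernel launches) would go, not a proposed final API.

```rust
/// Opaque handles, mirroring the WASI-NN specification.
type Graph = u32;
type GraphExecutionContext = u32;

#[derive(Debug)]
enum NnError {
    InvalidArgument,
}

/// The surface a WASM runtime would call into, following the
/// function names defined by the WASI-NN spec.
trait Backend {
    /// Load a model from one or more serialized byte segments.
    fn load(&mut self, builders: &[&[u8]]) -> Result<Graph, NnError>;
    /// Bind an execution context to a loaded graph.
    fn init_execution_context(&mut self, graph: Graph) -> Result<GraphExecutionContext, NnError>;
    /// Copy an input tensor into the context (flat f32 data for brevity).
    fn set_input(&mut self, ctx: GraphExecutionContext, index: u32, tensor: &[f32]) -> Result<(), NnError>;
    /// Run inference; in a CUDA backend this is where kernels launch.
    fn compute(&mut self, ctx: GraphExecutionContext) -> Result<(), NnError>;
    /// Read an output tensor back; returns the number of values written.
    fn get_output(&mut self, ctx: GraphExecutionContext, index: u32, out: &mut [f32]) -> Result<usize, NnError>;
}

/// Hypothetical CUDA-backed implementation. A real one would hold a CUDA
/// context and device buffers (e.g. via a crate such as `cudarc`).
struct CudaBackend {
    graphs: Vec<Vec<u8>>, // serialized models, indexed by Graph handle
}

impl Backend for CudaBackend {
    fn load(&mut self, builders: &[&[u8]]) -> Result<Graph, NnError> {
        // A real backend would parse the model here and upload its
        // weights to GPU memory; this sketch only stores the raw bytes.
        let blob: Vec<u8> = builders.concat();
        if blob.is_empty() {
            return Err(NnError::InvalidArgument);
        }
        self.graphs.push(blob);
        Ok((self.graphs.len() - 1) as Graph)
    }

    fn init_execution_context(&mut self, graph: Graph) -> Result<GraphExecutionContext, NnError> {
        if graph as usize >= self.graphs.len() {
            return Err(NnError::InvalidArgument);
        }
        // Per-context GPU input/output buffers would be allocated here.
        Ok(graph)
    }

    fn set_input(&mut self, _ctx: GraphExecutionContext, _index: u32, _tensor: &[f32]) -> Result<(), NnError> {
        // Host-to-device copy of the input tensor would happen here.
        Ok(())
    }

    fn compute(&mut self, _ctx: GraphExecutionContext) -> Result<(), NnError> {
        // Launch the inference kernels / engine on the GPU.
        Ok(())
    }

    fn get_output(&mut self, _ctx: GraphExecutionContext, _index: u32, _out: &mut [f32]) -> Result<usize, NnError> {
        // Device-to-host copy of the result tensor; stubbed to 0 values.
        Ok(0)
    }
}
```

A production version would presumably plug into an existing runtime's wasi-nn backend layer (e.g. Wasmtime's) rather than define its own trait; the sketch is just meant to show how small the host surface actually is.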
Why?
At the moment, running AI code requires a Docker image with Python and CUDA that easily weighs between 5 GB and 15 GB, which is unreasonable for serverless or edge use cases.