Closed
Description
Since this is being built from the ground up, I was wondering whether it's possible to implement something similar to ggml-org/llama.cpp#3228, so that parallel inference is supported natively.
Metadata
Assignees
Labels
No labels