Unsure if this is doable in the Docker Model Runner context, but maybe someone is keen on experimenting with mlx-flash and flash-moe to bring something like this to the runner:
https://www.reddit.com/r/LocalLLaMA/comments/1s0mqto/has_anyone_tried_this_flashmoe_running_a_397b/