Releases: pwilkin/llama.cpp
b6710
b6688
rpc : add support for multiple devices (#16276)

* rpc : add support for multiple devices

  Allow rpc-server to expose multiple devices from a single endpoint.
  Change RPC protocol to include device identifier where needed.

  closes: #15210
* fixes
* use ggml_backend_reg_t
* address review comments
* fix llama-bench backend report
* address review comments, change device naming
* fix cmd order
b6586
model : add GroveMoE support (#15510)

* add GroveMoE support
* remove constexpr that fails on certain compilers
* revert crude scalar div implementation, use cast
* build_attn_inp_kv_unified -> build_attn_inp_kv
* fix build_attn
* re-apply ffn_exps regex changes
b6585
vendors: update miniaudio version (#16212)

* vendor: update miniaudio.h
* vendor: update miniaudio.h

Signed-off-by: Aaron Teo <[email protected]>
b6497
common : Fix corrupted memory error on json grammar initialization (#…
b6381
CANN: Refactor ND to NZ workspace to be per-device (#15763)

* CANN:Refactor ND to NZ workspace to be per-device in Ascend backend
  - Replaced the previous single global ND→NZ workspace with a per-device cache using unordered_map keyed by device ID.
  - Functions `release_nz_workspace`, `relloc_nz_workspace`, and `get_nz_workspace` now manage workspace independently for each device, preventing memory conflicts in multi-device / pipeline parallel scenarios.
  - This change fixes potential precision issues caused by workspace overwrites when multiple devices perform ND→NZ conversions concurrently.
* refactor
* rename
* fix review comments

Signed-off-by: noemotiovon <[email protected]>
Co-authored-by: hipudding <[email protected]>
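
The per-device cache idea in the CANN entry above can be sketched roughly as follows. This is an illustration only, not the Ascend backend's code: `NzWorkspaceCache`, `ensure`, `get`, and `release` are hypothetical names, host `malloc`/`free` stand in for device allocation, and no locking is shown, so a caller sharing one cache across threads would need its own synchronization.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Hypothetical workspace record: one buffer and its size per device.
struct NzWorkspace {
    void *      ptr  = nullptr;
    std::size_t size = 0;
};

// Hypothetical per-device cache keyed by device ID, replacing a single
// global buffer so that devices never overwrite each other's workspace.
class NzWorkspaceCache {
public:
    NzWorkspaceCache() = default;
    NzWorkspaceCache(const NzWorkspaceCache &) = delete;
    NzWorkspaceCache & operator=(const NzWorkspaceCache &) = delete;

    ~NzWorkspaceCache() {
        for (auto & kv : cache_) {
            std::free(kv.second.ptr);
        }
    }

    // Grow the device's workspace if it is smaller than `size`,
    // then return the buffer (host memory stands in for device memory).
    void * ensure(int device, std::size_t size) {
        NzWorkspace & ws = cache_[device];
        if (ws.size < size) {
            std::free(ws.ptr);
            ws.ptr  = std::malloc(size);
            ws.size = ws.ptr ? size : 0;
        }
        return ws.ptr;
    }

    // Return the device's current workspace, or nullptr if none exists.
    void * get(int device) const {
        auto it = cache_.find(device);
        return it == cache_.end() ? nullptr : it->second.ptr;
    }

    // Free only the given device's workspace; other devices are untouched.
    void release(int device) {
        auto it = cache_.find(device);
        if (it != cache_.end()) {
            std::free(it->second.ptr);
            cache_.erase(it);
        }
    }

private:
    std::unordered_map<int, NzWorkspace> cache_;
};
```

Because each device ID maps to its own entry, a conversion on device 0 cannot reallocate or overwrite the buffer device 1 is using, which is the conflict the commit message describes.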
b6360
fix: resolve unsigned int initialization warning for n_dims/size in g…
b6319
Merge branch 'ggml-org:master' into master
b6247
Merge branch 'ggml-org:master' into master
b5949
llama : fix `--reverse-prompt` crashing issue (#14794) Signed-off-by: Molly Sophia <[email protected]>