Releases: CodeLinaro/llama.cpp
b3841
common : ensure llama_batch size does not exceed max size (#9668) A crash was observed when the number of tokens added to a batch exceeded the llama_batch size. An assertion was added in llama_batch_add to guard against llama_batch size overflow.
b3828
[SYCL] add missing DLL file in package (#9577) * update oneapi to 2024.2 * use 2024.1 --------- Co-authored-by: arthw <[email protected]>
b3826
ci : fix docker build number and tag name (#9638) * ci : fix docker build number and tag name * fine-grained permissions
b3821
ggml : add AVX512DQ requirement for AVX512 builds (#9622)
b3814
threads: fix msvc build without openmp (#9615) atomic_thread_fence() was missing in MSVC builds when OpenMP is disabled.
b3810
readme : add programmable prompt engine language CLI (#9599)
b3805
Revert "[SYCL] fallback mmvq (#9088)" (#9579) This reverts commit 50addec9a532a6518146ab837a85504850627316.
b3799
ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG …
b3798
Update CUDA graph on scale change plus clear nodes/params (#9550) * Avoid using saved CUDA graph if scale changes and reset nodes/params on update Fixes https://github.com/ggerganov/llama.cpp/issues/9451 * clear before resize
b3796
quantize : improve type name parsing (#9570) * do not ignore invalid types in arg parsing * ignore case of type and ftype arguments