Releases: CodeLinaro/llama.cpp
b3796
quantize : improve type name parsing (#9570)
- quantize : do not ignore invalid types in arg parsing
- quantize : ignore case of type and ftype arguments
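The entry above describes two argument-parsing behaviors: type names are matched ignoring case, and an invalid name is rejected rather than silently ignored. A minimal sketch of that pattern is below; the type-name table, function names, and return convention are illustrative assumptions, not llama.cpp's actual `quantize` implementation.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical table of quantization type names (not llama.cpp's real list).
static const char * const k_type_names[] = { "Q4_0", "Q4_1", "Q8_0", "F16", "F32" };

// Portable case-insensitive string equality (stand-in for strcasecmp).
static bool eq_ignore_case(const char * a, const char * b) {
    for (; *a && *b; ++a, ++b) {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b)) {
            return false;
        }
    }
    return *a == *b; // equal only if both strings ended together
}

// Return the index of the matching type name, or -1 for an invalid name.
// An invalid argument is reported instead of being silently skipped.
static int parse_type_name(const char * arg) {
    for (size_t i = 0; i < sizeof(k_type_names)/sizeof(k_type_names[0]); ++i) {
        if (eq_ignore_case(arg, k_type_names[i])) {
            return (int) i;
        }
    }
    fprintf(stderr, "invalid type name: %s\n", arg);
    return -1;
}
```

With this shape, `q4_0`, `Q4_0`, and `q4_0`-style mixed-case spellings all resolve to the same type, while a typo produces an error instead of being dropped.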
b3795
ggml : fix builds (#0) ggml-ci
b3790
CUDA: fix sum.cu compilation for CUDA < 11.7 (#9562)
b3787
server : clean-up completed tasks from waiting list (#9531) ggml-ci
b3785
ggml : fix n_threads_cur initialization with one thread (#9538)
- ggml : fix n_threads_cur initialization with one thread
- Update ggml/src/ggml.c

Co-authored-by: Max Krasnyansky <[email protected]>
b3772
ggml : move common CPU backend impl to new header (#9509)
b3749
feat: remove a sampler from a chain (#9445)
- feat: remove a sampler from a chain
- fix: return removed sampler
- fix: safer casting
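The entry above adds the ability to remove a sampler from a chain, with the removed sampler returned to the caller. A minimal sketch of that ownership-transferring removal is below; the `sampler` and `sampler_chain` types and the `chain_remove` name are illustrative assumptions, not llama.cpp's public API, which defines its own sampler-chain functions in its headers.

```c
#include <stddef.h>

// Hypothetical sampler and chain types for illustration only.
typedef struct sampler {
    const char * name;
} sampler;

typedef struct sampler_chain {
    sampler ** items; // borrowed array of sampler pointers
    int        n;     // number of samplers currently in the chain
} sampler_chain;

// Remove the sampler at index i and hand it back to the caller instead of
// freeing it, so ownership transfers out of the chain. Returns NULL for an
// out-of-range index, leaving the chain untouched.
static sampler * chain_remove(sampler_chain * chain, int i) {
    if (chain == NULL || i < 0 || i >= chain->n) {
        return NULL;
    }
    sampler * removed = chain->items[i];
    for (int j = i; j + 1 < chain->n; ++j) {
        chain->items[j] = chain->items[j + 1]; // close the gap
    }
    chain->n--;
    return removed;
}
```

Returning the removed element (rather than destroying it) lets callers re-insert the sampler elsewhere or free it themselves, which matches the "return removed sampler" fix noted in the release.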
b3733
llama : skip token bounds check when evaluating embeddings (#9437)
b3713
llama : minor sampling refactor (2) (#9386)
b3646
Correct typo run_llama2.sh > run-llama2.sh (#9149)