Releases: CodeLinaro/llama.cpp

b3799

21 Sep 17:43
d09770c
ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG …

b3798

21 Sep 05:37
41f4778
Update CUDA graph on scale change plus clear nodes/params (#9550)

* Avoid using saved CUDA graph if scale changes and reset nodes/params on update

Fixes https://github.com/ggerganov/llama.cpp/issues/9451

* clear before resize
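
The notes above describe a cache-invalidation rule: a saved CUDA graph is reused only while the parameter it was captured with (here, a scale value) is unchanged, and the node/param arrays are cleared before being resized for a rebuild. A minimal C++ sketch of that rule, using hypothetical names (cached_graph, build_graph) rather than the actual ggml-cuda code:

```cpp
// Conceptual sketch only; names are hypothetical, not the ggml-cuda implementation.
#include <cstdio>
#include <vector>

struct cached_graph {
    bool  valid      = false;
    float last_scale = 0.0f;    // parameter the cached graph depends on
    std::vector<int> nodes;     // stand-ins for captured nodes/params
    std::vector<int> params;
};

static void build_graph(cached_graph & g, float scale, int n_nodes) {
    // clear before resize so no stale entries survive a rebuild
    g.nodes.clear();
    g.params.clear();
    g.nodes.resize(n_nodes);
    g.params.resize(n_nodes);
    g.last_scale = scale;
    g.valid      = true;
    std::printf("rebuilt graph with %d nodes at scale %.2f\n", n_nodes, scale);
}

static void evaluate(cached_graph & g, float scale, int n_nodes) {
    // avoid reusing the saved graph if the scale has changed
    if (!g.valid || scale != g.last_scale) {
        build_graph(g, scale, n_nodes);
    } else {
        std::printf("reusing saved graph at scale %.2f\n", scale);
    }
}

int main() {
    cached_graph g;
    evaluate(g, 1.0f, 8); // first call builds the graph
    evaluate(g, 1.0f, 8); // same scale: reuse
    evaluate(g, 0.5f, 8); // scale changed: rebuild, clearing nodes/params first
    return 0;
}
```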

b3796

20 Sep 22:42
6335114
quantize : improve type name parsing (#9570)

quantize : do not ignore invalid types in arg parsing

quantize : ignore case of type and ftype arguments
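
A small sketch of the parsing behavior described above: type names are matched case-insensitively, and an unknown name is reported as an error instead of being silently ignored. The parse_ftype helper and its lookup table are illustrative, not the actual quantize tool code:

```cpp
// Hypothetical helper: case-insensitive quant-type lookup that rejects unknown names.
#include <algorithm>
#include <cctype>
#include <cstdio>
#include <map>
#include <optional>
#include <string>

static std::optional<int> parse_ftype(std::string name) {
    // illustrative subset of name -> ftype mappings
    static const std::map<std::string, int> types = {
        { "Q4_0", 2 }, { "Q8_0", 7 }, { "Q4_K_M", 15 },
    };
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    auto it = types.find(name);
    if (it == types.end()) {
        return std::nullopt; // invalid type: caller reports it instead of ignoring it
    }
    return it->second;
}

int main(int argc, char ** argv) {
    const std::string arg = argc > 1 ? argv[1] : "q4_k_m"; // lower case is accepted
    if (auto ftype = parse_ftype(arg)) {
        std::printf("ftype = %d\n", *ftype);
        return 0;
    }
    std::fprintf(stderr, "error: invalid quantization type '%s'\n", arg.c_str());
    return 1;
}
```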

b3795

20 Sep 19:21
ggml : fix builds (#0)

ggml-ci

b3790

20 Sep 18:46
5cb12f6
CUDA: fix sum.cu compilation for CUDA < 11.7 (#9562)

b3787

20 Sep 00:32
6026da5
server : clean-up completed tasks from waiting list (#9531)

ggml-ci
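
A hypothetical sketch of the clean-up idea: once a task completes, its id is erased from the waiting set so the set does not keep growing with finished tasks. The names (task_queue, waiting_task_ids) are illustrative, not the server's actual types:

```cpp
// Illustrative only: remove completed task ids from the waiting set.
#include <cstdio>
#include <unordered_set>

struct task_queue {
    std::unordered_set<int> waiting_task_ids;

    void add_waiting(int id) { waiting_task_ids.insert(id); }

    void on_task_finished(int id) {
        // clean-up: a completed task no longer needs to be waited on
        waiting_task_ids.erase(id);
    }
};

int main() {
    task_queue q;
    q.add_waiting(1);
    q.add_waiting(2);
    q.on_task_finished(1);
    std::printf("still waiting on %zu task(s)\n", q.waiting_task_ids.size());
    return 0;
}
```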

b3785

18 Sep 18:22
64c6af3
ggml : fix n_threads_cur initialization with one thread (#9538)

* ggml : fix n_threads_cur initialization with one thread

* Update ggml/src/ggml.c

---------

Co-authored-by: Max Krasnyansky <[email protected]>

b3772

16 Sep 18:46
23e0d70
ggml : move common CPU backend impl to new header (#9509)

b3749

13 Sep 05:18
bd35cb0
feat: remove a sampler from a chain (#9445)

* feat: remove a sampler from a chain

* fix: return removed sampler

* fix: safer casting
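
This release adds a chain-removal call that detaches a sampler by index and returns it, so the caller takes ownership of the removed sampler. A minimal sketch, assuming the llama.h sampler-chain API of this period (llama_sampler_chain_init/add/remove):

```cpp
// Minimal sketch assuming the llama.h sampler-chain API around these releases;
// the removal call returns the detached sampler, so the caller frees it
// separately from the chain.
#include <cstdio>
#include "llama.h"

int main() {
    llama_sampler_chain_params params = llama_sampler_chain_default_params();
    llama_sampler * chain = llama_sampler_chain_init(params);

    llama_sampler_chain_add(chain, llama_sampler_init_top_k(40));
    llama_sampler_chain_add(chain, llama_sampler_init_temp(0.8f));

    // detach the top-k sampler (index 0); ownership moves to the caller
    llama_sampler * removed = llama_sampler_chain_remove(chain, 0);
    llama_sampler_free(removed);

    std::printf("samplers left in chain: %d\n", llama_sampler_chain_n(chain));

    llama_sampler_free(chain); // frees the chain and any samplers it still owns
    return 0;
}
```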

b3733

11 Sep 18:05
1b28061
llama : skip token bounds check when evaluating embeddings (#9437)
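
The idea behind this change: token-id bounds checks only make sense when the batch carries token ids; embedding inputs are raw float vectors, so the vocab-range check is skipped for them. A conceptual sketch with hypothetical types (batch_input, validate), not the llama.cpp internals:

```cpp
// Hypothetical types: bounds-check token ids only when tokens are present.
#include <cstdint>
#include <cstdio>
#include <vector>

struct batch_input {
    std::vector<int32_t> tokens; // empty when embeddings are supplied instead
    std::vector<float>   embd;   // non-empty for embedding input
};

static bool validate(const batch_input & batch, int32_t n_vocab) {
    if (!batch.tokens.empty()) {
        for (int32_t tok : batch.tokens) {
            if (tok < 0 || tok >= n_vocab) {
                std::fprintf(stderr, "invalid token id %d\n", (int) tok);
                return false;
            }
        }
    }
    // embedding input: the token bounds check is skipped entirely
    return true;
}

int main() {
    batch_input emb_batch;
    emb_batch.embd.assign(4096, 0.0f);
    std::printf("embedding batch valid: %d\n", validate(emb_batch, 32000) ? 1 : 0);
    return 0;
}
```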