Releases: CodeLinaro/llama.cpp

b4255

04 Dec 00:00
cc98896
vulkan: optimize and reenable split_k (#10637)

Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to fill the shaders.
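
As a rough illustration of the split-k idea mentioned above, the sketch below runs on the CPU and is not the Vulkan shader code. It splits the shared K dimension of a matrix multiply into slices, computes one partial product per slice, and then sums the partials in a separate reduce step. The function name `matmul_split_k` and its parameters are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU sketch: C = A * B with row-major A (m x k) and B (k x n).
// The K dimension is cut into `splits` slices; each slice writes its own
// partial result, and a final pass reduces the partials into C.
std::vector<float> matmul_split_k(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t m, std::size_t n, std::size_t k,
                                  std::size_t splits) {
    assert(a.size() == m * k && b.size() == k * n);
    if (splits == 0) splits = 1;
    const std::size_t chunk = (k + splits - 1) / splits;  // ceil(k / splits)

    // One m x n partial buffer per slice. On a GPU each slice could be
    // handled by separate workgroups, which is where the extra parallelism
    // would come from.
    std::vector<float> partial(splits * m * n, 0.0f);
    for (std::size_t s = 0; s < splits; ++s) {
        const std::size_t k0 = std::min(k, s * chunk);
        const std::size_t k1 = std::min(k, k0 + chunk);
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t j = 0; j < n; ++j)
                for (std::size_t kk = k0; kk < k1; ++kk)
                    partial[s * m * n + i * n + j] += a[i * k + kk] * b[kk * n + j];
    }

    // Reduce step: sum the per-slice partials element by element.
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t s = 0; s < splits; ++s)
        for (std::size_t idx = 0; idx < m * n; ++idx)
            c[idx] += partial[s * m * n + idx];
    return c;
}
```

The reduce pass costs extra memory traffic, so splitting is only likely to pay off when the output is too small to keep the device busy on its own.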

b4242

03 Dec 00:51
642330a
llama : add enum for built-in chat templates (#10623)

* llama : add enum for supported chat templates

* use "built-in" instead of "supported"

* arg: print list of built-in templates

* fix test

* update server README
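
One way to picture an enum of built-in templates together with a printable list is the sketch below. It is only an assumption about the general shape of such a design: the type `demo_chat_template`, the helper functions, and the two template names are placeholders for this example and are not the identifiers used in llama.cpp.

```cpp
#include <map>
#include <string>

// Hypothetical enum of built-in templates (placeholder entries).
enum class demo_chat_template { chatml, llama2, unknown };

// Single table that maps a template name to its enum value.
const std::map<std::string, demo_chat_template>& demo_builtin_templates() {
    static const std::map<std::string, demo_chat_template> table = {
        {"chatml", demo_chat_template::chatml},
        {"llama2", demo_chat_template::llama2},
    };
    return table;
}

// Look up a template by name; names not in the table map to `unknown`.
demo_chat_template demo_template_from_name(const std::string& name) {
    const auto& table = demo_builtin_templates();
    const auto it = table.find(name);
    return it == table.end() ? demo_chat_template::unknown : it->second;
}

// Build a comma-separated list of names, e.g. for a help message.
std::string demo_list_templates() {
    std::string out;
    for (const auto& entry : demo_builtin_templates()) {
        if (!out.empty()) out += ", ";
        out += entry.first;
    }
    return out;
}
```

Keeping the names in one table means the lookup and the printed list cannot drift apart.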

b4226

30 Nov 00:13
7cc2d2c
ggml : move AMX to the CPU backend (#10570)

* ggml : move AMX to the CPU backend

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b4224

29 Nov 19:42
3a8e9af
imatrix : support combine-only (#10492)

* imatrix-combine-only idea

* ensured that behavior is consistent with the log

b4215

28 Nov 20:28
dc22344
ggml : remove redundant copyright notice + update authors

b4202

28 Nov 00:22
9f91251
common : fix duplicated file name with hf_repo and hf_file (#10550)

b4191

26 Nov 23:28
c9b00a7
ci : fix cuda releases (#10532)

b4174

26 Nov 04:04
0eb4e12
vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484)

vulkan-shaders-gen was not parsing the --no-clean argument correctly.
The previous code only handled arguments that take a value, and
--no-clean takes none, so it was mishandled. This commit makes the
parser handle arguments that don't have values.
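
A minimal sketch of a parser that accepts both kinds of argument is shown below. It is an illustration under assumptions, not the actual vulkan-shaders-gen code: the function `parse_args` is hypothetical, and it treats any `--name` that is not followed by a plain token as a valueless flag.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical parser: "--name value" pairs store the value, while a bare
// "--name" (followed by another option or by nothing) stores an empty string.
std::map<std::string, std::string> parse_args(const std::vector<std::string>& argv) {
    std::map<std::string, std::string> args;
    for (std::size_t i = 0; i < argv.size(); ++i) {
        const std::string& arg = argv[i];
        if (arg.rfind("--", 0) != 0) {
            continue;  // ignore stray tokens that are not options
        }
        const bool has_value =
            i + 1 < argv.size() && argv[i + 1].rfind("--", 0) != 0;
        if (has_value) {
            args[arg] = argv[i + 1];
            ++i;  // the value has been consumed
        } else {
            args[arg] = "";  // flag without a value, such as --no-clean
        }
    }
    return args;
}
```

With this approach, `--no-clean` is recorded whether it appears first, last, or between options that take values.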

b4173

25 Nov 23:57
0cc6375
Introduce llama-run (#10291)

It's like simple-chat but it uses smart pointers to avoid manual
memory cleanups. Fewer memory leaks in the code now. Avoids printing
multiple dots. Splits code into smaller functions. Uses no exception
handling.

Signed-off-by: Eric Curtin <[email protected]>
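
The smart-pointer pattern described above can be sketched as follows. This is a generic illustration and not code from llama-run: `demo_model`, `demo_model_load`, and `demo_model_free` are hypothetical stand-ins for a C-style create/free API, and the live-object counter exists only to make the cleanup observable.

```cpp
#include <memory>
#include <new>

// Hypothetical C-style resource with explicit create/free functions.
struct demo_model { int id; };
static int g_live_models = 0;  // counts objects not yet freed

demo_model* demo_model_load(int id) {
    demo_model* m = new (std::nothrow) demo_model{id};
    if (m != nullptr) ++g_live_models;
    return m;  // nullptr on allocation failure, no exception thrown
}

void demo_model_free(demo_model* m) {
    if (m == nullptr) return;
    --g_live_models;
    delete m;
}

// Custom deleter so std::unique_ptr calls the matching free function.
struct demo_model_deleter {
    void operator()(demo_model* m) const { demo_model_free(m); }
};
using demo_model_ptr = std::unique_ptr<demo_model, demo_model_deleter>;

// The model is released on every return path without a manual free call.
// Errors are reported through the return value instead of exceptions.
int use_model(int id) {
    demo_model_ptr model(demo_model_load(id));
    if (!model) return -1;
    return model->id;
}
```

Because the deleter runs whenever the owning pointer goes out of scope, early returns cannot skip the cleanup.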

b4170

25 Nov 21:25
47f931c
server : enable cache_prompt by default (#10501)

ggml-ci