Releases: CodeLinaro/llama.cpp
b4255
vulkan: optimize and reenable split_k (#10637) Use vector loads when possible in mul_mat_split_k_reduce. Use split_k when there aren't enough workgroups to fill the shaders.
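The split-k idea referenced here: when M×N alone produces too few workgroups to keep the GPU busy, the K dimension is also partitioned across workgroups, each writing a partial result that a reduce pass then sums. Below is a minimal CPU-side sketch of that decomposition under assumed names; it is illustrative only and not the actual Vulkan shader code.

```cpp
// Illustrative split-k matrix multiply (hypothetical names, not the real shader):
// each "split" covers a slice of K and accumulates a partial C; a final reduce
// pass sums the partials, analogous in spirit to mul_mat_split_k_reduce.
#include <algorithm>
#include <vector>

static void matmul_split_k(const float * A, const float * B, float * C,
                           int M, int N, int K, int splits) {
    std::vector<float> partial((size_t) splits * M * N, 0.0f);
    const int chunk = (K + splits - 1) / splits;

    // "Dispatch": each split handles K indices [k0, k1) into its own buffer.
    for (int s = 0; s < splits; ++s) {
        const int k0 = s * chunk;
        const int k1 = std::min(K, k0 + chunk);
        float * Cs = partial.data() + (size_t) s * M * N;
        for (int i = 0; i < M; ++i)
            for (int k = k0; k < k1; ++k)
                for (int j = 0; j < N; ++j)
                    Cs[i * N + j] += A[i * K + k] * B[k * N + j];
    }

    // "split_k_reduce": sum the partial results into the final C.
    for (int i = 0; i < M * N; ++i) {
        float acc = 0.0f;
        for (int s = 0; s < splits; ++s)
            acc += partial[(size_t) s * M * N + i];
        C[i] = acc;
    }
}
```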
b4242
llama : add enum for built-in chat templates (#10623)
* llama : add enum for supported chat templates
* use "built-in" instead of "supported"
* arg: print list of built-in templates
* fix test
* update server README
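For context on the b4242 entry above, a sketch of what mapping built-in template names to an enum can look like; the enum values and helper name here are hypothetical and the actual llama.cpp definitions may differ.

```cpp
// Hypothetical sketch: resolve a chat template name to an enum value.
#include <map>
#include <string>

enum chat_template_example {
    TMPL_EXAMPLE_UNKNOWN,
    TMPL_EXAMPLE_CHATML,
    TMPL_EXAMPLE_LLAMA2,
    TMPL_EXAMPLE_ZEPHYR,
};

static chat_template_example chat_template_from_name(const std::string & name) {
    static const std::map<std::string, chat_template_example> lut = {
        { "chatml", TMPL_EXAMPLE_CHATML },
        { "llama2", TMPL_EXAMPLE_LLAMA2 },
        { "zephyr", TMPL_EXAMPLE_ZEPHYR },
    };
    const auto it = lut.find(name);
    return it == lut.end() ? TMPL_EXAMPLE_UNKNOWN : it->second;
}
```

An enum lookup like this is also what makes it cheap to print the list of built-in template names for the `--list` style output mentioned in the changelog.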
b4226
ggml : move AMX to the CPU backend (#10570)
Co-authored-by: Georgi Gerganov <[email protected]>
b4224
imatrix : support combine-only (#10492)
* imatrix combine-only idea
* ensured that the behavior is consistent with the log
b4215
ggml : remove redundant copyright notice + update authors
b4202
common : fix duplicated file name with hf_repo and hf_file (#10550)
b4191
ci : fix cuda releases (#10532)
b4174
vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484) vulkan-shaders-gen was not parsing the --no-clean argument correctly: the previous code only handled arguments that take a value, and --no-clean takes none, so it was skipped. This commit adds correct parsing of arguments that don't have values.
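A minimal sketch of the parsing pitfall described above, under assumed names (this is not the actual vulkan-shaders-gen source): flags that take no value have to be recognized separately from "--key value" pairs, otherwise they get consumed as if they had a value or dropped entirely.

```cpp
// Hypothetical argument parser: value-less flags such as --no-clean are stored
// with an empty value instead of being treated as "--key value" pairs.
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

static std::map<std::string, std::string> parse_args(const std::vector<std::string> & args) {
    static const std::set<std::string> value_less = { "--no-clean" };
    std::map<std::string, std::string> out;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string & arg = args[i];
        if (value_less.count(arg)) {
            out[arg] = "";            // flag is present, takes no value
        } else if (i + 1 < args.size()) {
            out[arg] = args[++i];     // ordinary --key value pair
        }
    }
    return out;
}
```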
b4173
Introduce llama-run (#10291) It is like simple-chat but uses smart pointers to avoid manual memory cleanup, leaving fewer memory leaks in the code. It avoids printing multiple dots, splits the code into smaller functions, and uses no exception handling. Signed-off-by: Eric Curtin <[email protected]>
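The smart-pointer pattern the entry refers to can be sketched as std::unique_ptr with custom deleters around the C handles, so cleanup happens automatically on every exit path. This is only a sketch; it assumes llama_free_model and llama_free as the cleanup functions, so check llama.h in your checkout for the exact names llama-run actually wraps.

```cpp
// Sketch: RAII wrappers for llama.cpp C handles via custom deleters.
// Assumes llama_free_model / llama_free exist with these signatures.
#include <memory>
#include "llama.h"

struct llama_model_deleter   { void operator()(llama_model * m)   const { llama_free_model(m); } };
struct llama_context_deleter { void operator()(llama_context * c) const { llama_free(c); } };

using llama_model_ptr   = std::unique_ptr<llama_model,   llama_model_deleter>;
using llama_context_ptr = std::unique_ptr<llama_context, llama_context_deleter>;

// Usage sketch: the handles are released in reverse order when they go out of
// scope, even on early returns, so no manual cleanup paths are needed.
// llama_model_ptr   model(llama_load_model_from_file(path, mparams));
// llama_context_ptr ctx(llama_new_context_with_model(model.get(), cparams));
```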
b4170
server : enable cache_prompt by default (#10501) ggml-ci