Releases · ggerganov/llama.cpp

04 Feb 01:00

cde3833

b4628 Latest

`tool-call`: allow `--chat-template chatml` w/ `--jinja`, default to …

Assets 23

cudart-llama-bin-win-cu11.7-x64.zip

303 MB 2025-02-04T01:00:43Z
cudart-llama-bin-win-cu12.4-x64.zip

373 MB 2025-02-04T01:00:55Z
llama-b4628-bin-macos-arm64.zip

25.3 MB 2025-02-04T01:01:08Z
llama-b4628-bin-macos-x64.zip

27.1 MB 2025-02-04T01:01:10Z
llama-b4628-bin-ubuntu-x64.zip

29 MB 2025-02-04T01:01:11Z
llama-b4628-bin-win-avx-x64.zip

15.4 MB 2025-02-04T01:01:13Z
llama-b4628-bin-win-avx2-x64.zip

15.4 MB 2025-02-04T01:01:14Z
llama-b4628-bin-win-avx512-x64.zip

15.4 MB 2025-02-04T01:01:15Z
llama-b4628-bin-win-cuda-cu11.7-x64.zip

150 MB 2025-02-04T01:01:16Z
llama-b4628-bin-win-cuda-cu12.4-x64.zip

150 MB 2025-02-04T01:01:22Z
Source code (zip)

2025-02-03T23:49:27Z
Source code (tar.gz)

2025-02-03T23:49:27Z

03 Feb 12:57

github-actions

b4623

21c84b5

b4623

CUDA: fix Volta FlashAttention logic (#11615)

Assets 23

02 Feb 23:22

github-actions

b4621

6eecde3

b4621

HIP: fix flash_attn_stream_k_fixup warning (#11604)

Assets 23

02 Feb 22:24

github-actions

b4620

396856b

b4620

CUDA/HIP: add support for selectable warp size to mmv (#11519)

Assets 23

02 Feb 21:55

github-actions

b4619

4d0598e

b4619

HIP: add GGML_CUDA_CC_IS_* for amd familys as increasing cc archtectu…

Assets 23

02 Feb 20:57

github-actions

b4618

90f9b88

b4618

nit: more informative crash when grammar sampler fails (#11593)

Assets 23

02 Feb 19:12

github-actions

b4617

864a0b6

b4617

CUDA: use mma PTX instructions for FlashAttention (#11583)

* CUDA: use mma PTX instructions for FlashAttention

* __shfl_sync workaround for movmatrix

* add __shfl_sync to HIP

Co-authored-by: Diego Devesa

Assets 23

02 Feb 15:56

github-actions

b4616

84ec8a5

b4616

Name colors (#11573)

It's more descriptive, use #define's so we can use compile-time
concatenations.

Signed-off-by: Eric Curtin
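
The "compile-time concatenations" mentioned in this commit message rely on a standard C/C++ rule: adjacent string literals are merged by the compiler, so colors defined with `#define` can be spliced directly into other literals. The following is only a minimal sketch of that idea. The macro names (`COL_RESET`, `COL_BOLD`, `COL_RED`) and the helper `make_error_line` are hypothetical and are not taken from the actual change in #11573.

```cpp
#include <string>

// Hypothetical color macros (illustrative names, not the ones used in llama.cpp).
// Each expands to a plain string literal holding an ANSI escape sequence.
#define COL_RESET "\033[0m"
#define COL_BOLD  "\033[1m"
#define COL_RED   "\033[31m"

// Adjacent string literals are concatenated at compile time, so this is a
// single constant array with no runtime string building involved.
static const char ERROR_PREFIX[] = COL_BOLD COL_RED "error: " COL_RESET;

// Hypothetical helper: prepend the colored prefix to a message.
std::string make_error_line(const std::string & msg) {
    return std::string(ERROR_PREFIX) + msg;
}
```

With `const char *` variables instead of macros, the expression `COL_BOLD COL_RED "error: "` would not compile, which is presumably why the commit prefers `#define`.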

Assets 23

02 Feb 10:26

github-actions

b4615

bfcce4d

b4615

`tool-call`: support Command R7B (+ return tool_plan "thoughts" in AP…

Assets 23

02 Feb 10:11

github-actions

b4614

6980448

b4614

Fix exotic ci env that lacks ostringstream::str (#11581)

Assets 23
