Releases: ggerganov/llama.cpp
b4659
ggml : optimize and build warning fix for LoongArch (#11709)
* ggml : optimize convert f32<->f16 for loongarch_asx
* ggml : optimize loongarch_asx extend i16,i8,u8 to i32,i16
* ggml : fix warnings when running the CPU CI locally on LoongArch
b4658
llama : fix old glm4 models (#11670)
b4657
sync : ggml
b4651
build : fix llama.pc (#11658) Signed-off-by: Adrien Gallouët <[email protected]>
b4649
vulkan: optimize coopmat2 iq2/iq3 callbacks (#11521)
* vulkan: optimize coopmat2 iq2/iq3 callbacks
* build: trigger CI on GLSL compute shader changes
b4648
vulkan: initial support for IQ4_XS quantization (#11501)
b4647
vulkan: use smaller combined allocations to avoid fragmentation (#11551)
b4646
metal : avoid breaking build when metal API predates TARGET_OS_VISION…
b4644
metal : adjust support conditions for norm operators (#11671)
cont #11659
ggml-ci
b4643
CUDA: support for mat. mul. with ne03 != ne13 (#11656)