Releases: ggerganov/llama.cpp
b4659
ggml : optimize and build warning fix for LoongArch (#11709)
* ggml : optimize convert f32<->f16 for loongarch_asx
* ggml : optimize loongarch_asx extend i16,i8,u8 to i32,i16
* ggml : fix warnings when running the CPU CI locally on LoongArch
b4658
llama : fix old glm4 models (#11670)
b4657
sync : ggml
b4651
build : fix llama.pc (#11658) Signed-off-by: Adrien Gallouët <[email protected]>
b4649
vulkan: optimize coopmat2 iq2/iq3 callbacks (#11521)
* vulkan: optimize coopmat2 iq2/iq3 callbacks
* build: trigger CI on GLSL compute shader changes
b4648
vulkan: initial support for IQ4_XS quantization (#11501)
b4647
vulkan: use smaller combined allocations to avoid fragmentation (#11551)
b4646
metal : avoid breaking build when metal API predates TARGET_OS_VISION…
b4644
metal : adjust support conditions for norm operators (#11671)
cont #11659
ggml-ci
b4643
CUDA: support for mat. mul. with ne03 != ne13 (#11656)