Releases: ServeurpersoCom/llama.cpp

b6670

02 Oct 14:20
91a2a56

musa: update compile flags (#16265)

Signed-off-by: Xiaodong Ye <[email protected]>

b6668

02 Oct 09:27
f09aefa

ci: update vulkan ci (#16294)

b6660

01 Oct 18:34
4201dea

common: introduce http.h for httplib-based client (#16373)

* common: introduce http.h for httplib-based client

This change moves cpp-httplib based URL parsing and client setup into
a new header `common/http.h`, and integrates it in `arg.cpp` and `run.cpp`.

It is an iteration towards removing libcurl, while intentionally
minimizing changes to existing code to guarantee the same behavior when
`LLAMA_CURL` is used.

Signed-off-by: Adrien Gallouët <[email protected]>

* tools : add missing WIN32_LEAN_AND_MEAN

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
Signed-off-by: Adrien Gallouët <[email protected]>
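
As an illustration of the kind of URL handling this entry describes, here is a minimal sketch that splits a URL into the pieces an HTTP client typically needs. It uses only the standard library. The names `url_parts` and `split_url` are hypothetical and are not taken from `common/http.h`, whose actual interface is not shown in these notes.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical container for the parts of an http(s) URL.
struct url_parts {
    std::string scheme;    // "http" or "https"
    std::string authority; // host with an optional ":port"
    std::string path;      // always starts with '/', includes any query string
};

// Hypothetical helper: split "scheme://authority/path" into its parts.
// Throws std::invalid_argument on a missing scheme, unsupported scheme, or empty host.
url_parts split_url(const std::string & url) {
    const std::string sep = "://";
    const std::size_t scheme_end = url.find(sep);
    if (scheme_end == std::string::npos) {
        throw std::invalid_argument("missing scheme: " + url);
    }

    url_parts parts;
    parts.scheme = url.substr(0, scheme_end);
    if (parts.scheme != "http" && parts.scheme != "https") {
        throw std::invalid_argument("unsupported scheme: " + parts.scheme);
    }

    const std::size_t authority_begin = scheme_end + sep.size();
    const std::size_t path_begin      = url.find('/', authority_begin);
    if (path_begin == std::string::npos) {
        // No path given: default to the root.
        parts.authority = url.substr(authority_begin);
        parts.path      = "/";
    } else {
        parts.authority = url.substr(authority_begin, path_begin - authority_begin);
        parts.path      = url.substr(path_begin);
    }

    if (parts.authority.empty()) {
        throw std::invalid_argument("missing host: " + url);
    }
    return parts;
}
```

With a split like this, `scheme + "://" + authority` can be handed to the client constructor and `path` to the individual request, which is one common way to drive a cpp-httplib style client.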

b6653

01 Oct 03:56
e74c92e

model : support GLM 4.6 (make a few NextN/MTP tensors not required) (…

b6651

30 Sep 18:39
bf6f3b3

common : disable progress bar without a tty (#16352)

* common : disable progress bar without a tty

Signed-off-by: Adrien Gallouët <[email protected]>

* Add missing headers

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
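
The behavior named in this entry can be sketched as follows: check whether the output stream is attached to a terminal and skip drawing the bar when it is not (for example, when output is redirected to a file or a CI log). This is a minimal sketch, not the code from the commit. The helper names `stream_is_tty`, `render_bar`, and `report_progress` are hypothetical, as is the choice of `stderr` and a 40-column bar.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#if defined(_WIN32)
#include <io.h>
#else
#include <unistd.h>
#endif

// Hypothetical helper: true when the given stream is attached to a terminal.
bool stream_is_tty(FILE * stream) {
#if defined(_WIN32)
    return _isatty(_fileno(stream)) != 0;
#else
    return isatty(fileno(stream)) != 0;
#endif
}

// Hypothetical helper: render a fixed-width bar for a fraction clamped to [0, 1].
std::string render_bar(double fraction, int width) {
    assert(width > 0);
    if (fraction < 0.0) fraction = 0.0;
    if (fraction > 1.0) fraction = 1.0;
    const int filled = static_cast<int>(fraction * width);
    return "[" + std::string(filled, '=') + std::string(width - filled, ' ') + "]";
}

// Draw the bar only when stderr is a terminal; stay silent otherwise so that
// redirected output is not filled with carriage returns and partial bars.
void report_progress(double fraction) {
    if (!stream_is_tty(stderr)) {
        return;
    }
    std::fprintf(stderr, "\r%s", render_bar(fraction, 40).c_str());
    std::fflush(stderr);
}
```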

b6648

30 Sep 17:49
8d78cd2

ggml webgpu: support for rope,div,sub,glu,scale,cont operators (#16187)

* Work on rope

* Simplify inplace operation generation and combine mul/add generation

* Work on rope variants

* implement neox rope

* rope complete

* Add sub,div,glu operators

* implement scale op

* Update cpy shader to handle cont/more types

* formatting

* Update test vars printing for rope,rms_norm

* Avoid ROPE hardcoded constants

* Add TODO to change ROPE constants to enum

Co-authored-by: Georgi Gerganov <[email protected]>

* fix TODO comment

---------

Co-authored-by: Georgi Gerganov <[email protected]>
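
For readers unfamiliar with the two rope variants mentioned in the commit list, the difference is in which elements are paired for rotation: the normal variant rotates adjacent pairs `(x[2i], x[2i+1])`, while the NeoX variant rotates `(x[i], x[i + n/2])`. The CPU sketch below illustrates only that pairing. It is not the WebGPU shader from this change, the name `rope_apply` is hypothetical, the base of 10000 is the conventional default assumed here, and frequency scaling and other extensions are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class rope_mode { normal, neox };

// Hypothetical reference: rotate one head vector in place for a token at
// position `pos`. Pair i is rotated by the angle pos * theta_base^(-2i/n).
void rope_apply(std::vector<float> & x, int pos, rope_mode mode, float theta_base = 10000.0f) {
    const std::size_t n = x.size();
    assert(n % 2 == 0);
    const std::size_t half = n / 2;

    for (std::size_t i = 0; i < half; ++i) {
        const float theta = pos * std::pow(theta_base, -2.0f * i / n);
        const float c = std::cos(theta);
        const float s = std::sin(theta);

        // normal: adjacent elements; neox: elements half a vector apart.
        const std::size_t a = (mode == rope_mode::normal) ? 2 * i     : i;
        const std::size_t b = (mode == rope_mode::normal) ? 2 * i + 1 : i + half;

        const float x0 = x[a];
        const float x1 = x[b];
        x[a] = x0 * c - x1 * s;
        x[b] = x0 * s + x1 * c;
    }
}
```

Both variants apply the same set of 2D rotations, so each preserves the vector norm; they differ only in memory layout, which is why a model's weights must be used with the variant they were trained for.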

b6646

30 Sep 16:30
364a7a6

common : remove common_has_curl() (#16351)

`test-arg-parser.cpp` has been updated to work consistently,
regardless of whether CURL or SSL support is available, and
now always points to `ggml.ai`.

The previous timeout test has been removed, but it can be
added back by providing a dedicated URL under `ggml.ai`.

Signed-off-by: Adrien Gallouët <[email protected]>

b6644

30 Sep 11:23

ggml : bump version to 0.9.4 (ggml/1363)

b6643

30 Sep 08:52
a014310

cuda : Enable CUDA Graph usage for Nemotron Nano v2 (NemotronH) (#16328)

* Fix Nemotron Nano v2 9B not executing as CUDA Graph on NVIDIA GPUs

* fix to ensure test-backend-ops check passes

b6639

30 Sep 06:44
de41f2b

codeowners: add codeowners for opencl backend (#16344)