Releases: CodeLinaro/llama.cpp

b4450

09 Jan 05:26
8d59d91
fix: add missing msg in static_assert (#11143)

Signed-off-by: hydai <[email protected]>

b4382

23 Dec 10:17
86bf31c
rpc-server : add support for the SYCL backend (#10934)

b4324

13 Dec 19:55
c27ac67
Opt class for positional argument handling (#10508)

Added support for positional arguments `model` and `prompt`. Added
functionality to download via strings like:

  llama-run llama3
  llama-run ollama://granite-code
  llama-run ollama://granite-code:8b
  llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf
  llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf
  llama-run https://example.com/some-file1.gguf
  llama-run some-file2.gguf
  llama-run file://some-file3.gguf

Signed-off-by: Eric Curtin <[email protected]>
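
The model strings listed for b4324 differ mainly in their prefix. As a rough illustration of how such strings could be told apart, here is a minimal Python sketch. The function name, the returned labels, and the `.gguf` suffix check are assumptions made for this example and are not taken from llama-run itself; in particular, how a bare name such as `llama3` is resolved is deliberately left open.

```python
# Illustrative sketch only: classify a model reference by its prefix.
# The names and labels below are hypothetical, not llama-run's actual API.
KNOWN_PREFIXES = {
    "hf://": "huggingface",
    "huggingface://": "huggingface",
    "ollama://": "ollama",
    "https://": "https",
    "file://": "file",
}


def classify_model_ref(ref: str) -> tuple[str, str]:
    """Return (source, remainder) for a model reference string."""
    for prefix, source in KNOWN_PREFIXES.items():
        if ref.startswith(prefix):
            return source, ref[len(prefix):]
    # No recognized prefix. Assumption for this sketch: a ".gguf" suffix
    # means a local file; anything else is reported as a bare name.
    if ref.endswith(".gguf"):
        return "file", ref
    return "bare-name", ref


for example in ("ollama://granite-code:8b", "some-file2.gguf", "llama3"):
    print(example, "->", classify_model_ref(example))
```

A prefix table keeps the aliases (`hf://` and `huggingface://`) in one place, so adding another scheme would be a one-line change.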

b4302

11 Dec 09:02
43041d2
ggml: load all backends from a user-provided search path (#10699)

* feat: load all backends from a user-provided search path

* fix: Windows search path

* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`

* refactor: rename `search_path` to `dir_path`

* fix: change `NULL` to `nullptr`

Co-authored-by: Diego Devesa <[email protected]>

* fix: change `NULL` to `nullptr`

---------

Co-authored-by: Diego Devesa <[email protected]>

b4301

10 Dec 21:36
b685daf
vulkan: request round-to-even for fp16 in im2col/rope_head (#10767)

Vulkan doesn't mandate a specific rounding mode, but the shader_float_controls
feature allows rounding mode to be requested if the implementation supports it.
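
To see why the rounding mode matters for fp16, the following Python sketch compares round-to-nearest-even with round-toward-zero on values that fall exactly halfway between two representable half-precision numbers. It does not use Vulkan at all; `fp16_round_toward_zero` is a hypothetical helper written for this example and only handles positive values in fp16's normal range.

```python
import math
import struct


def fp16_round_to_even(x: float) -> float:
    # Python's struct 'e' format converts to IEEE 754 binary16 and
    # rounds halfway cases to the value with an even mantissa.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_round_toward_zero(x: float) -> float:
    # Hypothetical helper: valid only for positive x in fp16's normal range.
    mantissa, exponent = math.frexp(x)       # x == mantissa * 2**exponent
    kept = math.trunc(mantissa * 2 ** 11)    # keep 11 significant bits
    return math.ldexp(kept, exponent - 11)


# Between 2048 and 4096, fp16 values are spaced 2 apart,
# so 2049 and 2051 are exact halfway cases.
for value in (2049.0, 2051.0):
    print(value, fp16_round_to_even(value), fp16_round_toward_zero(value))
```

With round-to-even, 2049 goes down to 2048 and 2051 goes up to 2052, while truncation always goes down (2048 and 2050). Results that differ in this way between implementations are the kind of discrepancy that requesting a specific rounding mode avoids.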

b4291

09 Dec 05:25
ce8784b
server : fix format_infill (#10724)

* server : fix format_infill

* fix

* rename

* update test

* use another model

* update test

* update test

* test_invalid_input_extra_req

b4267

05 Dec 01:16
f112d19
Update deprecation-warning.cpp (#10619)

Fixed path separator handling for cross-platform support (Windows file systems).
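
As a rough illustration of the kind of issue this addresses, the Python sketch below extracts a file name from a path that may use either `/` or `\` as its separator. The function name is hypothetical, and this is not the code from deprecation-warning.cpp; it only shows the general technique of searching for both separators.

```python
def program_name(path: str) -> str:
    """Return the last path component, accepting '/' and '\\' as separators."""
    # rfind returns -1 when a separator is absent, so max() picks whichever
    # separator occurs last, or -1 if the path has no separator at all.
    last_sep = max(path.rfind("/"), path.rfind("\\"))
    return path[last_sep + 1:]


print(program_name("/usr/local/bin/main"))    # main
print(program_name("C:\\tools\\main.exe"))    # main.exe
print(program_name("main"))                   # main
```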

b4255

04 Dec 00:00
cc98896
vulkan: optimize and reenable split_k (#10637)

Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to fill the shaders.
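
The idea behind split_k can be sketched on the CPU: the shared K dimension of a matrix multiply is cut into slices, each slice produces a partial result, and a reduction step sums the partials. The Python below is a minimal illustration under that assumption; the function name and list-of-lists layout are chosen for this example and do not mirror the Vulkan shaders.

```python
def matmul_split_k(a, b, split_k):
    """Multiply a (M x K) by b (K x N), splitting K into split_k slices."""
    m, k, n = len(a), len(b), len(b[0])
    chunk = (k + split_k - 1) // split_k  # ceiling division of K over slices

    # Each slice computes a full M x N partial product over its part of K.
    # On a GPU these slices could run as independent workgroups.
    partials = []
    for s in range(split_k):
        k0, k1 = s * chunk, min((s + 1) * chunk, k)
        partials.append([
            [sum(a[i][kk] * b[kk][j] for kk in range(k0, k1)) for j in range(n)]
            for i in range(m)
        ])

    # Reduction step: sum the partial products element by element.
    return [
        [sum(p[i][j] for p in partials) for j in range(n)]
        for i in range(m)
    ]


a = [[1, 2, 3, 4]]
b = [[1], [2], [3], [4]]
print(matmul_split_k(a, b, 2))  # [[30]]
```

Splitting K creates more independent units of work when M and N alone are small, at the cost of the extra reduction pass.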

b4242

03 Dec 00:51
642330a
llama : add enum for built-in chat templates (#10623)

* llama : add enum for supported chat templates

* use "built-in" instead of "supported"

* arg: print list of built-in templates

* fix test

* update server README
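
The general pattern of an enum for built-in templates plus a name lookup can be sketched as follows. This is a Python illustration only: the class, the three template names, and both helper functions are assumptions made for this example, not the identifiers or the full list used in llama.cpp.

```python
from enum import Enum, auto


class ChatTemplate(Enum):
    # Illustrative members only, not the project's actual list.
    CHATML = auto()
    LLAMA2 = auto()
    LLAMA3 = auto()
    UNKNOWN = auto()


# Single table mapping user-facing names to enum values.
TEMPLATE_NAMES = {
    "chatml": ChatTemplate.CHATML,
    "llama2": ChatTemplate.LLAMA2,
    "llama3": ChatTemplate.LLAMA3,
}


def template_from_name(name: str) -> ChatTemplate:
    """Look up a built-in template, returning UNKNOWN for unrecognized names."""
    return TEMPLATE_NAMES.get(name, ChatTemplate.UNKNOWN)


def builtin_template_names() -> list[str]:
    """Return the built-in names in sorted order, e.g. for a help message."""
    return sorted(TEMPLATE_NAMES)


print(builtin_template_names())
print(template_from_name("chatml"))
```

Keeping the names in one table means the same data can drive both the lookup and a printed list of built-in templates.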

b4226

30 Nov 00:13
7cc2d2c
ggml : move AMX to the CPU backend (#10570)

* ggml : move AMX to the CPU backend

---------

Co-authored-by: Georgi Gerganov <[email protected]>