Skip to content

[pull] master from ggml-org:master #205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 83 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
83 commits
Select commit Hold shift + click to select a range
20b7bf8
convert : fix smollm3 jinja template (#14586)
ngxson Jul 9, 2025
0465506
model : add support for Falcon-H1 family (#14534)
ibrahimkhadraoui Jul 9, 2025
1055545
llama : remove unintended whitespace (#14592)
CISC Jul 9, 2025
ffd59e7
model : add skt/A.X-4.0 model vocabulary (#14589)
Bing-su Jul 9, 2025
26a48ad
ggml : prevent integer overflow in gguf tensor size calculation (#14595)
Yuuoniy Jul 9, 2025
98bab63
ggml : add ggml_scale_bias (#14417)
ngxson Jul 9, 2025
4a5686d
llama : support Jamba hybrid Transformer-Mamba models (#7531)
compilade Jul 9, 2025
cb9178f
llama : remove llm_graph_input_one (#14603)
ngxson Jul 9, 2025
a57d1bc
cuda : support Falcon-H1 state size for SSM_SCAN (#14602)
compilade Jul 10, 2025
ac44eb6
cmake : llguidance build parser library only (#14608)
EZForever Jul 10, 2025
f9a867f
cmake : bump llguidance version to v1.0.1 (#14609)
EZForever Jul 10, 2025
435a6d1
llama : minor coding style fix for smollm3 (#14605)
ngxson Jul 10, 2025
704bb7a
SYCL: Initial set_rows kernel implementation (#14562)
qnixsynapse Jul 10, 2025
a457551
cmake : do not search for curl libraries by ourselves (#14613)
EZForever Jul 10, 2025
11ee0fe
Docs: script to auto-generate ggml operations docs (#14598)
am17an Jul 10, 2025
4bb625b
Smoldocling support (#14597)
ryan-mangeno Jul 10, 2025
0b88557
opencl: add `set_rows` for `f16` and `f32` (#14547)
lhez Jul 10, 2025
6bdda13
opencl: add tiled mul_mat_f16_f32 (#14535)
rmatif Jul 10, 2025
0aedae0
model : Granite Four (#13550)
gabe-l-hart Jul 11, 2025
576c82e
vocab : add midm-2.0 model pre-tokenizer (#14626)
Bing-su Jul 11, 2025
0d5375d
llama : move enum llama_vocab_pre_type to implementation (#14631)
ggerganov Jul 11, 2025
aaa088d
readme : add hot PRs (#14636)
ggerganov Jul 11, 2025
756aa10
HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (#14634)
slojosic-amd Jul 11, 2025
f5e96b3
model : support LiquidAI LFM2 hybrid family (#14620)
tdakhran Jul 11, 2025
98197e5
vulkan: optimizations for deepseek prompt processing (#14555)
jeffbolznv Jul 12, 2025
b3ad3a0
vulkan: support SET_ROWS (#14587)
jeffbolznv Jul 12, 2025
0c1df14
server : fix pooled embedding output (#14645)
iamlemec Jul 12, 2025
3e303b1
vulkan : implement ggml_roll (ggml/1290)
Acly Jul 12, 2025
74bb294
vulkan : implement bilinear interpolation (ggml/1291)
Acly Jul 12, 2025
2155357
sync : ggml
ggerganov Jul 12, 2025
3120413
vulkan : remove unused vars (#0)
ggerganov Jul 12, 2025
8eff955
sync : ggml
ggerganov Jul 12, 2025
7de5c7c
CUDA: add set rows for f32 and f16 (#14551)
am17an Jul 12, 2025
67eade1
docs : add LFM2 to models section (#14650)
tdakhran Jul 12, 2025
c31e606
tests : cover lfm2 cases in test_ssm_conv (#14651)
tdakhran Jul 12, 2025
84b396e
cmake : Add CMake presets for Linux and GCC (#14656)
YavorGIvanov Jul 13, 2025
dcf7f2e
metal : Add missing unary ops Metal support (#14660)
YavorGIvanov Jul 13, 2025
05fec5b
ggml : add build-time message to remind about ggml_set_rows (#14661)
ggerganov Jul 13, 2025
e743cdd
cuda : add ELU support (#14657)
YavorGIvanov Jul 13, 2025
923e3ea
cuda : add set rows for bf16 (#14664)
CISC Jul 13, 2025
982e347
quantize : fix minor logic flaw in --tensor-type (#14572)
EAddario Jul 13, 2025
0d92267
llama : add jinja template for rwkv-world (#14665)
MollySophia Jul 13, 2025
65a3ebb
sycl: Batched mulmat rework for oneDNN dispatch (#14617)
ShanoToni Jul 14, 2025
0f4c6ec
SYCL: use 1D kernel for set_rows (#14618)
qnixsynapse Jul 14, 2025
494c589
scripts: benchmark for HTTP server throughput (#14668)
JohannesGaessler Jul 14, 2025
9c9e4fc
llama-context: add ability to get logits (#14672)
am17an Jul 14, 2025
55c509d
ggml : refactor llamafile_sgemm PPC code (#14673)
shalinib-ibm Jul 14, 2025
bdca383
sycl: Hotfix for non dnnl codepath (#14677)
ShanoToni Jul 14, 2025
cbc68be
cuda: fix build warnings in set-rows.cu (unused variable) (#14687)
yeahdongcn Jul 15, 2025
68e37a6
model : add PLaMo-2 support (#14560)
mitmul Jul 15, 2025
10a0351
vulkan: add RTE variants for glu/add/sub/mul/div (#14653)
jeffbolznv Jul 15, 2025
ba1ceb3
vulkan: fix noncontig check for mat_mul_id splitting (#14683)
jeffbolznv Jul 15, 2025
4a4f426
model : add Kimi-K2 support (#14654)
gabriellarson Jul 15, 2025
c81f419
gguf-py : dump bpw per layer and model in markdown mode (#14703)
EAddario Jul 15, 2025
79e0b68
llama: add LLAMA_API to deprecated llama_kv_self_seq_div (#14708)
Min-Hua Jul 16, 2025
cf91f21
convert : add pre-computed hashes first to prevent order mishaps (#14…
CISC Jul 16, 2025
4b91d6f
convert : only check for tokenizer folder if we need it (#14704)
CISC Jul 16, 2025
5cae766
scripts: synthetic prompt mode for server-bench.py (#14695)
JohannesGaessler Jul 16, 2025
538cc77
server : fix handling of the ignore_eos flag (#14710)
ggerganov Jul 16, 2025
e4841d2
llama : fix parallel processing for plamo2 (#14716)
mitmul Jul 16, 2025
6ffd4e9
server : pre-calculate EOG logit biases (#14721)
ggerganov Jul 16, 2025
6497834
ggml : add asserts (#14720)
ggerganov Jul 16, 2025
ab14019
Support diffusion models: Add Dream 7B (#14644)
am17an Jul 16, 2025
225e7a1
llama : add high-throughput mode (#14363)
ggerganov Jul 16, 2025
b0f0ecc
model : support output bias for qwen2 (#14711)
tempstudio Jul 16, 2025
21c0217
ggml: Add initial WebGPU backend (#14521)
reeselevine Jul 16, 2025
496957e
llama : fix parameter order for hybrid memory initialization (#14725)
dinerburger Jul 16, 2025
19e5943
convert : make hf token optional (#14717)
CISC Jul 16, 2025
1ba45d4
ci : disable failing vulkan crossbuilds (#14723)
CISC Jul 16, 2025
ad57d3e
batch : fix uninitialized has_cpl flag (#14733)
ggerganov Jul 17, 2025
d9b6910
kv-cache : opt mask set input (#14600)
ggerganov Jul 17, 2025
086cf81
llama : fix parallel processing for lfm2 (#14705)
tdakhran Jul 17, 2025
01612b7
llama : reuse compute graphs (#14482)
ggerganov Jul 17, 2025
d6fb3f6
kv-cache : fix k-shift for multiple streams (#14742)
ggerganov Jul 17, 2025
cb887f1
model: add Ernie 4.5 MoE support (#14658)
pwilkin Jul 17, 2025
760b448
nix : use optionalAttrs for env mkDerivation attrset argument (#14726)
amozeo Jul 17, 2025
670e136
convert : fix Ernie4.5 MoE without shared experts (#14746)
pwilkin Jul 17, 2025
349ea79
use max work group size for device to replace the magic number (#14732)
NeoZhangJianyu Jul 18, 2025
09651d0
graph : Pass the graph placeholder message in debug mode (#14748)
Nexesenex Jul 18, 2025
8f974bc
graph : refactor context to not pass gf explicitly (#14629)
ggerganov Jul 18, 2025
f9a31ee
CUDA: set_rows + cpy.cu refactor (#14712)
am17an Jul 18, 2025
e0cb5c5
model : add EXAONE 4.0 support (#14630)
lgai-exaone Jul 18, 2025
eacdeb5
model : fix build after merge conflict (#14754)
ggerganov Jul 18, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion .devops/nix/package.nix
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@ let
inherit (lib)
cmakeBool
cmakeFeature
optionalAttrs
optionals
strings
;
Expand Down Expand Up @@ -197,7 +198,7 @@ effectiveStdenv.mkDerivation (finalAttrs: {
];

# Environment variables needed for ROCm
env = optionals useRocm {
env = optionalAttrs useRocm {
ROCM_PATH = "${rocmPackages.clr}";
HIP_DEVICE_LIB_PATH = "${rocmPackages.rocm-device-libs}/amdgcn/bitcode";
};
Expand Down
276 changes: 138 additions & 138 deletions .github/workflows/build-linux-cross.yml
Original file line number Diff line number Diff line change
Expand Up @@ -48,98 +48,98 @@ jobs:

cmake --build build --config Release -j $(nproc)

ubuntu-24-riscv64-vulkan-cross:
runs-on: ubuntu-24.04

steps:
- uses: actions/checkout@v4
- name: Setup Riscv
run: |
sudo dpkg --add-architecture riscv64

# Add arch-specific repositories for non-amd64 architectures
cat << EOF | sudo tee /etc/apt/sources.list.d/riscv64-ports.list
deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
EOF

sudo apt-get update || true ;# Prevent failure due to missing URLs.

sudo apt-get install -y --no-install-recommends \
build-essential \
glslc \
gcc-14-riscv64-linux-gnu \
g++-14-riscv64-linux-gnu \
libvulkan-dev:riscv64

- name: Build
run: |
cmake -B build -DLLAMA_CURL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_VULKAN=ON \
-DGGML_OPENMP=OFF \
-DLLAMA_BUILD_EXAMPLES=ON \
-DLLAMA_BUILD_TOOLS=ON \
-DLLAMA_BUILD_TESTS=OFF \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=riscv64 \
-DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
-DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_FIND_ROOT_PATH=/usr/lib/riscv64-linux-gnu \
-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

cmake --build build --config Release -j $(nproc)

ubuntu-24-arm64-vulkan-cross:
runs-on: ubuntu-24.04

steps:
- uses: actions/checkout@v4
- name: Setup Arm64
run: |
sudo dpkg --add-architecture arm64

# Add arch-specific repositories for non-amd64 architectures
cat << EOF | sudo tee /etc/apt/sources.list.d/arm64-ports.list
deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
EOF

sudo apt-get update || true ;# Prevent failure due to missing URLs.

sudo apt-get install -y --no-install-recommends \
build-essential \
glslc \
crossbuild-essential-arm64 \
libvulkan-dev:arm64

- name: Build
run: |
cmake -B build -DLLAMA_CURL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_VULKAN=ON \
-DGGML_OPENMP=OFF \
-DLLAMA_BUILD_EXAMPLES=ON \
-DLLAMA_BUILD_TOOLS=ON \
-DLLAMA_BUILD_TESTS=OFF \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=aarch64 \
-DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
-DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_FIND_ROOT_PATH=/usr/lib/aarch64-linux-gnu \
-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

cmake --build build --config Release -j $(nproc)
# ubuntu-24-riscv64-vulkan-cross:
# runs-on: ubuntu-24.04

# steps:
# - uses: actions/checkout@v4
# - name: Setup Riscv
# run: |
# sudo dpkg --add-architecture riscv64

# # Add arch-specific repositories for non-amd64 architectures
# cat << EOF | sudo tee /etc/apt/sources.list.d/riscv64-ports.list
# deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
# deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
# deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
# deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
# EOF

# sudo apt-get update || true ;# Prevent failure due to missing URLs.

# sudo apt-get install -y --no-install-recommends \
# build-essential \
# glslc \
# gcc-14-riscv64-linux-gnu \
# g++-14-riscv64-linux-gnu \
# libvulkan-dev:riscv64

# - name: Build
# run: |
# cmake -B build -DLLAMA_CURL=OFF \
# -DCMAKE_BUILD_TYPE=Release \
# -DGGML_VULKAN=ON \
# -DGGML_OPENMP=OFF \
# -DLLAMA_BUILD_EXAMPLES=ON \
# -DLLAMA_BUILD_TOOLS=ON \
# -DLLAMA_BUILD_TESTS=OFF \
# -DCMAKE_SYSTEM_NAME=Linux \
# -DCMAKE_SYSTEM_PROCESSOR=riscv64 \
# -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
# -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
# -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
# -DCMAKE_FIND_ROOT_PATH=/usr/lib/riscv64-linux-gnu \
# -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
# -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
# -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

# cmake --build build --config Release -j $(nproc)

# ubuntu-24-arm64-vulkan-cross:
# runs-on: ubuntu-24.04

# steps:
# - uses: actions/checkout@v4
# - name: Setup Arm64
# run: |
# sudo dpkg --add-architecture arm64

# # Add arch-specific repositories for non-amd64 architectures
# cat << EOF | sudo tee /etc/apt/sources.list.d/arm64-ports.list
# deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
# deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
# deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
# deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
# EOF

# sudo apt-get update || true ;# Prevent failure due to missing URLs.

# sudo apt-get install -y --no-install-recommends \
# build-essential \
# glslc \
# crossbuild-essential-arm64 \
# libvulkan-dev:arm64

# - name: Build
# run: |
# cmake -B build -DLLAMA_CURL=OFF \
# -DCMAKE_BUILD_TYPE=Release \
# -DGGML_VULKAN=ON \
# -DGGML_OPENMP=OFF \
# -DLLAMA_BUILD_EXAMPLES=ON \
# -DLLAMA_BUILD_TOOLS=ON \
# -DLLAMA_BUILD_TESTS=OFF \
# -DCMAKE_SYSTEM_NAME=Linux \
# -DCMAKE_SYSTEM_PROCESSOR=aarch64 \
# -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
# -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
# -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
# -DCMAKE_FIND_ROOT_PATH=/usr/lib/aarch64-linux-gnu \
# -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
# -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
# -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

# cmake --build build --config Release -j $(nproc)

ubuntu-24-ppc64el-cpu-cross:
runs-on: ubuntu-24.04
Expand Down Expand Up @@ -185,52 +185,52 @@ jobs:

cmake --build build --config Release -j $(nproc)

ubuntu-24-ppc64el-vulkan-cross:
runs-on: ubuntu-24.04

steps:
- uses: actions/checkout@v4
- name: Setup PowerPC64le
run: |
sudo dpkg --add-architecture ppc64el

# Add arch-specific repositories for non-amd64 architectures
cat << EOF | sudo tee /etc/apt/sources.list.d/ppc64el-ports.list
deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
EOF

sudo apt-get update || true ;# Prevent failure due to missing URLs.

sudo apt-get install -y --no-install-recommends \
build-essential \
glslc \
gcc-14-powerpc64le-linux-gnu \
g++-14-powerpc64le-linux-gnu \
libvulkan-dev:ppc64el

- name: Build
run: |
cmake -B build -DLLAMA_CURL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_VULKAN=ON \
-DGGML_OPENMP=OFF \
-DLLAMA_BUILD_EXAMPLES=ON \
-DLLAMA_BUILD_TOOLS=ON \
-DLLAMA_BUILD_TESTS=OFF \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=ppc64 \
-DCMAKE_C_COMPILER=powerpc64le-linux-gnu-gcc-14 \
-DCMAKE_CXX_COMPILER=powerpc64le-linux-gnu-g++-14 \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_FIND_ROOT_PATH=/usr/lib/powerpc64le-linux-gnu \
-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

cmake --build build --config Release -j $(nproc)
# ubuntu-24-ppc64el-vulkan-cross:
# runs-on: ubuntu-24.04

# steps:
# - uses: actions/checkout@v4
# - name: Setup PowerPC64le
# run: |
# sudo dpkg --add-architecture ppc64el

# # Add arch-specific repositories for non-amd64 architectures
# cat << EOF | sudo tee /etc/apt/sources.list.d/ppc64el-ports.list
# deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
# deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
# deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
# deb [arch=ppc64el] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
# EOF

# sudo apt-get update || true ;# Prevent failure due to missing URLs.

# sudo apt-get install -y --no-install-recommends \
# build-essential \
# glslc \
# gcc-14-powerpc64le-linux-gnu \
# g++-14-powerpc64le-linux-gnu \
# libvulkan-dev:ppc64el

# - name: Build
# run: |
# cmake -B build -DLLAMA_CURL=OFF \
# -DCMAKE_BUILD_TYPE=Release \
# -DGGML_VULKAN=ON \
# -DGGML_OPENMP=OFF \
# -DLLAMA_BUILD_EXAMPLES=ON \
# -DLLAMA_BUILD_TOOLS=ON \
# -DLLAMA_BUILD_TESTS=OFF \
# -DCMAKE_SYSTEM_NAME=Linux \
# -DCMAKE_SYSTEM_PROCESSOR=ppc64 \
# -DCMAKE_C_COMPILER=powerpc64le-linux-gnu-gcc-14 \
# -DCMAKE_CXX_COMPILER=powerpc64le-linux-gnu-g++-14 \
# -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
# -DCMAKE_FIND_ROOT_PATH=/usr/lib/powerpc64le-linux-gnu \
# -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
# -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
# -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

# cmake --build build --config Release -j $(nproc)

debian-13-loongarch64-cpu-cross:
runs-on: ubuntu-24.04
Expand Down
Loading
Loading