Pull requests: ggml-org/llama.cpp
cuda : add error checking for cudaMemcpyAsync in argsort (#12836)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#17599
opened Nov 29, 2025 by
Mahekk357
Load Sliding Window Attention (SWA) pattern from GGUF metadata
#17597
opened Nov 29, 2025 by
taylorchu
server: move server-context to its own cpp|h
examples
server
#17595
opened Nov 29, 2025 by
ngxson
Feature/kimi linear support
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
#17592
opened Nov 29, 2025 by
cacaview
update LLAMA_ARG_KV_SPLIT --> LLAMA_ARG_KV_UNIFIED to match CLI argument
#17588
opened Nov 29, 2025 by
ddh0
Override SSM_A op for Qwen3 Next to reduce splits
model
Model specific
#17587
opened Nov 29, 2025 by
pwilkin
Add support for CUMSUM and TRI for CUDA.
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#17584
opened Nov 28, 2025 by
pwilkin
cmake: fix macOS build with -DGGML_BACKEND_DL=ON
ggml
changes relating to the ggml tensor library for machine learning
#17581
opened Nov 28, 2025 by
giladgd
Add PagedAttention support (experimental, CUDA only)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#17579
opened Nov 28, 2025 by
ericcurtin
model: LFM2-VL fixes
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#17577
opened Nov 28, 2025 by
tdakhran
HIP: enable WMMA-MMQ INT kernels for RDNA 3
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#17576
opened Nov 28, 2025 by
jiachengjason
•
Draft
[SYCL] enhance argsort for UT
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#17573
opened Nov 28, 2025 by
NeoZhangJianyu
Server: Change Invalid Schema from Server Error (500) to User Error (400)
examples
python
python script changes
server
testing
Everything test related
#17572
opened Nov 28, 2025 by
chadvoegele
ggml-hexagon: fix rope failure at test-backend-ops
ggml
changes relating to the ggml tensor library for machine learning
#17565
opened Nov 28, 2025 by
chraac
CANN: The Ger operator of OUT_PROD is not supported on the 310p device
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#17563
opened Nov 28, 2025 by
TianHao324
Fix unreadable user markdown colors and truncate long texts in deletion dialogs
examples
server
#17555
opened Nov 27, 2025 by
ServeurpersoCom
ggml-cpu: Add operator-level execution time profiling
ggml
changes relating to the ggml tensor library for machine learning
#17544
opened Nov 27, 2025 by
kimminsu38oo
CANN: add support for partial RoPE and Vision mode
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#17543
opened Nov 27, 2025 by
noemotiovon