Pull requests: ggml-org/llama.cpp
Pull requests list
convert : refactor rope scaling handling
python
python script changes
#18013
opened Dec 14, 2025 by
CISC
webui: fix chat screen shadow width
examples
server
#18010
opened Dec 13, 2025 by
polydecay
model-conversion : cast logits to float32
examples
python
python script changes
#18009
opened Dec 13, 2025 by
ggerganov
CLI: fixed adding cli and completion into docker containers, improved docs
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
#18003
opened Dec 13, 2025 by
andrew-aladev
Clarify that steps also apply to linux
documentation
Improvements or additions to documentation
#18002
opened Dec 13, 2025 by
alosslessdev
arg: clarify auto kvu/np being set on server
examples
server
#17997
opened Dec 13, 2025 by
ngxson
Optimization: Qwen3 next autoregressive pass
model
Model specific
#17996
opened Dec 13, 2025 by
pwilkin
CLI: fixed dead links to tools/main for cli and completion, fixed code owners
documentation
Improvements or additions to documentation
examples
#17993
opened Dec 13, 2025 by
andrew-aladev
HIP: Refactor mma for RDNA and CDNA
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#17990
opened Dec 13, 2025 by
zhang-hui-yulo
Draft
1 task
sync : ggml
ggml
changes relating to the ggml tensor library for machine learning
script
Script related
#17988
opened Dec 13, 2025 by
ggerganov
kv-cache: Fix state restore fragmented cache
testing
Everything test related
#17982
opened Dec 13, 2025 by
ssweens
webui: fix chat header width when sidebar is closed
examples
server
#17981
opened Dec 13, 2025 by
polydecay
ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations
ggml
changes relating to the ggml tensor library for machine learning
#17977
opened Dec 12, 2025 by
ngdxzy
server: support global section of presets
examples
server
#17959
opened Dec 12, 2025 by
ngxson
server: add encoder-decoder model support (T5, BART, MADLAD)
examples
server
#17956
opened Dec 12, 2025 by
Turee
vulkan: Add perf logger mode with concurrency
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#17944
opened Dec 11, 2025 by
jeffbolznv