feat: do not bundle llama-cpp anymore #5790

Merged: mudler merged 38 commits into master from feat/build-llama-cpp-externally on Jul 18, 2025
Conversation

mudler (Owner) commented Jul 4, 2025

Description

This PR fixes #

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

netlify bot commented Jul 4, 2025

Deploy Preview for localai ready!

🔨 Latest commit: feecf58
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/687a2abfc9ffd40008172e46
😎 Deploy Preview: https://deploy-preview-5790--localai.netlify.app

@mudler mudler force-pushed the feat/build-llama-cpp-externally branch 3 times, most recently from 4980a37 to 608264c Compare July 7, 2025 09:38
@mudler mudler mentioned this pull request Jul 8, 2025
1 task
@mudler mudler force-pushed the feat/build-llama-cpp-externally branch 2 times, most recently from d1569f2 to f3b1c38 Compare July 8, 2025 17:22
richiejp (Collaborator) commented Jul 8, 2025

So a completely separate Dockerfile and Makefile? This will be a major improvement!

mudler (Owner, Author) commented Jul 9, 2025

> So a completely separate Dockerfile and Makefile? This will be a major improvement!

yup! my plan is to isolate everything, one backend at a time. The llama.cpp backend is currently the heaviest, and it also carries a lot of backend-specific code on the Go side. Ideally I want to move all of the llama.cpp-specific code and the binary-bundling bits out of the main codebase.

This is how I'm testing things now that #5816 is in:

docker build --build-arg BACKEND=llama-cpp -t llama-cpp-backend -f backend/Dockerfile.llama-cpp .
docker save llama-cpp-backend -o llama-backend.tar
local-ai backends install "ocifile://$PWD/llama-backend.tar"
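As a side note, the saved tarball is a regular docker-save image archive, so it can be sanity-checked with stock docker tooling before handing it to local-ai (a minimal check, nothing LocalAI-specific):

docker load -i llama-backend.tar    # reload the archive into the local docker daemon
docker image ls llama-cpp-backend   # confirm the image and tag survived the round trip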

@mudler mudler force-pushed the feat/build-llama-cpp-externally branch 3 times, most recently from 4005854 to 630fdba Compare July 10, 2025 16:54
@mudler mudler added the enhancement New feature or request label Jul 11, 2025
@github-actions github-actions bot added the ci label Jul 11, 2025
@mudler mudler force-pushed the feat/build-llama-cpp-externally branch 3 times, most recently from 0c1529b to ae65455 Compare July 13, 2025 16:28
@mudler mudler force-pushed the feat/build-llama-cpp-externally branch 9 times, most recently from 1072662 to c90a0e8 Compare July 14, 2025 20:44
@mudler mudler changed the title [WIP] feat: build llama cpp externally feat: build llama cpp externally Jul 14, 2025
mudler added 24 commits July 18, 2025 09:29 (all signed off by Ettore Di Giacinto <[email protected]>)
@mudler mudler force-pushed the feat/build-llama-cpp-externally branch from fe05d6b to b038c5a Compare July 18, 2025 07:30
mudler (Owner, Author) commented Jul 18, 2025

> Any tips on testing this? Does it significantly change the build process for those compiling locally? (So does this require a README update?)

yes, good point actually. My plan is to move all the backends out so we can build LocalAI in a simpler way using standard Go tooling. At that point I will rework the documentation; right now it is not really functional and is in a transient state. For now, though, the steps are the same:

make build # Will build local-ai, with some backends still included in the binary

The difference is in how backends are built. If you want llama-cpp, for example, you can install it from the Backends tab in the WebUI, or with local-ai backends install. If you want to build it yourself and then install it, you can run:

make docker-build-llama-cpp
make docker-save-llama-cpp
./local-ai backends install ocifile://$PWD/backend-images/llama-cpp.tar

This does the following:

  • Builds the backend with docker
  • Saves the result as a standard container image
  • Installs it in local-ai (in the default backends folder, next to the binary)
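
For reference, a rough sketch of what the two docker make targets presumably run under the hood, pieced together from the commands quoted earlier in this thread (the mkdir step and exact wiring are assumptions; only the image name, Dockerfile path, and tarball path appear above):

# make docker-build-llama-cpp (sketch)
docker build --build-arg BACKEND=llama-cpp -t llama-cpp-backend -f backend/Dockerfile.llama-cpp .
# make docker-save-llama-cpp (sketch)
mkdir -p backend-images                                         # assumed: ensure the output directory exists
docker save llama-cpp-backend -o backend-images/llama-cpp.tar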

@mudler mudler force-pushed the feat/build-llama-cpp-externally branch from b038c5a to feecf58 Compare July 18, 2025 11:06
@mudler mudler merged commit 294f702 into master Jul 18, 2025
27 checks passed
@mudler mudler deleted the feat/build-llama-cpp-externally branch July 18, 2025 11:24