@WhyPenguins commented Dec 21, 2025

Description

This draft PR adds support for downloading and running language models via SplashKit, using the llama.cpp library. Usage is as simple as:

write_line( generate_reply(QWEN3_0_6B_INSTRUCT, "What is the capital of Australia? Answer with one word.") );
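
For context, that one-liner would sit in a complete program something like the following (a minimal sketch, assuming the usual single splashkit.h include):

#include "splashkit.h"

int main()
{
    // On first use this downloads the model, then runs CPU inference.
    write_line(generate_reply(QWEN3_0_6B_INSTRUCT,
        "What is the capital of Australia? Answer with one word."));
    return 0;
}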

I'd be interested to know if this is on the right track, or if there are any changes that would make it more likely to be merged in. Thanks!

Details

Llama.cpp is used to perform inference for the language models - it has been added as a submodule to splashkit-external, and to CMakeLists.txt as an External Project rather than a subdirectory. This was done so that its settings could be configured independently of the main project (in particular, building it in Release mode, which is much faster).

On the API side there is an enum that contains a list of supported models (language_model), and an accompanying array that contain URLs, names, and default inference settings (models, in genai.cpp). At least for now I've built llama.cpp so that only CPU inferencing is supported, so the models are chosen such that they still run at acceptable speeds, and also download within a reasonable amount of time (500mb ~ 1.7gb).
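
To illustrate the shape of that registry, a sketch might look like this (field names and values here are illustrative assumptions, not the actual definitions in genai.cpp):

#include <string>

// Hypothetical sketch only - the real fields and values live in genai.cpp.
enum language_model
{
    QWEN3_0_6B_INSTRUCT,
    GEMMA3_1B_INSTRUCT,
    // ... base/instruct/thinking variants of each supported model
};

struct model_info
{
    std::string name;       // file name under ~/.splashkit/models/
    std::string url;        // where the GGUF file is fetched from
    int default_max_tokens; // default inference settings
};

// Indexed by the language_model enum. URLs here are placeholders.
static const model_info models[] =
{
    { "qwen3-0.6b-instruct.gguf", "https://example.com/qwen3.gguf",  512 },
    { "gemma3-1b-instruct.gguf",  "https://example.com/gemma3.gguf", 512 },
};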

When first used, a model is automatically downloaded if it doesn't already exist in ~/.splashkit/models/... - the download can be resumed if interrupted (see sk_http_get_file).
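
Conceptually the resume logic is just a byte-offset continuation: if a partial file exists, downloading continues from its current size. A minimal sketch of that idea using libcurl (not the actual sk_http_get_file code):

#include <curl/curl.h>
#include <cstdio>
#include <filesystem>
#include <string>

bool download_with_resume(const std::string &url, const std::string &path)
{
    // Resume from the size of any partial file left by an earlier attempt.
    curl_off_t offset = 0;
    if (std::filesystem::exists(path))
        offset = (curl_off_t) std::filesystem::file_size(path);

    FILE *out = std::fopen(path.c_str(), offset > 0 ? "ab" : "wb");
    if (!out) return false;

    CURL *curl = curl_easy_init();
    if (!curl) { std::fclose(out); return false; }

    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
    curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, offset); // byte offset to resume from
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);            // default callback fwrites the body

    CURLcode res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    std::fclose(out);
    return res == CURLE_OK;
}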

The model is then loaded, the user's prompt is formatted (when in "reply" mode) and tokenized, and the generated text is collected and returned to the user. The backend (genai_backend.cpp/.h) abstracts this so that tokens can be fetched one at a time (which __generate_common in genai.cpp relies on).
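
In other words, __generate_common can be thought of as a loop over a token stream. A toy sketch of that contract (the names here are illustrative, not the real backend API):

#include <string>
#include <vector>

// Toy stand-in for the llama.cpp-backed stream in genai_backend.cpp.
struct token_stream
{
    std::vector<std::string> pieces; // pretend these were sampled one by one
    size_t pos = 0;
};

// Returns false once generation has finished.
bool next_token(token_stream &stream, std::string &out_text)
{
    if (stream.pos >= stream.pieces.size()) return false;
    out_text = stream.pieces[stream.pos++];
    return true;
}

// The shape of __generate_common: fetch one decoded token at a time
// and accumulate until the stream ends.
std::string generate_common(token_stream &stream)
{
    std::string result, piece;
    while (next_token(stream, piece))
        result += piece;
    return result;
}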

It's also possible to stream text back using conversation objects. These are created with create_conversation(...) and have functions for adding new messages and for receiving individual tokens plus information about them. test_genai.cpp shows the current usage - there are still some rough edges to fix up, but here's how it can look now:
[Screenshot: streamed conversation output from test_genai.cpp]
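
In code, the streaming loop could look roughly like this (only create_conversation is named above; the other function names are guesses at the in-progress API - see test_genai.cpp for the real current usage):

#include "splashkit.h"

int main()
{
    conversation chat = create_conversation(QWEN3_0_6B_INSTRUCT);
    add_message(chat, "What is the capital of Australia?"); // assumed name

    // Receive tokens one at a time as they are generated.
    while (has_next_token(chat))      // assumed name
        write(next_token_text(chat)); // assumed name
    write_line("");

    return 0;
}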

The supported models list also contains base, instruct, and thinking variants for each model (where released).

Basic usage looks like:

// Generates a reply to a prompt
string generate_reply(string prompt);
// Generates text that continues from existing text (similar to auto-complete)
string generate_text(string text);

These use the default Qwen3 0.6B Instruct model. Overloads allow the settings to be changed: one overload simply sets the model via the enum, and another exposes all of the settings. For example:

generate_reply("Hello!", option_max_tokens(option_language_model(GEMMA3_1B_INSTRUCT), 1000));

In the finished PR each option will be exposed, similar to drawing_options.
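
As a sketch of that pattern (mirroring the example above; these definitions are illustrative, not the final API):

// Illustrative only - the finished PR's option struct and helpers may differ.
enum language_model { QWEN3_0_6B_INSTRUCT, GEMMA3_1B_INSTRUCT /* , ... */ };

struct generation_options
{
    language_model model = QWEN3_0_6B_INSTRUCT; // default model
    int max_tokens = 512;                       // assumed default
};

// Start from defaults with a chosen model...
generation_options option_language_model(language_model model)
{
    generation_options opts;
    opts.model = model;
    return opts;
}

// ...then modify a copy, which is what makes the calls chainable.
generation_options option_max_tokens(generation_options opts, int max_tokens)
{
    opts.max_tokens = max_tokens;
    return opts;
}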


Hopefully that's generally on the right track - let me know if anything needs adjustment!

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation (update or new)

How Has This Been Tested?

So far testing has been a bit limited:

  • Tested downloading and running four of the supported models on Linux
  • Tested download errors when disconnected from the Internet
  • Tested errors when models are corrupted/incomplete
  • Tested a variety of prompts with generate_text and generate_reply to ensure the model variants are usable
  • Added a simple genai_test in sktest, though I plan to expand it further

I would like to test the PR on Windows as well and ensure all the models download and run.

Testing Checklist

  • Tested with sktest
  • Tested with skunit_tests

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have requested a review from ... on the Pull Request
