Speed.AI is an Android app for benchmarking locally run LLMs using the llama.cpp engine. You can also chat with the models and view detailed benchmark data.
Clone the repository and initialize its submodules:

```shell
git clone https://github.com/TIC-13/llm-benchmarking-lcpp.git
cd llm-benchmarking-lcpp
git submodule update --init --recursive
```
Benchmarking results are sent to a ranking system, where you can compare performance across different devices. The ranking is shared with the Speed.AI - AI Benchmarking app.
To host your own ranking instance, check out these repositories:
Hosting a ranking instance is optional; the app works independently.
To connect the app to a ranking instance, configure the following environment variables in `local.properties`:

- `API_ADDRESS`: Backend address
- `API_KEY`: A base64-encoded 32-byte (AES-256) key (must match the backend key)
- `RANKING_ADDRESS`: URL of the ranking server
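For illustration, a `local.properties` fragment might look like the following; every value here is a placeholder, so substitute your own deployment's addresses and key:

```properties
# Placeholder values — replace with your own deployment's settings
API_ADDRESS=https://your-backend.example.com
API_KEY=<base64-encoded 32-byte key>
RANKING_ADDRESS=https://your-ranking.example.com
```

A compatible key can be generated with `openssl rand -base64 32`; the same key must be configured on the backend.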
To add a new model, edit `LLMViewModel.kt` and append its Hugging Face `.gguf` download URL to the `huggingFaceUrls` list.
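As a rough sketch (the exact shape of the list in `LLMViewModel.kt` may differ, and the URL below is a placeholder), appending a model could look like:

```kotlin
// Hypothetical excerpt of LLMViewModel.kt — the actual declaration may differ
val huggingFaceUrls = listOf(
    // ...existing model URLs...
    // Append the direct .gguf download URL of the new model (placeholder):
    "https://huggingface.co/your-org/your-model/resolve/main/your-model.Q4_K_M.gguf",
)
```

Note that the URL should point at the raw `.gguf` file (on Hugging Face, a `/resolve/` link rather than a `/blob/` page link).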
The inference engine in this app is taken from SmolChat.