Author: Harsha Vardhan Khurdula
LLMs in your C++ code? Now possible: this project uses C++ with libcurl to send HTTP requests to a locally served language model (LLM) via ollama. This setup lets you query a local model efficiently without relying on external services.
This repository provides code to wrap libcurl in C++ for sending HTTP POST requests to a language model served by ollama. You can use this to send prompts and retrieve model responses from any LLM hosted locally.
- `rest_api.h`: Header file defining the REST API client.
- `rest_api.cpp`: Implementation of the REST API client using libcurl.
- `main.cpp`: Main program for sending requests to the model and processing responses.
- C++11 or later
- libcurl: Used for HTTP requests
- ollama: Used to serve the model locally
- libcurl: Install libcurl on your machine.
  - Linux: `sudo apt-get install libcurl4-openssl-dev`
  - macOS: `brew install curl`
  - Windows: Download and install libcurl.
- ollama: Install ollama to serve the model.
  - macOS: `brew install ollama`
Clone this repository and navigate to the project directory:
```bash
git clone https://github.com/Khurdhula-Harshavardhan/LLM-Inference-CPP.git
cd LLM-Inference-CPP
```

Use the following command to compile the C++ code:

```bash
g++ -std=c++17 main.cpp rest_api.cpp -o llm_inference -lcurl
```

This will create an executable named `llm_inference`.
Start the ollama server locally to serve the language model:
```bash
ollama serve
```

With the ollama server running, execute the compiled C++ code to send prompts to the model:

```bash
./llm_inference
```

The program will prompt you to enter:
- LLM Identifier: The model name hosted on ollama.
- Query: Your question or prompt for the model.
```
Enter LLM identifier: llama3.2
$$$ Why is the sky blue?
====================
llama3.2 says: The sky is blue due to Rayleigh scattering.
====================
```
`rest_api.h` defines the `RestClient` class, which handles REST API requests with methods such as `get` and `post`.
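A minimal sketch of what this header might look like, assuming a simple blocking interface (the exact declarations in `rest_api.h` may differ):

```cpp
// Illustrative sketch only; the real rest_api.h may declare different signatures.
#pragma once
#include <string>

class RestClient {
public:
    // Send a POST request with a JSON body to `url` and return the raw response body.
    std::string post(const std::string& url, const std::string& jsonData);

    // Send a GET request to `url` and return the raw response body.
    std::string get(const std::string& url);
};
```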
`rest_api.cpp` implements the `RestClient` class using libcurl:
- `post`: Constructs and sends a POST request with JSON data (see the sketch after this list).
- `get`: (optional) Method to handle GET requests.
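As an illustration of how `post` can be built on libcurl, here is a minimal sketch; the buffer handling, error handling, and option choices below are assumptions, not the repository's exact implementation:

```cpp
// Illustrative sketch of a libcurl-backed POST; not the repository's exact code.
#include "rest_api.h"
#include <curl/curl.h>

// libcurl write callback: append received bytes to a std::string.
static size_t writeToString(char* ptr, size_t size, size_t nmemb, void* userdata) {
    auto* out = static_cast<std::string*>(userdata);
    out->append(ptr, size * nmemb);
    return size * nmemb;
}

std::string RestClient::post(const std::string& url, const std::string& jsonData) {
    std::string response;
    CURL* curl = curl_easy_init();
    if (!curl) return response;

    // The ollama API expects a JSON body, so set the Content-Type accordingly.
    curl_slist* headers = curl_slist_append(nullptr, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, jsonData.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeToString);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    // Perform the request; on failure, return libcurl's error text so the caller sees something.
    CURLcode rc = curl_easy_perform(curl);
    if (rc != CURLE_OK) response = curl_easy_strerror(rc);

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return response;
}
```

libcurl also recommends calling `curl_global_init(CURL_GLOBAL_DEFAULT)` once at program startup before using the easy interface.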
`main.cpp` is the main program file:
- Prompts for a model identifier and query.
- Constructs a JSON payload with these inputs.
- Uses `RestClient` to send a POST request to the ollama server (sketched below).
- Parses and displays the model's response.
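Putting those steps together, a condensed sketch of the flow might look like the following; the endpoint URL, the `RestClient::post` signature, and printing the raw response instead of the parsed generated text are assumptions made for illustration:

```cpp
// Illustrative sketch of the main.cpp flow; names and endpoint are assumptions.
#include <iostream>
#include <string>
#include "rest_api.h"

int main() {
    std::string model, user_query;
    std::cout << "Enter LLM identifier: ";
    std::getline(std::cin, model);
    std::cout << "$$$ ";
    std::getline(std::cin, user_query);

    // JSON payload built as in the snippet shown below.
    std::string jsonData = R"({"model": ")" + model +
                           R"(", "prompt": ")" + user_query +
                           R"(", "stream": false})";

    // ollama's default local generate endpoint (assumed here).
    RestClient client;
    std::string response = client.post("http://localhost:11434/api/generate", jsonData);

    // The real program parses the JSON response; printing it raw keeps the sketch short.
    std::cout << "====================\n"
              << model << " says: " << response << "\n"
              << "====================\n";
    return 0;
}
```

Note that splicing the user's text directly into a raw string literal does not JSON-escape quotes or backslashes in the prompt; for anything beyond simple prompts, a JSON library is the safer choice.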
An example JSON payload construction in main.cpp:
```cpp
std::string jsonData = R"({
    "model": ")" + model + R"(",
    "prompt": ")" + user_query + R"(",
    "stream": false
})";
```

If you use this code, please cite it as follows:
```bibtex
@misc{Khurdula2024LLMInferenceCPP,
  author       = {Harsha Vardhan Khurdula},
  title        = {LLM-CPP-Inference: C++ Client for LLM Inference via Ollama},
  year         = {2024},
  howpublished = {\url{https://github.com/Khurdhula-Harshavardhan/LLM-Inference-CPP}},
}
```

This documentation provides the necessary steps to query a local LLM with ollama and libcurl in C++. For questions or contributions, please open an issue or submit a pull request.