This project uses llama-cpp-python with CUDA acceleration.
Follow the steps below to set up your environment.
- Linux (tested on Arch Linux)
- NVIDIA GPU (tested on RTX 4060)
- NVIDIA drivers (>= 550 for CUDA 13)
- CUDA Toolkit (installed via pacman)
- Python 3.10+ (tested on 3.13)
- cmake and gcc for building native code
Note: If you install without CUDA acceleration, you don't need the CUDA Toolkit, an NVIDIA GPU, or NVIDIA drivers.
On Arch Linux:
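sudo pacman -S --needed cuda gcc cmake python python-pip
Note: Omit cuda if you don't want GPU acceleration. gcc and cmake can also be dropped, but only if pip installs a prebuilt llama-cpp-python wheel rather than building it from source.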
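As a quick sanity check after installing the packages, the sketch below reports which of the listed tools are missing from PATH. It is only an illustration: the helper name `missing_tools` and the two tool lists are assumptions made for this example, not part of the project.

```python
import shutil

# Tools named in the requirements list; nvcc only matters for GPU builds.
BUILD_TOOLS = ["gcc", "cmake", "python"]
CUDA_TOOLS = ["nvcc"]


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    for label, tools in (("build", BUILD_TOOLS), ("CUDA", CUDA_TOOLS)):
        missing = missing_tools(tools)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{label} tools: {status}")
```

If nvcc is reported missing even though cuda is installed, the PATH exports described next are the likely fix.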
Check that nvcc is available in your PATH:
nvcc --version
If nvcc is not found, add the following to your shell config (~/.bashrc or ~/.zshrc), or run it directly in your terminal:
export PATH=/opt/cuda/bin:$PATH
export CUDAToolkit_ROOT=/opt/cuda
export LD_LIBRARY_PATH=/opt/cuda/lib64:$LD_LIBRARY_PATH
If you updated your shell config, reload it:
source ~/.bashrc

The models for this project were downloaded from Hugging Face using the following links:
Fast (recommended Q6_K_L): https://huggingface.co/bartowski/Ministral-8B-Instruct-2410-GGUF
Deep (recommended Q4_K_M): https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF
You can use whatever models you want; edit the config file to match.
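If you would rather script the download than use the browser, the following is a minimal sketch. It assumes the `huggingface_hub` package is installed (it is not mentioned in this README), and the function name `fetch_model` and the `models` default directory are choices made for this example. The exact `.gguf` filename is not filled in here; copy it from the repository's file list.

```python
from pathlib import Path


def fetch_model(repo_id, filename, models_dir="models", download=None):
    """Download one GGUF file from a Hugging Face repo into models_dir.

    `download` defaults to huggingface_hub.hf_hub_download; it is a
    parameter only so the function can be exercised without network access.
    """
    if download is None:
        # Assumption: huggingface_hub is installed (pip install huggingface_hub).
        from huggingface_hub import hf_hub_download as download
    target = Path(models_dir)
    target.mkdir(parents=True, exist_ok=True)
    # hf_hub_download returns the local path of the downloaded file.
    return Path(download(repo_id=repo_id, filename=filename, local_dir=str(target)))
```

For example, `fetch_model("bartowski/Ministral-8B-Instruct-2410-GGUF", filename)` would fetch the fast model, where `filename` is the quantization file you picked from that repository.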
Create a virtual environment:
python -m venv llama-env
source llama-env/bin/activate
pip install --upgrade pip
Install requirements for this repo:
pip install -r requirements.txt
By default, llama-cpp-python is built CPU-only. To enable GPU acceleration, reinstall it with CUDA support:
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python

Important configs to change:
- VAULT_DIR: initially set to a directory named Vault in your home directory
- models (fast and deep): initially set to a models folder within the project directory. Update these paths to match the models you chose.
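Before the first run, it can help to confirm that the paths in your config actually exist. The sketch below is one way to do that. It does not read the project's config file (whose format is not described here), so `check_config` and its arguments are hypothetical and take the values directly.

```python
from pathlib import Path


def check_config(vault_dir, model_paths):
    """Return a list of human-readable problems; empty means all paths exist."""
    problems = []
    vault = Path(vault_dir).expanduser()
    if not vault.is_dir():
        problems.append(f"VAULT_DIR is not a directory: {vault}")
    for name, path in model_paths.items():
        model = Path(path).expanduser()
        if not model.is_file():
            problems.append(f"{name} model file not found: {model}")
        elif model.suffix != ".gguf":
            problems.append(f"{name} model is not a .gguf file: {model}")
    return problems


if __name__ == "__main__":
    # Hypothetical values; substitute the ones from your config file.
    issues = check_config(
        "~/Vault",
        {"fast": "models/fast.gguf", "deep": "models/deep.gguf"},
    )
    print("\n".join(issues) or "config paths look fine")
```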
Create the bash script (recommended location: /usr/local/bin):
#!/bin/bash
STITCH_PATH="$HOME/Stitch"
VENV_PYTHON="$STITCH_PATH/llama-env/bin/python"
MAIN_SCRIPT="$STITCH_PATH/scripts/main.py"
# Pass all arguments to main.py
"$VENV_PYTHON" "$MAIN_SCRIPT" "$@"
Note: I can't find where I added this script to my own PATH; ideally it should be in the location above. Then edit your shell config file:
nano ~/.bashrc
Then add this line:
export PATH="$HOME/stitch:$PATH"
And reload your shell:
source ~/.bashrc
Once you've installed everything, you can activate your environment and run:
source llama-env/bin/activate
python scripts/main.py
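To confirm that the CUDA reinstall took effect, you can ask the installed library whether it was built with GPU offload support. This is a small sketch, assuming your llama-cpp-python version exposes the low-level `llama_supports_gpu_offload` binding; the wrapper name `gpu_offload_status` is made up for this example.

```python
def gpu_offload_status():
    """Return True/False from llama.cpp, or None if llama_cpp is not importable."""
    try:
        import llama_cpp
    except ImportError:
        # llama-cpp-python is not installed in the active environment.
        return None
    return bool(llama_cpp.llama_supports_gpu_offload())


if __name__ == "__main__":
    status = gpu_offload_status()
    if status is None:
        print("llama_cpp is not installed in this environment")
    elif status:
        print("GPU offload is available")
    else:
        print("CPU-only build; rerun the CMAKE_ARGS reinstall step for CUDA")
```

Run it inside the activated llama-env so that it checks the same installation the project uses.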