Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
Updated Nov 6, 2023 - Python
Runs LLaMA at extremely high speed
LLM inference in Fortran
The bare metal in my basement
Portable LLM - A rust library for LLM inference
eLLM Infers Qwen3-480B on a CPU in Real Time
Wrapper for simplified use of Llama2 GGUF quantized models.
V-lang API wrapper for llm-inference chatllm.cpp
Simple large language model playground app
VB.NET API wrapper for llm-inference chatllm.cpp
C# API wrapper for llm-inference chatllm.cpp
PlantAi is a ResNet-based CNN model trained on the PlantVillage dataset to classify plant leaf images as healthy or diseased. This repository includes PyTorch training code, tools to convert the model to TensorFlow Lite (TFLite) for deployment, and an Android app integrating the model for real-time leaf disease detection from camera images.
🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.
Nim API wrapper for llm-inference chatllm.cpp
Privacy-focused RAG chatbot for network documentation. Chat with your PDFs locally using Ollama, Chroma & LangChain. CPU-only, fully offline.
Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.
Rust API wrapper for llm-inference chatllm.cpp
Lua API wrapper for llm-inference chatllm.cpp
gemma-2-2b-it int8 CPU inference in one file of pure C#
Kotlin API wrapper for llm-inference chatllm.cpp
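Several of the projects listed above (the GGUF wrappers and the pure-C# gemma-2-2b-it port) depend on int8 quantization to make CPU inference practical. As an illustrative sketch only, not the method used by any specific repository here, symmetric per-tensor int8 quantization can be expressed as:

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: map floats onto the int8 range
    # [-127, 127] using a single scale derived from the largest magnitude.
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(q, scale):
    # Recover approximate float weights from int8 values and the scale.
    return [v * scale for v in q]

# Quantize a tiny weight vector and check the round-trip error, which is
# bounded by scale / 2 for values within range.
weights = [0.8, -1.27, 0.05, 0.4]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Real implementations typically quantize per-row or per-block rather than per-tensor to preserve accuracy, which is the approach GGUF-style formats take.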