A real-time hallucination detection system that verifies LLM responses against live web sources and auto-corrects using Retrieval-Augmented Generation (RAG).
- Hallucination detection model: `Shreyash03Chimote/Hallucination_Detection`
- Type: CrossEncoder (NLI, Natural Language Inference)
- Hosted on: HuggingFace 🤗 (no manual download needed; loaded automatically via `sentence-transformers`)
- Task: given a (context, claim) pair, predicts Entailment / Contradiction / Neutral
- Chat model: `smollm2:360m` via Ollama
- RAG model: `llama3.2:latest` via Ollama
- Local inference: no API key required for the LLM
- Embedding model: `nomic-embed-text` via Ollama
- Embeddings stored in: Pinecone vector database
⚠️ No model weights need to be downloaded manually. All models load automatically on first run.
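As a rough illustration of the NLI step described above, the sketch below turns the three raw scores for a (context, claim) pair into a label and a probability. The label order in `NLI_LABELS` is an assumption, not taken from the model card, so check the model's `id2label` config before relying on it. `classify_nli` and `score_pair` are hypothetical helper names, not functions from `backend/server.py`.

```python
import math
from typing import Sequence, Tuple

# ASSUMPTION: this label order is illustrative only; verify it against
# the model's config (id2label) before using it.
NLI_LABELS = ("contradiction", "entailment", "neutral")


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_nli(logits: Sequence[float],
                 labels: Sequence[str] = NLI_LABELS) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def score_pair(context: str, claim: str,
               model_name: str = "Shreyash03Chimote/Hallucination_Detection"):
    """Score one (context, claim) pair with the CrossEncoder.

    Imported lazily so the helpers above work without the library installed.
    Assumes the model emits one score per NLI class.
    """
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    logits = model.predict([(context, claim)])[0]
    return classify_nli([float(x) for x in logits])
```

Under this assumed label order, a claim would be flagged as a likely hallucination when the top label is `contradiction`, and possibly also when it is `neutral`. The README does not state the thresholds the backend actually applies.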
This project uses a live RAG pipeline for query processing:
| Component | Source |
|---|---|
| Web context | SerpAPI (real-time Google search results) |
| Web content | Scraped via LangChain's `WebBaseLoader` |
| Vector index | Pinecone, rebuilt per query (ephemeral namespace) |
| Chat logs | MySQL (rag_app.chat_logs) |
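The "ephemeral namespace" idea in the table above can be sketched as follows: each query gets its own short-lived namespace, which is removed once the answer is produced. This is a minimal sketch and not the code in `backend/server.py`. `ephemeral_namespace` and `query_namespace` are hypothetical names, and `delete_namespace` stands in for whatever Pinecone client call the backend uses for cleanup.

```python
import hashlib
import uuid
from contextlib import contextmanager
from typing import Callable, Iterator


def ephemeral_namespace(query: str, prefix: str = "q") -> str:
    """Build a unique namespace name for one query.

    The hash makes the name traceable to the query text; the random
    suffix keeps two identical concurrent queries from colliding.
    """
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}-{digest}-{uuid.uuid4().hex[:8]}"


@contextmanager
def query_namespace(query: str,
                    delete_namespace: Callable[[str], None]) -> Iterator[str]:
    """Yield a fresh namespace and always clean it up afterwards.

    `delete_namespace` is a caller-supplied (hypothetical) callable that
    removes all vectors stored under the given namespace.
    """
    namespace = ephemeral_namespace(query)
    try:
        yield namespace
    finally:
        delete_namespace(namespace)
```

Wrapping the cleanup in a context manager means the namespace is removed even if scraping or generation fails midway, so stale vectors from one query are less likely to leak into the next.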
For training and testing hallucination detection classifiers, this project uses the HalluRAG Dataset (pickle format):
Dataset Details:
- Name: HalluRAG - Detecting Closed-Domain Hallucinations in RAG Applications
- Size: 19,731 validly annotated sentences
- Source: Wikipedia articles (recent updates after Feb 22, 2024 cutoff)
- Models Used: LLaMA-2-7B, LLaMA-2-13B, Mistral-7B with quantizations (float8, int8, int4)
- Contents:
  - RAG prompts (answerable and unanswerable questions)
  - LLM-generated responses
  - Internal states (contextualized embedding vectors, intermediate activation values)
  - Hallucination labels (binary classification)
- Download: DOI: 10.17879/84958668505
- Code: GitHub: F4biian/HalluRAG
Usage in this Project: The HalluRAG dataset (in pickle format) can be used to train MLP classifiers for sentence-level hallucination detection. The trained models analyze LLM internal states to predict whether a generated sentence is hallucinated, achieving test accuracies up to 75% (Mistral-7B).
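To make the classifier idea above concrete, here is a toy forward pass of a one-hidden-layer MLP that maps an internal-state vector to a hallucination probability. The architecture (single hidden layer, ReLU, sigmoid output) and all weights are assumptions for illustration; this README does not specify the MLP configuration used with HalluRAG, and `mlp_hallucination_probability` is a hypothetical name.

```python
import math
from typing import Sequence


def mlp_hallucination_probability(state: Sequence[float],
                                  w1: Sequence[Sequence[float]],
                                  b1: Sequence[float],
                                  w2: Sequence[float],
                                  b2: float) -> float:
    """Forward pass: state -> ReLU hidden layer -> sigmoid probability.

    state: one internal-state vector for a generated sentence.
    w1, b1: hidden-layer weights (one row per hidden unit) and biases.
    w2, b2: output-layer weights and bias.
    """
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, state)) + bias)
        for row, bias in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


# Toy example with made-up weights (not trained on HalluRAG).
probability = mlp_hallucination_probability(
    state=[1.0, -1.0],
    w1=[[1.0, 0.0], [0.0, 1.0]],
    b1=[0.0, 0.0],
    w2=[2.0, 5.0],
    b2=-1.0,
)
is_hallucinated = probability >= 0.5  # 0.5 is an assumed decision threshold
```

In practice the weights would come from training on the labeled sentences in the dataset, and the input vector would be far larger than the two-element example shown here.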
Citation:
Ridder, F., & Schilling, M. (2024).
The HalluRAG Dataset: Detecting Closed-Domain Hallucinations in RAG Applications
Using an LLM's Internal States. arXiv preprint arXiv:2412.17056v1
If you want to test with a fixed dataset, you can pre-populate the Pinecone index manually using the vector store utilities in backend/server.py.
Project layout:

    hallucination-rag/
    ├── backend/                        # Python Flask AI backend
    │   ├── server.py                   # Main Flask app + RAG + hallucination detection
    │   ├── config.py                   # Model/threshold configuration
    │   └── requirements.txt            # Python dependencies
    │
    ├── api/                            # Node.js MySQL REST API
    │   ├── server.js                   # Express server (port 3001)
    │   └── db.js                       # MySQL connection
    │
    ├── frontend/                       # Static HTML/JS UI
    │   ├── index.html                  # Main chat app + integrated welcome hero
    │   ├── welcome.html                # Welcome page design (reference)
    │   └── public/                     # Favicons, web manifest
    │
    ├── docs/                           # Documentation & architecture
    │   ├── README.md                   # Project documentation
    │   ├── plan.md                     # Technical planning notes
    │   ├── DockerPlan.md               # Docker setup (archived)
    │   ├── Flow_of_rag/                # Architecture diagrams & flow charts
    │   └── screenshots/                # UI screenshots & demos
    │
    ├── scripts/                        # Helper shell scripts
    │   ├── start_backend.sh            # Start Flask server
    │   ├── cleanup.sh                  # Stop & cleanup processes
    │   └── init-ollama.sh              # Initialize Ollama models
    │
    ├── model/                          # Embeddings & model storage
    │   └── (generated on first run)
    │
    ├── START.sh                        # Main startup script (all services)
    ├── QUICK_START.md                  # Setup & execution guide
    ├── TEST_CASES_0_HALLUCINATION.md   # Test suite (0% hallucination cases)
    ├── .env                            # Environment variables (user-created from .env.example)
    ├── .env.example                    # Template for API keys & configuration
    ├── .gitignore                      # Git ignore rules
    ├── package.json                    # Node.js dependencies
    ├── package-lock.json               # Locked dependency versions
    ├── README.md                       # This file
    └── image-*.png                     # Screenshots for documentation

| Tool | Version | Purpose |
|---|---|---|
| Python | 3.10+ | Flask backend |
| Node.js | 18+ | MySQL API |
| Ollama | Latest | Local LLM + embeddings |
| MySQL | 8.0+ | Chat log persistence |
| SerpAPI key | - | Web search |
| Pinecone key | - | Vector database |
| HuggingFace token | - | CrossEncoder model |
For the fastest setup, see QUICK_START.md or run:

    chmod +x START.sh
    ./START.sh

This automatically sets up all services (Ollama, Flask backend, Node.js API, frontend) in tmux, or prints instructions for manual terminal setup.

Clone the repository:

    git clone https://github.com/<your-username>/hallucination-rag.git
    cd hallucination-rag

Create the environment file:

    cp .env.example .env
    # Fill in your API keys (see .env.example for all required keys)

Pull the Ollama models:

    ollama pull smollm2:360m
    ollama pull llama3.2:latest
    ollama pull nomic-embed-text

Create the MySQL database and table:

    CREATE DATABASE rag_app;
    USE rag_app;
    CREATE TABLE chat_logs (
      id INT AUTO_INCREMENT PRIMARY KEY,
      query TEXT NOT NULL,
      llm_response TEXT,
      rag_response TEXT,
      is_hallucinated TINYINT(1) DEFAULT 0,
      hallucination_score FLOAT,
      classification VARCHAR(50),
      sentence_count INT DEFAULT 0,
      sources_count INT DEFAULT 0,
      sources JSON,
      model_id VARCHAR(150),
      response_time_ms INT,
      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );

Start the Flask backend:

    python -m venv venv
    source venv/bin/activate
    pip install -r backend/requirements.txt
    python backend/server.py
    # Running at http://127.0.0.1:8080

Start the Node.js API:

    npm install
    node api/server.js
    # Running at http://127.0.0.1:3001

Open frontend/index.html in VS Code with Live Server, or visit http://127.0.0.1:5500
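
For readers who want to write to the `chat_logs` table from Python (for example in a test script), the sketch below builds a parameterized `INSERT` from a dictionary whose keys match the column names in the schema above. It is an illustration only: the project itself logs through the Node.js API, and `build_chat_log_insert` is a hypothetical helper. The `%s` placeholders follow the style used by common Python MySQL drivers such as PyMySQL and mysql-connector-python.

```python
import json
from typing import Any, Dict, Tuple

# Writable columns of chat_logs (id and created_at are filled in by MySQL).
CHAT_LOG_COLUMNS = (
    "query", "llm_response", "rag_response", "is_hallucinated",
    "hallucination_score", "classification", "sentence_count",
    "sources_count", "sources", "model_id", "response_time_ms",
)


def build_chat_log_insert(log: Dict[str, Any]) -> Tuple[str, tuple]:
    """Return (sql, params) for inserting one chat log row.

    Only known columns are used, so stray keys cannot end up in the SQL
    text. Values are passed as parameters, never interpolated.
    """
    if "query" not in log:
        raise ValueError("'query' is required (the column is NOT NULL)")
    row = dict(log)
    # The sources column is JSON: serialize lists/dicts before binding.
    if row.get("sources") is not None and not isinstance(row["sources"], str):
        row["sources"] = json.dumps(row["sources"])
    columns = [c for c in CHAT_LOG_COLUMNS if c in row]
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO chat_logs ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, tuple(row[c] for c in columns)
```

The returned pair can be passed directly to a driver's `cursor.execute(sql, params)`.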
| Method | Path | Description |
|---|---|---|
| GET | `/api/health` | Health check |
| GET | `/api/chat/stream?q=<query>` | SSE stream (LLM + RAG + hallucination) |
| GET | `/api/models` | Currently configured model names |
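
A minimal sketch of how a Python client might consume the `/api/chat/stream` endpoint listed above. It handles only the `data:` field of the Server-Sent Events format and yields each event's payload as a string. The payload structure is not documented in this README, so the example does not try to decode it. The base URL assumes the Flask backend on port 8080, and `iter_sse_data` and `stream_chat` are hypothetical names.

```python
from typing import Iterable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event.

    An event ends at a blank line; multiple data lines in one event are
    joined with newlines. Comment lines (starting with ':') and fields
    other than 'data' are ignored. A trailing event without a closing
    blank line is dropped.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)


def stream_chat(query: str,
                base_url: str = "http://127.0.0.1:8080") -> Iterator[str]:
    """Open the stream for one query and yield event payloads as they arrive."""
    url = f"{base_url}/api/chat/stream?{urlencode({'q': query})}"
    with urlopen(url) as response:
        yield from iter_sse_data(line.decode("utf-8") for line in response)
```

With the backend running, `for payload in stream_chat("Who wrote Dune?"): print(payload)` would print each event as it is received.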
| Method | Path | Description |
|---|---|---|
| POST | `/api/save` | Save a chat log to MySQL |
| GET | `/api/history` | Fetch last 100 chat logs |
| DELETE | `/api/history/:id` | Delete a specific log |
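
The logging endpoints above can be exercised from Python with the standard library alone. In this sketch the JSON field names are an assumption: they mirror the `chat_logs` column names, but the body that `api/server.js` actually expects is not documented here. `build_save_request` and `save_chat_log` are hypothetical helpers, and the base URL assumes the Express server on port 3001.

```python
import json
from typing import Any, Dict
from urllib.request import Request, urlopen


def build_save_request(log: Dict[str, Any],
                       base_url: str = "http://127.0.0.1:3001") -> Request:
    """Build a POST /api/save request carrying the log as a JSON body."""
    body = json.dumps(log).encode("utf-8")
    return Request(
        f"{base_url}/api/save",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_chat_log(log: Dict[str, Any],
                  base_url: str = "http://127.0.0.1:3001") -> int:
    """Send the log and return the HTTP status code."""
    with urlopen(build_save_request(log, base_url), timeout=10) as response:
        return response.status


# ASSUMPTION: keys mirror the chat_logs columns; adjust to the real API.
example_log = {
    "query": "Who wrote Dune?",
    "llm_response": "Frank Herbert.",
    "is_hallucinated": 0,
}
request = build_save_request(example_log)
```

Separating request construction from sending keeps the payload easy to inspect or unit-test without a running server.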
See .env.example for the full list. Required:

    SERPAPI_API_KEY=your_serpapi_key
    PINECONE_API_KEY=your_pinecone_key
    HF_API_TOKEN=your_huggingface_token
    OLLAMA_BASE_URL=http://localhost:11434

| Layer | Technology |
|---|---|
| LLM (Chat) | Ollama (smollm2:360m) |
| LLM (RAG) | Ollama (llama3.2:latest) |
| Embeddings | Ollama (nomic-embed-text) |
| Vector DB | Pinecone |
| Web Search | SerpAPI |
| Hallucination Detection | HuggingFace CrossEncoder (NLI) |
| Backend (AI) | Python / Flask / LangChain |
| Backend (Logging) | Node.js / Express |
| Database | MySQL 8 |
| Frontend | Vanilla HTML / CSS / JS |






