Multi-Agent RAG System with Self-Verification and Self-Correction
A production-ready Retrieval-Augmented Generation (RAG) system powered by LangGraph, featuring intelligent document processing, hybrid retrieval, and multi-agent verification.
- Multi-Agent System: Specialized agents for relevance, research, and verification
- Self-Verification: Automated fact-checking against source documents
- Self-Correction: Retry mechanism for unverified answers (max 1 retry)
- Hybrid Retrieval: BM25 + Vector search with customizable weights (0.4/0.6)
- Document Processing: Powered by Docling (OCR, tables, markdown conversion)
- Smart Caching: SHA-256 based content caching with expiration
- LangGraph Orchestration: State machine workflow with visual graph
- ✅ 66 tests with 93% coverage
- ✅ Type-safe with MyPy
- ✅ Linted with Ruff
- ✅ Pre-commit hooks
- ✅ Structured logging
- ✅ Environment-based config
- ✅ Error handling & retries
- PDF (with OCR)
- DOCX
- TXT
- Markdown
```text
┌─────────────────────────────────────────────────────────────┐
│                     Veritas RAG System                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │  Relevance   │   │   Research   │   │ Verification │     │
│  │    Agent     │──▶│    Agent     │──▶│    Agent     │     │
│  │ (GPT-4o-mini)│   │   (Claude)   │   │ (GPT-4o-mini)│     │
│  └──────────────┘   └──────────────┘   └──────────────┘     │
│         │                  │                  │             │
│         └──────────────────┴──────────────────┘             │
│                            │                                │
│                   ┌────────▼────────┐                       │
│                   │    LangGraph    │                       │
│                   │  Orchestrator   │                       │
│                   └────────┬────────┘                       │
│                            │                                │
│                   ┌────────▼────────┐                       │
│                   │ Hybrid Retriever│                       │
│                   │  (BM25+Vector)  │                       │
│                   └────────┬────────┘                       │
│                            │                                │
│                   ┌────────▼────────┐                       │
│                   │    Document     │                       │
│                   │    Processor    │                       │
│                   │    (Docling)    │                       │
│                   └─────────────────┘                       │
└─────────────────────────────────────────────────────────────┘
```
| Component | Technology | Purpose |
|---|---|---|
| Document Processor | Docling | PDF parsing, OCR, table extraction |
| Embeddings | OpenAI text-embedding-3-small | Vector representations |
| Vector Store | ChromaDB | Persistent vector storage |
| BM25 Retriever | LangChain | Keyword-based search |
| Hybrid Retriever | Custom | Weighted ensemble (0.4 BM25 + 0.6 Vector) |
| Relevance Agent | GPT-4o-mini | Question-document relevance |
| Research Agent | GPT-4o (or Claude Sonnet 4) | Answer generation |
| Verification Agent | GPT-4o-mini | Fact-checking |
| Orchestrator | LangGraph | State machine workflow |
| Frontend | Streamlit | Web interface |
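The weighted-ensemble idea behind the hybrid retriever can be illustrated with a small, dependency-free sketch. The real system combines LangChain's BM25 and Chroma retrievers; here, weighted reciprocal-rank fusion stands in for that ensemble, and `hybrid_rank` is an illustrative name, not the project's API:

```python
def hybrid_rank(bm25_ranking: list[str], vector_ranking: list[str],
                weights: tuple[float, float] = (0.4, 0.6), c: int = 60) -> list[str]:
    """Fuse two rankings with weighted reciprocal-rank fusion.

    `bm25_ranking` and `vector_ranking` are document IDs ordered best-first.
    Each document scores weight / (c + rank); `c` dampens the effect of rank
    position, and a higher fused score means a better document.
    """
    scores: dict[str, float] = {}
    for weight, ranking in zip(weights, (bm25_ranking, vector_ranking)):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (c + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

With the default 0.4/0.6 weights, a document that appears in both rankings outranks one that appears in only one of them, which is the practical effect of the ensemble.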
```mermaid
graph TD
    START([START]) --> relevance[Relevance Check]
    relevance -->|CAN_ANSWER| research[Research/Answer Generation]
    relevance -->|NO_MATCH/PARTIAL| finalize[Finalize]
    research --> verify[Verify Answer]
    verify -->|YES| finalize
    verify -->|NO + retry < max| increment[Increment Retry]
    verify -->|NO + retry >= max| finalize
    increment --> research
    finalize --> END([END])

    style START fill:#90EE90
    style END fill:#FFB6C1
    style research fill:#87CEEB
    style verify fill:#DDA0DD
    style finalize fill:#F0E68C
```
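In LangGraph terms, the conditional edges above correspond to routing functions over the shared state. A minimal sketch (the state keys `relevance`, `verified`, and `retry_count` are illustrative, not the project's actual schema; `MAX_RETRIES` mirrors the one-retry cap):

```python
from typing import TypedDict

MAX_RETRIES = 1  # the system allows at most one self-correction pass


class RAGState(TypedDict):
    relevance: str    # "CAN_ANSWER" | "PARTIAL" | "NO_MATCH"
    verified: bool    # verification agent's verdict
    retry_count: int


def route_after_relevance(state: RAGState) -> str:
    # Only fully answerable questions proceed to research.
    return "research" if state["relevance"] == "CAN_ANSWER" else "finalize"


def route_after_verify(state: RAGState) -> str:
    if state["verified"]:
        return "finalize"
    if state["retry_count"] < MAX_RETRIES:
        return "increment"  # retry: bump the counter, regenerate the answer
    return "finalize"       # give up after max retries
```

In the real graph these would be passed to `add_conditional_edges`, with the returned strings naming the next node.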
1. Relevance Check (`CAN_ANSWER|PARTIAL|NO_MATCH`)
   - Filters irrelevant questions early
   - Saves API calls on off-topic queries
2. Research (if relevant)
   - Generates a factual answer from the retrieved documents
   - Uses GPT-4o or Claude Sonnet 4
3. Verification
   - Fact-checks the answer against the source documents
   - Returns `YES|NO` with a detailed report
4. Self-Correction (if verification fails)
   - Increments the retry counter
   - Regenerates the answer (max 1 retry)
   - Routes to finalize if max retries reached
5. Finalize
   - Sets the confidence level:
     - `HIGH` → Verified answer
     - `MEDIUM` → Unverified but best effort (after retries)
     - `LOW` → Cannot answer (no relevant docs)
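The confidence mapping in the finalize step reduces to a small pure function (a sketch; the function name and boolean inputs are illustrative, not the project's actual API):

```python
def confidence(relevant: bool, verified: bool) -> str:
    """Map workflow outcomes to the HIGH/MEDIUM/LOW confidence levels."""
    if not relevant:
        return "LOW"     # no relevant docs: cannot answer
    if verified:
        return "HIGH"    # answer passed verification
    return "MEDIUM"      # best effort after exhausting retries
```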
- Python 3.11+
- OpenAI API key
- Anthropic API key (for Claude)
```bash
git clone https://github.com/Richard-GOZAN/veritas-rag.git
cd veritas-rag

curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
```

Create a `.env` file based on `.env.example`.

```bash
uv run streamlit run app.py
```

Open your browser at http://localhost:8501.
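A minimal `.env` might look like the following sketch. The API key variable names follow the providers' standard conventions and the key values are placeholders; check `.env.example` for the authoritative list:

```shell
# API keys (placeholder values)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Research model
RESEARCH_MODEL=gpt-4o
RESEARCH_PROVIDER=openai

# Retrieval and caching
HYBRID_RETRIEVER_WEIGHTS=[0.4, 0.6]
CACHE_EXPIRE_DAYS=30
```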
Switch the research model in `.env`:

```bash
# Use Claude (better reasoning, higher cost)
RESEARCH_MODEL=claude-sonnet-4-20250514
RESEARCH_PROVIDER=anthropic

# Use GPT-4o (faster, lower cost)
RESEARCH_MODEL=gpt-4o
RESEARCH_PROVIDER=openai
```

Adjust the BM25 vs. vector balance:
```bash
# More keyword-focused (better for exact matches)
HYBRID_RETRIEVER_WEIGHTS=[0.6, 0.4]

# More semantic (better for concepts)
HYBRID_RETRIEVER_WEIGHTS=[0.3, 0.7]

# Balanced (default)
HYBRID_RETRIEVER_WEIGHTS=[0.4, 0.6]
```

```bash
# Keep cache for 30 days
CACHE_EXPIRE_DAYS=30

# Disable cache (always reprocess)
CACHE_EXPIRE_DAYS=0
```

```text
veritas-rag/
├── src/
│   ├── agents/                  # Multi-agent system
│   │   ├── base.py              # Base agent with retry logic
│   │   ├── relevance_agent.py
│   │   ├── research_agent.py
│   │   └── verification_agent.py
│   ├── config/                  # Settings & constants
│   │   ├── constants.py
│   │   └── settings.py
│   ├── processors/              # Document processing
│   │   └── document.py          # Docling + caching
│   ├── retriever/               # Hybrid retrieval
│   │   └── builder.py           # BM25 + Vector
│   ├── workflow/                # LangGraph orchestration
│   │   ├── graph.py             # Workflow logic
│   │   └── state.py             # State definitions
│   └── utils/
│       └── logger.py            # Structured logging
├── tests/
│   ├── unit/                    # 57 unit tests
│   └── integration/             # 9 integration tests
├── docs/                        # Documentation
│   ├── workflow_graph.mmd       # Mermaid diagram
│   └── workflow_graph.png       # Rendered diagram
├── scripts/
│   └── visualize_workflow.py
├── app.py                       # Streamlit interface
├── pyproject.toml               # Dependencies & config
├── .env.example                 # Environment template
├── .pre-commit-config.yaml      # Code quality hooks
└── README.md                    # Readme
```
```bash
uv run pytest
uv run pytest --cov=src --cov-report=html

# Unit tests only
uv run pytest tests/unit/

# Integration tests only
uv run pytest tests/integration/

# Specific module
uv run pytest tests/unit/test_workflow.py
```

```bash
uv run mypy src/
uv run ruff check src/
uv run ruff format src/
```

- Docling - Advanced document parsing
- LangChain - RAG framework
- LangGraph - Workflow orchestration
- Anthropic Claude - Research generation
- OpenAI GPT - Relevance & verification
- ChromaDB - Vector storage
MIT License - see LICENSE file for details
Built with ❤️ using LangGraph, Claude, and GPT-4o