Legal RAG is a Retrieval-Augmented Generation system that answers questions about legal contracts and documents, grounding each answer in the ingested source material.
- PDF document ingestion with intelligent section detection
- Vector database storage with semantic search
- FastAPI endpoints for both human-readable and structured JSON responses
- Ollama integration for local LLM inference
- Proper citation of sources in responses
- Python 3.10+
- Ollama installed and running
- The following Ollama models:
  - llama3.2 (for text generation)
  - mxbai-embed-large (for embeddings)
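If these models are not already available locally, they can be pulled with the Ollama CLI (assuming the ollama command is on your PATH; model names match the configuration below):

# Pull the generation and embedding models
ollama pull llama3.2
ollama pull mxbai-embed-large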
# Create virtual environment
uv venv
# Activate virtual environment
source .venv/bin/activate # Linux/macOS
# or
.venv\Scripts\activate # Windows
# Install dependencies from project file
uv pip install -e .
# Or install from requirements.txt
uv add -r requirements.txt

Create a .env file in the project root:
CHROMA_DB_DIR=./chroma_db
CHROMA_COLLECTION_NAME=legal_contracts
OLLAMA_EMBED_MODEL=mxbai-embed-large
OLLAMA_LLM_MODEL=llama3.2
OLLAMA_API_URL=http://localhost:11434
API_HOST=0.0.0.0
API_PORT=8000
DEFAULT_TOP_K=3
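These settings are typically loaded at startup. As a rough sketch only (not the project's actual app/config.py; field names and defaults here are assumptions mirroring the .env example above), a pydantic-settings model could look like:

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Field names are assumed; defaults mirror the .env example above
    model_config = SettingsConfigDict(env_file=".env")

    chroma_db_dir: str = "./chroma_db"
    chroma_collection_name: str = "legal_contracts"
    ollama_embed_model: str = "mxbai-embed-large"
    ollama_llm_model: str = "llama3.2"
    ollama_api_url: str = "http://localhost:11434"
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    default_top_k: int = 3

settings = Settings()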
# Ingest PDFs into the vector store
python scripts/ingest_pdfs.py --pdf-dir ./data/pdfs

# Start the API server
uvicorn main:app --reload --log-level=debug --host 0.0.0.0 --port 8000

# Query the text endpoint for a human-readable answer
curl -X POST "http://localhost:8000/query/text" \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the tax obligations in the asset purchase agreement?", "top_k": 3}'

# Query the JSON endpoint for a structured response
curl -X POST "http://localhost:8000/query/json" \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the tax obligations in the asset purchase agreement?", "top_k": 3}'

Project structure:

legalrag/
├── app/
│   ├── ingest/          # PDF ingestion and processing
│   ├── llm/             # LLM chain integration
│   ├── store/           # Vector database interface
│   ├── config.py        # Application settings
│   ├── schemas.py       # Pydantic models
│   └── main.py          # FastAPI application
├── data/
│   └── pdfs/            # PDFs to be ingested
├── scripts/
│   └── ingest_pdfs.py   # CLI tool for ingestion
└── main.py              # Application entry point
- PDF Parser (app/ingest/pdf_parser.py): Extracts text from PDFs and identifies logical sections.
- Chunker (app/ingest/chunker.py): Splits sections into smaller chunks for effective retrieval.
- Ingest Module (app/ingest/ingest.py): Coordinates the ingestion process.
- Chroma Store (app/store/chroma_store.py): Manages document embeddings and retrieval using ChromaDB.
- RAG Chain (app/llm/chain.py): Implements the retrieval-augmented generation pattern using Ollama (see the sketch after this list).
- FastAPI App (main.py): Provides endpoints for text and JSON responses.
- Schemas (app/schemas.py): Defines data models for requests and responses.
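To make the retrieve-then-generate flow concrete, here is a minimal sketch of the pattern the RAG Chain implements. It is an illustration under assumed internals, not the contents of app/llm/chain.py; it calls the chromadb client and Ollama's HTTP API directly, with model names and paths taken from the .env example above:

import requests
import chromadb

OLLAMA_URL = "http://localhost:11434"

def embed(text: str) -> list[float]:
    # Embed the query with the configured Ollama embedding model
    resp = requests.post(f"{OLLAMA_URL}/api/embeddings",
                         json={"model": "mxbai-embed-large", "prompt": text})
    return resp.json()["embedding"]

def answer(query: str, top_k: int = 3) -> str:
    # Retrieve the most relevant chunks from the Chroma collection
    client = chromadb.PersistentClient(path="./chroma_db")
    collection = client.get_or_create_collection("legal_contracts")
    results = collection.query(query_embeddings=[embed(query)], n_results=top_k)
    context = "\n\n".join(results["documents"][0])

    # Ask the LLM to answer using only the retrieved context
    prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {query}"
    resp = requests.post(f"{OLLAMA_URL}/api/generate",
                         json={"model": "llama3.2", "prompt": prompt, "stream": False})
    return resp.json()["response"]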
- The system automatically identifies sections in legal documents
- Citations include document ID, section, and page numbers
- JSON responses include confidence scores and recommended next actions
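For illustration only, a structured JSON response might look roughly like the following; the exact field names are defined in app/schemas.py, and the values below are placeholders rather than real output:

{
  "answer": "...",
  "confidence": 0.87,
  "citations": [
    {"document_id": "contract_001", "section": "Section 8.2", "page": 14}
  ],
  "next_actions": ["Review the related indemnification provisions"]
}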
Evaluation and tracing are supported via Ragas and Langfuse.
If langfuse_tracing_enabled is set, tracing is enabled and each LLM input and response is stored in Langfuse along with its latency.
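Langfuse reads its credentials from its standard environment variables; a plausible addition to the .env file looks like the following (the tracing flag name is an assumption based on the setting mentioned above):

LANGFUSE_TRACING_ENABLED=true
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com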
To run an evaluation:
# generate dataset for eval
python3 eval_main.py --create-sample --sample-output sample_questions.json
# run the evaluation
python3 eval_main.py --dataset sample_questions.json --output evaluation_results.json --top-k 3
