A ready-to-run RAG simulation with:
- LangChain tool calling (OpenAI)
- LangGraph A2A (Planner → Executor → Verifier)
- Hybrid retrieval (BM25 + Chroma vector search) + cross-encoder re-ranking
- Citations + confidence scoring (hallucination reduction)
- SQL tool (SQLite) + Dummy API tools (FastAPI)
- Email notification (Gmail SMTP)
- Evaluation: local offline metrics + optional LangSmith (if configured)
- Deployable: Docker + docker-compose + GitHub Actions CI
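The Planner → Executor → Verifier flow is a small state machine. The real implementation uses LangGraph; a plain-Python sketch of the idea (node names and heuristics here are illustrative, not the repo's actual code):

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    question: str
    plan: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    answer: str = ""
    confidence: float = 0.0

def planner(state: AgentState) -> AgentState:
    # Decide which tools/retrievers to call (toy heuristic for illustration).
    state.plan = (["retrieve_docs", "query_sql"]
                  if "order" in state.question.lower()
                  else ["retrieve_docs"])
    return state

def executor(state: AgentState) -> AgentState:
    # Run each planned step; here we just record placeholder evidence.
    state.evidence = [f"result of {step}" for step in state.plan]
    state.answer = "draft answer grounded in: " + ", ".join(state.evidence)
    return state

def verifier(state: AgentState) -> AgentState:
    # Score grounding: did every planned step produce evidence?
    state.confidence = len(state.evidence) / max(len(state.plan), 1)
    return state

def run(question: str) -> AgentState:
    state = AgentState(question=question)
    for node in (planner, executor, verifier):  # linear edge order
        state = node(state)
    return state
```

In LangGraph the same shape becomes a `StateGraph` with three nodes and directed edges between them.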
```
retail_rag_sim/
  src/retail_rag_sim/   # Python package
  data/docs/            # markdown KB docs
  data/seed.sql         # demo retail DB seed
  api/dummy_api.py      # dummy tools API (FastAPI)
  ui/app.py             # Streamlit UI
  tests/                # basic tests
  docker/               # Dockerfile + docker-compose
```
Note: This repo intentionally avoids the `langchain` meta-package to reduce breakage from packaging changes. It uses `langchain-core`, `langchain-community`, and `langchain-openai`.
- Python 3.10+
- An OpenAI API key (this project is OpenAI-only)
```bash
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -U pip
pip install -r requirements.txt
pip install -e .
export PYTHONPATH=./src     # Windows (PowerShell): $env:PYTHONPATH="./src"
cp .env.example .env
# Edit .env and set OPENAI_API_KEY
python -m retail_rag_sim.tools.seed_db
python -m retail_rag_sim.retrieval.ingest
uvicorn api.dummy_api:app --reload --port 8001
```

In a new terminal (same venv + PYTHONPATH):

```bash
streamlit run ui/app.py
```

Open: http://localhost:8501
- “What is the return window for in-store pickup?”
- “Can I designate someone else to pick up my order?”
- “How much tax did I pay on order R-10002 and what was the total?”
- “What are store hours for ST-CHI-01?”
- “What appointment slots are available for ST-CHI-01 repair service?”
- Grounding: answers must rely on retrieved snippets + tool outputs
- Hybrid retrieval: BM25 + vector search
- Re-ranking: cross-encoder ranks top candidates
- Citations: sources listed in output
- Confidence scoring: verifier agent returns a confidence score and next action recommendation
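One common way to merge the BM25 and vector rankings before re-ranking is reciprocal rank fusion (RRF); the repo's actual fusion may weight scores differently, so treat this as a sketch of the idea:

```python
def rrf_fuse(bm25_ranking: list[str], vector_ranking: list[str],
             k: int = 60) -> list[str]:
    """Merge two ranked lists of doc IDs with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in (bm25_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# d2 floats to the top because both retrievers rank it highly;
# the fused list is then passed to the cross-encoder re-ranker.
fused = rrf_fuse(["d1", "d2", "d3"], ["d2", "d4", "d1"])
```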
Run:
```bash
python -m retail_rag_sim.eval.run_local_eval
```

This prints a simple metric summary over `data/eval_examples.jsonl`.
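The exact metrics live in `run_local_eval`; conceptually, an offline eval over a JSONL file looks like the toy keyword-recall check below (the field names `answer` and `expected_keywords` are assumptions, not the repo's actual schema):

```python
import json

def eval_examples(path: str) -> dict[str, float]:
    """Score a JSONL eval file: fraction of expected keywords
    that appear in each example's answer."""
    hits, total = 0, 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            ex = json.loads(line)
            answer = ex["answer"].lower()
            for kw in ex["expected_keywords"]:
                total += 1
                hits += kw.lower() in answer
    return {"keyword_recall": hits / total if total else 0.0}
```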
If you want LangSmith:
- Set in `.env`:

  ```
  LANGSMITH_TRACING=true
  LANGSMITH_API_KEY=...
  LANGSMITH_PROJECT=retail-rag-sim
  ```

- Run:

  ```bash
  python -m retail_rag_sim.eval.run_langsmith_eval
  ```

This step is optional.

Gmail typically requires an App Password. Set in `.env`:
[email protected]
GMAIL_SMTP_APP_PASSWORD=xxxx xxxx xxxx xxxx
The agent can call the `send_email` tool when prompted (e.g., “Email me a pickup checklist”).
From the repo root:
```bash
cd docker
docker compose -f docker-compose.yml up --build
```

Services:
- Dummy API: http://localhost:8001
- UI: http://localhost:8501
For Docker usage, pass environment variables via an `.env` file or `docker compose` environment settings.
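The compose file itself isn't reproduced here; an illustrative skeleton under assumed service names, build contexts, and commands (only the ports come from this README) might look like:

```yaml
services:
  dummy-api:
    build: ..
    command: uvicorn api.dummy_api:app --host 0.0.0.0 --port 8001
    env_file: ../.env
    ports:
      - "8001:8001"
  ui:
    build: ..
    command: streamlit run ui/app.py --server.port 8501
    env_file: ../.env
    ports:
      - "8501:8501"
    depends_on:
      - dummy-api
```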
GitHub Actions workflow:
- installs deps
- seeds DB
- runs tests
- runs ruff lint
File: `.github/workflows/ci.yml`
- Import errors: ensure `PYTHONPATH=./src`
- OpenAI auth: ensure `OPENAI_API_KEY` is set
- Vector ingest slow: the first run downloads re-ranker model weights
- LangSmith errors: run the local eval if you don't set `LANGSMITH_API_KEY`
- Add/replace KB docs in `data/docs/*.md`
- Add new tools in `src/retail_rag_sim/tools/` and register them in `agents/graph.py`
- Extend the DB schema and tool queries (still SELECT-only for governance)
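The SELECT-only constraint can be enforced with a small guard before any query reaches SQLite; a minimal sketch (the repo's actual tool may validate differently, and a string prefix check like this does not cover every case, e.g. `WITH ... SELECT` CTEs):

```python
import sqlite3

def run_readonly_query(db_path: str, sql: str) -> list[tuple]:
    """Reject anything that isn't a single SELECT, then execute it."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        raise ValueError("multiple statements are not allowed")
    if not stripped.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(stripped).fetchall()
    finally:
        conn.close()
```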