Deep Research Agent — a FastAPI + LangGraph pipeline that converts a research topic into a cited Markdown report via multi-step web search, LLM fact extraction, ChromaDB RAG, and structured synthesis.

    flowchart LR
        User --> QueryGenerator
        QueryGenerator --> TavilySearcher
        TavilySearcher --> FactExtractor
        FactExtractor --> GapPlanner
        GapPlanner --> TavilySearcherWave2["TavilySearcher (wave 2)"]
        TavilySearcherWave2 --> FactExtractorWave2["FactExtractor (wave 2)"]
        FactExtractorWave2 --> ChromaDBRAG
        ChromaDBRAG --> ReportWriter["ReportWriter (LLM)"]
        ReportWriter --> CitedMarkdownReport

- Multi-pass depth: Supports a 2-pass search approach (gap planning and follow-up searching).
- SSE streaming: Real-time progress updates sent to the client via Server-Sent Events.
- SQLite job history: Keeps a local record of executed research jobs (`data/jobs.sqlite`).
- Export formats: Outputs reports in Markdown (MD), HTML, and DOCX formats.
- Q&A on report: Allows follow-up questions and answers based on the generated report.
- TL;DR: Optionally generates a short summary of the findings.
- Translation: Supports translating the final report.
- Counterarguments: Generates opposing perspectives and limitations.
- Executive summary: A concise overview of the core topic before the detailed findings.
- Heuristic mode: Can be run without an LLM (`LLM_PROVIDER=heuristic`), relying solely on templates and snippets.
- Ollama support: Run locally and privately using an Ollama LLM setup.
- Docker deployment: Fully containerized with a simple `docker-compose.yml` for instant startup.
- ChromaDB RAG: Uses a local vector store and `sentence-transformers` to fetch relevant facts for LLM synthesis (`ENABLE_RAG=1`).
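
For the SSE streaming feature, a client reads a `text/event-stream` response line by line. The parser below is a minimal sketch of that wire format only. The event names (`progress`, `done`) and the JSON fields in the sample are hypothetical, since the events this service actually emits are not listed here.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per Server-Sent Event from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format allows one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream: event names and payload fields are assumptions.
SAMPLE = [
    "event: progress\n",
    'data: {"step": "search", "pct": 40}\n',
    "\n",
    ": keep-alive\n",
    "event: done\n",
    'data: {"report_id": "abc"}\n',
    "\n",
]

events = list(parse_sse(SAMPLE))
payloads = [json.loads(e["data"]) for e in events]
```

In a real client the `lines` argument would come from a streaming HTTP response rather than a list.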
You can run the application in three ways. First, copy `.env.example` to `.env` and fill in your API keys.

1. Docker (one command): `docker compose up`
2. Python local: `pip install -r requirements.txt`, then `python main.py serve`
3. CLI: `python main.py "quantum computing"`

| Variable | Description |
|---|---|
| `LLM_PROVIDER` | `anthropic` (default), `heuristic` (no LLM, only Tavily), or `ollama` (local). |
| `ANTHROPIC_API_KEY` | Your Anthropic Claude API key. |
| `TAVILY_API_KEY` | Your Tavily search API key (required for web searches). |
| `OLLAMA_MODEL` | Local Ollama model name (e.g., `llama3.2`). |
| `OLLAMA_BASE_URL` | Local Ollama API URL (e.g., `http://127.0.0.1:11434`). |
| `ENABLE_SUGGEST_TOPICS` | Set `1` to enable dynamic topic suggestions on the frontend. |
| `ENABLE_EXECUTIVE_SUMMARY` | Set `1` to prepend an executive summary to the report. |
| `ENABLE_ASK` | Set `1` to enable Q&A functionality. |
| `ENABLE_COUNTERARGUMENTS` | Set `1` to enable counterargument generation. |
| `ENABLE_TLDR` | Set `1` to generate a TL;DR section. |
| `ENABLE_TRANSLATE` | Set `1` to enable translation endpoints. |
| `ENABLE_JOB_DB` | Set `1` to save jobs to SQLite. Set `0` to disable history. |
| `JOB_DB_PATH` | Path to the SQLite database (e.g., `./data/jobs.sqlite`). |
| `RATE_LIMIT_PER_MIN` | Rate limit for `POST /research` per IP address. `0` disables limits. |
| `ENABLE_RAG` | Set `1` to enable ChromaDB + sentence-transformers vector store RAG. |

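
As an illustration of how the variables above might be read at startup, here is a small sketch using only the standard library. The `Settings` class, the `load_settings` helper, and every default except `LLM_PROVIDER=anthropic` are assumptions made for the example, not the project's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _flag(env: Mapping[str, str], name: str, default: str = "0") -> bool:
    """Treat the value "1" as enabled and anything else as disabled."""
    return env.get(name, default).strip() == "1"


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    enable_rag: bool
    enable_job_db: bool
    job_db_path: str
    rate_limit_per_min: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    if provider not in {"anthropic", "heuristic", "ollama"}:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    return Settings(
        llm_provider=provider,
        enable_rag=_flag(env, "ENABLE_RAG"),
        enable_job_db=_flag(env, "ENABLE_JOB_DB"),
        # Assumed fallback path, taken from the example in the table.
        job_db_path=env.get("JOB_DB_PATH", "./data/jobs.sqlite"),
        # 0 means "no rate limit", as described in the table.
        rate_limit_per_min=int(env.get("RATE_LIMIT_PER_MIN", "0")),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"LLM_PROVIDER": "ollama", "ENABLE_RAG": "1"})
```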
The application is built around a dynamic LangGraph state machine. It takes a user topic and first turns it into specific search queries. The engine then runs those queries against Tavily in parallel. Facts are extracted from the retrieved HTML and deduplicated with a Jaccard similarity score, so that near-identical claims are kept only once.
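
Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to word sets to drop near-duplicate facts. The tokenizer, the `0.7` threshold, and the function names are illustrative assumptions; the project's own scoring function may differ.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set used as the comparison unit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, with two empty sets treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedupe(facts: list[str], threshold: float = 0.7) -> list[str]:
    """Keep a fact only if it is below the threshold against every kept fact."""
    kept: list[str] = []
    kept_tokens: list[set[str]] = []
    for fact in facts:
        t = tokens(fact)
        if all(jaccard(t, k) < threshold for k in kept_tokens):
            kept.append(fact)
            kept_tokens.append(t)
    return kept


facts = [
    "Qubits can exist in a superposition of states.",
    "Qubits can exist in a superposition of states!",
    "Quantum error correction protects qubits from noise.",
]
unique = dedupe(facts)  # the second fact is dropped as a near-duplicate
```

This greedy approach compares each new fact against all kept facts, which is quadratic but simple enough for the few hundred facts a single research job is likely to produce.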
If the user selects two-pass depth, the engine activates a gap planner that reviews the facts extracted so far and generates follow-up queries to find missing context or counter-perspectives. These queries feed a second wave of Tavily searches and fact extraction. Finally, the collected facts are embedded into an ephemeral ChromaDB vector store for Retrieval-Augmented Generation (RAG). The most relevant subset of these facts is retrieved and passed to the LLM (Anthropic Claude 3.5 Sonnet), which synthesizes a structured, cited Markdown report. In heuristic mode, no LLM is used; the application relies strictly on search snippets and templating logic.
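
The one-pass versus two-pass control flow can be sketched in plain Python, without LangGraph, Tavily, or an LLM. Everything here is hypothetical: `research`, the callable parameters, and the stub corpus stand in for the real graph nodes so the example runs offline.

```python
from typing import Callable

Search = Callable[[str], list[str]]               # query -> raw snippets
Extract = Callable[[list[str]], list[str]]        # snippets -> facts
PlanGaps = Callable[[str, list[str]], list[str]]  # topic, facts -> follow-up queries


def research(
    topic: str,
    queries: list[str],
    search: Search,
    extract: Extract,
    plan_gaps: PlanGaps,
    depth: int = 1,
) -> list[str]:
    """Collect facts in one or two search waves."""
    facts: list[str] = []

    def wave(qs: list[str]) -> None:
        for q in qs:
            for fact in extract(search(q)):
                if fact not in facts:  # exact-match dedup only, for brevity
                    facts.append(fact)

    wave(queries)
    if depth >= 2:
        # Second wave: the planner sees the first-wave facts and asks for more.
        wave(plan_gaps(topic, facts))
    return facts


# Stub components so the sketch runs without network access.
corpus = {
    "what is X": ["X is a thing."],
    "X limitations": ["X is slow."],
}
search = lambda q: corpus.get(q, [])
extract = lambda snippets: list(snippets)
plan_gaps = lambda topic, facts: [f"{topic} limitations"]

one_pass = research("X", ["what is X"], search, extract, plan_gaps, depth=1)
two_pass = research("X", ["what is X"], search, extract, plan_gaps, depth=2)
```

Passing the components in as callables mirrors how graph nodes can be swapped, for example replacing an LLM-based extractor with a template-based one in heuristic mode.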
Run the standard pytest suite: `python -m pytest -v`

Generate a test coverage report: `python -m pytest --cov=. --cov-report=term-missing`

MIT License. See LICENSE for details.