An enterprise-grade AI-powered analytics platform that transforms raw e-commerce data into actionable financial intelligence through parallel agent orchestration and advanced profitability analysis.
Profit Detective is a production-ready system that analyzes e-commerce operations to identify profit leaks, logistics inefficiencies, and customer retention opportunities. Built on LangGraph's state machine architecture, it orchestrates three specialized AI agents in parallel to deliver comprehensive CFO-level reports in under 90 seconds.
Core Capabilities:
- Automated multi-source data ingestion and normalization (9 CSV tables → unified SQLite schema)
- Parallel execution of 3 specialized analysis agents using LangGraph StateGraph
- AI-driven insight synthesis via OpenRouter (Qwen3-235B-A22B model)
- Professional PDF report generation with embedded visualizations
- Interactive Streamlit dashboard with real-time progress tracking
- Production-ready Docker deployment with health monitoring
Business Impact:
- Identifies freight cost inefficiencies consuming 12-18% of revenue
- Detects at-risk customer segments representing 15-25% of total LTV
- Quantifies margin erosion by product category and geographic region
- Projects ROI for operational improvements with specific R$ impact calculations
```text
┌─────────────────────────────────────────────────────────────────────────┐
│                          DATA INGESTION LAYER                           │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                 ┌──────────────────┴──────────────────┐
                 ▼                                     ▼
       ┌──────────────────┐                  ┌──────────────────┐
       │ Python Processor │                  │   Go Processor   │
       │  (Primary Path)  │                  │ (Optional Fast)  │
       │                  │                  │                  │
       │ • ThreadPool     │                  │ • Goroutines     │
       │ • Pandas         │                  │ • gRPC Server    │
       │ • 9 CSV → 1 DB   │                  │ • Concurrent I/O │
       └────────┬─────────┘                  └────────┬─────────┘
                │                                     │
                └──────────────────┬──────────────────┘
                                   ▼
                      ┌─────────────────────┐
                      │  SQLite Database    │
                      │  + Profiler JSON    │
                      │ (Source of Truth)   │
                      └──────────┬──────────┘
                                 │
┌─────────────────────────────────────────────────────────────────────────┐
│                       AGENT ORCHESTRATION LAYER                         │
│                         (LangGraph StateGraph)                          │
└─────────────────────────────────────────────────────────────────────────┘
                                 │
                ┌────────────────┼────────────────┐
                ▼                ▼                ▼
        ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
        │   Agent A    │ │   Agent B    │ │   Agent C    │
        │              │ │              │ │              │
        │  Logistics   │ │  Retention   │ │  SQL Expert  │
        │   Analyst    │ │  Specialist  │ │              │
        │              │ │              │ │              │
        │ • Freight    │ │ • RFM        │ │ • KPIs       │
        │ • Delivery   │ │ • Cohorts    │ │ • Profit     │
        │ • Geography  │ │ • LTV        │ │ • Payments   │
        └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
               │                │                │
               └────────────────┼────────────────┘
                                ▼
                     ┌─────────────────────┐
                     │  Cross-Validation   │
                     │  + Synthesis Node   │
                     └──────────┬──────────┘
                                │
┌─────────────────────────────────────────────────────────────────────────┐
│                           PRESENTATION LAYER                            │
└─────────────────────────────────────────────────────────────────────────┘
                                │
                 ┌──────────────┴──────────────┐
                 ▼                             ▼
        ┌──────────────┐             ┌──────────────────┐
        │ CFO Executive│             │    Streamlit     │
        │  PDF Report  │             │    Dashboard     │
        │              │             │                  │
        │ • Matplotlib │             │ • Plotly Charts  │
        │ • fpdf2      │             │ • Live Progress  │
        │ • KPI Cards  │             │ • FastAPI Server │
        └──────────────┘             └──────────────────┘
```
| Component | Technology | Purpose |
|---|---|---|
| Data Ingestion | Python (pandas, ThreadPoolExecutor) | Concurrent CSV loading, cleaning, and denormalization |
| Alternative Processor | Go (goroutines, gRPC) | High-performance processing for large datasets (optional) |
| Database | SQLite | Unified master table with pre-computed metrics (43 columns) |
| Orchestration | LangGraph 0.6+ (StateGraph) | Stateful agent pipeline with parallel execution |
| AI Layer | OpenRouter API (Qwen3-235B-A22B) | Narrative synthesis and insight generation |
| Agent A | Logistics Analyst | Shipping delays, freight costs, geographic inefficiencies |
| Agent B | Retention Specialist | RFM segmentation, customer cohorts, LTV modeling |
| Agent C | SQL Expert | Profitability queries, KPI calculations, payment analysis |
| Reporting | fpdf2 + Matplotlib | Professional multi-page CFO reports with charts |
| Dashboard | Streamlit + Plotly | Interactive BI interface with real-time updates |
| API | FastAPI + Uvicorn | RESTful endpoints for concurrent analysis sessions |
| Containerization | Docker (multi-stage build) | Production deployment with health checks |
- 3 agents run concurrently (not sequentially) using LangGraph's fan-out/fan-in pattern
- 70-second total runtime vs 150+ seconds sequential
- All agents operate on the same SQLite database with zero lock contention (read-only queries)
- Automatic consistency checks ensure all agents agree on core metrics
- Revenue, freight, and customer counts validated across agents
- Discrepancies >1% trigger investigation flags
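A tolerance check of this kind can be sketched in a few lines (the function name and return shape are illustrative, not the actual cross-validation node):

```python
def cross_validate(metric: str, value_a: float, value_b: float,
                   tolerance: float = 0.01) -> dict:
    """Flag a metric when two agents disagree by more than `tolerance` (1%)."""
    baseline = max(abs(value_a), abs(value_b), 1e-9)  # guard against div-by-zero
    variance = abs(value_a - value_b) / baseline
    return {
        "metric": metric,
        "variance_pct": round(variance * 100, 2),
        "passed": variance <= tolerance,
    }

# Agents agree exactly on total freight, so the check passes
print(cross_validate("total_freight", 2_251_909.0, 2_251_909.0)["passed"])  # True
```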
- Master table schema: 112,650 rows × 43 columns
- Pre-computed derived metrics: `freight_ratio`, `delivery_delta_days`, `is_late_delivery`, `contribution_margin`
- Proper handling of multi-item orders (deduplication of payment values)
- Timestamp parsing with timezone awareness
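As an illustration of the derived-metric logic (column names follow the schema section below, with `order_estimated_delivery_date` assumed from the Olist source data; the actual `data_processor.py` may compute these differently):

```python
import pandas as pd

def add_derived_metrics(df: pd.DataFrame) -> pd.DataFrame:
    """Compute the pre-calculated columns at the order-item grain."""
    out = df.copy()
    out["freight_ratio"] = out["freight_value"] / out["price"]
    delivered = pd.to_datetime(out["order_delivered_customer_date"])
    purchased = pd.to_datetime(out["order_purchase_timestamp"])
    estimated = pd.to_datetime(out["order_estimated_delivery_date"])
    out["delivery_days"] = (delivered - purchased).dt.days
    out["delivery_delta_days"] = (delivered - estimated).dt.days
    out["is_late_delivery"] = out["delivery_delta_days"] > 0
    # Illustrative margin: price minus freight (the real logic may net out more costs)
    out["contribution_margin"] = out["price"] - out["freight_value"]
    return out
```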
- LLM generates 4-6 actionable insights per agent
- Severity tagging: `[CRITICAL]`, `[WARNING]`, `[OPPORTUNITY]`
- Quantified impact statements with R$ amounts
- Deterministic fallback rules when LLM unavailable
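A deterministic fallback can be as simple as threshold rules over a KPI dict (the thresholds and key names here are illustrative, not the shipped rules):

```python
def fallback_insights(kpis: dict) -> list:
    """Generate tagged insights without an LLM, from hard-coded thresholds."""
    insights = []
    if kpis.get("freight_pct", 0.0) > 15.0:
        insights.append(
            f"[CRITICAL] Freight consumes {kpis['freight_pct']:.1f}% of product revenue"
        )
    if kpis.get("late_delivery_pct", 0.0) > 10.0:
        insights.append(
            f"[WARNING] {kpis['late_delivery_pct']:.1f}% of orders delivered late"
        )
    if kpis.get("repeat_rate_pct", 100.0) < 5.0:
        insights.append("[OPPORTUNITY] Repeat purchase rate below 5%; retention upside")
    return insights

print(fallback_insights({"freight_pct": 16.8, "repeat_rate_pct": 2.98}))
```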
- Multi-page PDF with corporate branding (navy/gold color scheme)
- KPI cards with benchmark comparisons
- Embedded Matplotlib charts (180 DPI)
- Executive summary in boardroom-ready prose (300-400 words)
- Docker multi-stage build (lean runtime image)
- Non-root user execution
- Health check endpoints
- Environment-based configuration (no secrets in code)
- Graceful error handling and logging
- Python 3.10 or 3.11 (tested on 3.10+, 3.11 recommended)
- OpenRouter API Key (get one at https://openrouter.ai/keys)
- Docker (optional, for containerized deployment)
- Go 1.21+ (optional, for high-performance processor)
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd ZZ
   ```

2. Create a virtual environment:

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Configure the environment:

   ```bash
   cp .env.example .env
   # Edit .env and add your OPENROUTER_API_KEY
   ```

5. Verify the data files:

   ```bash
   ls -lh data/
   # Should show 9 CSV files totaling ~126 MB
   ```
1. Configure the environment:

   ```bash
   cp .env.example .env
   # Add your OPENROUTER_API_KEY to .env
   ```

2. Build and run:

   ```bash
   docker compose up --build
   ```

3. Access the dashboard at http://localhost:8501
The Docker setup includes:
- Persistent storage for SQLite, PDFs, and charts
- Memory limits (2GB max, 512MB reserved)
- Health monitoring (every 30s)
- Automatic restart on failure
Run the complete analysis pipeline (ingestion → agents → report):
```bash
python run.py
```

Output:

```text
╔═══════════════════════════════════════════════╗
║    E-COMMERCE PROFIT DETECTIVE — Pipeline     ║
╚═══════════════════════════════════════════════╝
▶ Starting Data Processor...
  ✓ customers       99,441 rows   5 cols
  ✓ order_items    112,650 rows   7 cols
  ✓ orders          99,441 rows   8 cols
  ...
▶ Starting Agentic Analysis Pipeline...
═══ Agent A — Logistics ═══
═══ Agent B — Retention ═══
═══ Agent C — SQL Expert ═══
═══ Cross-Validation ═══
═══ Synthesis ═══
▶ Generating CFO Executive Report...
╔═══════════════════════════════════════════════╗
║              PIPELINE COMPLETE                ║
║  Elapsed: 72.3s                               ║
║  Report : output/cfo_executive_report.pdf     ║
╚═══════════════════════════════════════════════╝
```
Launch the Streamlit dashboard for interactive analysis:
```bash
python run.py --port 8501
# Or directly:
streamlit run dashboard/app.py
```

Dashboard Features:
- Run Analysis button triggers pipeline with live progress bar
- Executive Dashboard section with 8 KPI cards
- Logistics Analysis with state freight heatmap
- Retention Intelligence with RFM segment breakdown
- Profitability View with category margin charts
- Download PDF button for the generated report
The dashboard includes a FastAPI backend for running multiple analyses simultaneously:
```text
POST /api/run-analysis                  # Start new analysis session
GET  /api/sessions/{session_id}/status  # Check progress
GET  /api/sessions/{session_id}/result  # Retrieve final state
```

Each session gets isolated paths:

- `output/sessions/{session_id}/db.sqlite`
- `output/sessions/{session_id}/report.pdf`
- `output/sessions/{session_id}/charts/`
The system processes the Olist Brazilian E-Commerce dataset (9 CSV files):
| Table | Rows | Key | Description |
|---|---|---|---|
| `customers` | 99,441 | `customer_unique_id` | Unique customer records with geographic info |
| `orders` | 99,441 | `order_id` | Order-level timestamps and status |
| `order_items` | 112,650 | `order_id`, `order_item_number` | Line items with product/seller/price/freight |
| `order_payments` | 103,886 | `order_id` | Payment types, installments, values |
| `order_reviews` | 99,224 | `order_id` | Review scores and timestamps |
| `products` | 32,951 | `product_id` | Product dimensions and category names |
| `sellers` | 3,095 | `seller_id` | Seller location info |
| `product_translations` | 71 | `product_category_name` | Portuguese → English mappings |
| `geolocation` | 1,000,163 | `geolocation_zip_code_prefix` | Lat/long for zip codes |
The ingestion pipeline creates a denormalized table at the order-item grain with 43 columns:
Core Fields: `order_id`, `order_item_number`, `customer_unique_id`, `seller_id`, `product_id`

Financial: `price`, `freight_value`, `total_payment_value`, `freight_ratio`, `contribution_margin`

Timestamps: `order_purchase_timestamp`, `order_delivered_customer_date`, `shipping_limit_date`

Derived Metrics: `delivery_days`, `estimated_delivery_days`, `delivery_delta_days`, `shipping_processing_days`, `is_late_delivery` (boolean flag)

Aggregated: `review_score`, `payment_type`, `max_installments`

Geographic: `customer_state`, `customer_city`, `seller_state`, `seller_city`

Product: `product_category_name_english`, `product_weight_g`, `product_length_cm`, `product_height_cm`, `product_width_cm`
The system uses LangGraph StateGraph for stateful agent management:
```python
# State definition
GlobalState = TypedDict("GlobalState", {
    "profiler_schema": dict,
    "logistics_report": dict,
    "retention_report": dict,
    "profitability_report": dict,
    "business_anomalies": Annotated[List[str], add],  # Reducer: concatenate
    "sql_query_history": Annotated[List[str], add],
    "executive_summary": str,
    # ... more fields
})
```

Graph topology:

```text
initialize → ├─ logistics  ─┐
             ├─ retention  ─┼─ cross_validate → synthesize → END
             └─ sql_expert ─┘
```

Key Design Decisions:

- Parallel execution: All 3 agents run concurrently (~70s total)
- Add reducers: Each agent returns only new anomalies; LangGraph merges them
- No double-writes: Initial state avoids pre-populating node-owned fields (prevents `INVALID_CONCURRENT_GRAPH_UPDATE`)
- Cross-validation node: Runs after all agents complete; verifies consistency
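To see why the `add` reducer matters: when parallel branches each return a partial state, LangGraph applies the reducer to list fields instead of overwriting them. A simplified stand-in for that merge (not the library's code):

```python
from operator import add
from typing import Annotated, List, TypedDict

class GlobalState(TypedDict, total=False):
    business_anomalies: Annotated[List[str], add]

def merge_updates(state: dict, *updates: dict) -> dict:
    """Concatenate list fields the way an `add` reducer would."""
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            merged[key] = add(merged.get(key, []), value)
    return merged

# Each agent returns ONLY its new anomalies, so no shared key is overwritten
state = {"business_anomalies": []}
from_a = {"business_anomalies": ["[CRITICAL] freight at 16.8% of revenue"]}
from_b = {"business_anomalies": ["[WARNING] repeat purchase rate 2.98%"]}
merged = merge_updates(state, from_a, from_b)
print(len(merged["business_anomalies"]))  # 2
```

If the initial state pre-populated a field that a node also writes, two branches would race on the same key; returning only deltas and letting the reducer concatenate avoids that class of error.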
SQL-Heavy Analysis:
- Delivery performance: avg days, late %, freight as % of revenue
- State-level freight bleed (identifies states with above-average freight costs)
- Late delivery hotspots by category and state
- Processing bottlenecks by seller state
- Freight/weight correlation by category
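A state-level freight-bleed query of this shape can be expressed directly against the master table (the table name `master_table` is an assumption here; use whatever name the ingestion step creates):

```python
import sqlite3

FREIGHT_BLEED_SQL = """
SELECT customer_state,
       ROUND(100.0 * SUM(freight_value) / SUM(price), 1) AS freight_pct,
       SUM(freight_value) AS total_freight,
       AVG(delivery_days) AS avg_delivery_days
FROM master_table
GROUP BY customer_state
HAVING freight_pct > (SELECT 100.0 * SUM(freight_value) / SUM(price)
                      FROM master_table)
ORDER BY freight_pct DESC
"""

def freight_bleed(conn: sqlite3.Connection) -> list:
    """Return states whose freight ratio exceeds the overall average."""
    return conn.execute(FREIGHT_BLEED_SQL).fetchall()
```

Note that referencing the `freight_pct` alias inside `HAVING` is a SQLite extension; stricter engines would require repeating the expression.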
Insight Examples:
```text
[CRITICAL] Freight drains 16.8% of product revenue — R$2,251,909 out of
R$13,401,336. Each 1pp reduction saves R$134,013. Immediate action: renegotiate
carrier contracts (target 3% reduction = R$67,557).

[WARNING] Top freight-bleed states: RR (24.3%, R$12,405 freight, 23.1d avg delivery);
AP (22.7%, R$8,932 freight, 21.5d avg delivery). Combined freight in above-avg
states: R$487,203. Establish Northeast distribution hub to cut costs by 20-40%.
```
RFM Segmentation:
- Computes Recency, Frequency, Monetary scores for each customer
- Segments: Whale, Loyal, At-Risk, Lost, Casual
- Customer segmentation with revenue contribution analysis
- LTV metrics (avg, median, P90, P99)
- Cohort acquisition curves
Segment Definitions:
- Whale: F ≥ 3, M ≥ P90, R < 180 days → High-value repeat customers
- Loyal: F ≥ 2, M ≥ median, R < 180 days → Solid repeat base
- At-Risk: F ≥ 2, R > 180 days → Previous buyers going dormant
- Lost: F ≥ 2, R > 365 days → Churned repeat customers
- Casual: F = 1 → Single-purchase customers
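Applied in priority order, the rules above reduce to a small classifier (illustrative; repeat buyers with below-median spend fall back to "Casual" here, a case the definitions leave open):

```python
def classify_segment(frequency: int, monetary: float, recency_days: int,
                     monetary_p90: float, monetary_median: float) -> str:
    """Map one customer's R/F/M values to a named segment."""
    if frequency == 1:
        return "Casual"
    if recency_days > 365:
        return "Lost"
    if recency_days > 180:
        return "At-Risk"
    if frequency >= 3 and monetary >= monetary_p90:
        return "Whale"
    if monetary >= monetary_median:
        return "Loyal"
    return "Casual"  # repeat buyer below median spend; unspecified in the rules

print(classify_segment(3, 950.0, 40, monetary_p90=900.0, monetary_median=160.0))  # Whale
```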
Insight Examples:
```text
[URGENT] 4,832 At-Risk customers with R$1,247,903 historical revenue at risk
of churning. 20% win-back rate = R$249,581 recoverable. Action: 3-wave sequence
with 15% coupon, SMS at Day 7, free shipping at Day 14.

[CRITICAL] Repeat purchase rate is 2.98% — 97.02% of customers are single-purchase.
This is a leaky bucket business model. Target: >20%. Estimated cost: R$154,262
per 1pp improvement.
```
Profitability Queries:
- KPI summary: total orders, customers, AOV, contribution margin
- Category profitability: revenue, margin %, freight costs
- AOV trends over time
- Payment method analysis (with installment data)
- Seller performance by state
Guardrails:
- All queries are read-only (enforced by `SafeSQL` wrapper)
- Column names validated against profiler JSON
- No subqueries or joins to external schemas
- Query timeout: 30 seconds
- Result limit: 10,000 rows
Insight Examples:
```text
[OPPORTUNITY] AOV at R$159.85 across 96,096 delivered orders. A 5% lift adds
R$768,348 incremental revenue — equivalent to acquiring 4,652 new customers at
zero acquisition cost. Action: Free-shipping threshold at R$207.00 and bundle
promotions.

Contribution margin: 83.2% after freight consumes 16.8% of product revenue.
Renegotiating carrier contracts for the top-5 freight-heavy states could recover
2-3 margin points = R$268,027 annual savings.
```
Ensures internal consistency across agents:
| Check | Agents | Threshold |
|---|---|---|
| Total product revenue | Logistics vs SQL Expert | <1% variance |
| Total freight cost | Logistics vs SQL Expert | <1% variance |
| Customer count | Retention vs SQL Expert | <1% variance |
| Payment vs (Product+Freight) | SQL Expert internal | Document gap |
Validation Output:
```text
✓ Product revenue cross-check PASSED: Logistics R$13,401,336 vs KPIs R$13,401,336 (diff: 0.00%)
✓ Freight cross-check PASSED: R$2,251,909 vs R$2,251,909
✓ Customer count cross-check PASSED: 93,358 vs 93,358
```
All settings are centralized in `orchestrator/config.py` and `.env`:

| Variable | Default | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | (required) | API key from openrouter.ai |
| `OPENROUTER_MODEL` | `qwen/qwen3-235b-a22b` | Model ID (see alternatives below) |
| `APP_PORT` | `8501` | Streamlit dashboard port |
| `LOG_LEVEL` | `INFO` | Logging verbosity (DEBUG, INFO, WARNING) |
| `GRPC_HOST` | `localhost` | Go processor gRPC host (optional) |
| `GRPC_PORT` | `50051` | Go processor gRPC port (optional) |
The system is tested with these OpenRouter models:
| Model | Cost | Speed | Quality | Notes |
|---|---|---|---|---|
| `qwen/qwen3-235b-a22b` | $$ | Moderate | Excellent | Default — best insights |
| `google/gemma-2-9b-it:free` | Free | Fast | Good | Budget option for testing |
| `anthropic/claude-3-opus` | $$$ | Slow | Excellent | Premium alternative |
| `openai/gpt-4` | $$$ | Moderate | Excellent | Fallback option |
To change models:
```bash
# In .env
OPENROUTER_MODEL=google/gemma-2-9b-it:free
```

```text
ZZ/
├── .env                        # Environment variables (git-ignored)
├── .env.example                # Template for environment setup
├── requirements.txt            # Python dependencies
├── run.py                      # CLI entry point
├── Dockerfile                  # Multi-stage Docker build
├── docker-compose.yml          # Docker orchestration config
├── orchestrator/
│   ├── __init__.py
│   ├── config.py               # Centralized configuration
│   ├── main.py                 # Pipeline orchestrator
│   ├── data_processor.py       # Python CSV processor
│   ├── safe_sql.py             # Read-only SQL wrapper
│   ├── llm.py                  # OpenRouter API wrapper
│   ├── state.py                # LangGraph GlobalState TypedDict
│   ├── graph.py                # LangGraph StateGraph definition
│   ├── report_generator.py     # PDF generator with charts
│   ├── session.py              # Session management for concurrency
│   └── agents/
│       ├── __init__.py
│       ├── logistics_analyst.py    # Agent A
│       ├── retention_specialist.py # Agent B
│       └── sql_expert.py           # Agent C
├── dashboard/
│   └── app.py                  # Streamlit + FastAPI dashboard
├── data/                       # Olist CSV files (9 files, 126MB)
├── output/                     # Generated artifacts (git-ignored)
│   ├── ecommerce.db            # SQLite database
│   ├── data_profiler.json      # Schema source-of-truth
│   ├── cfo_executive_report.pdf  # Generated report
│   ├── charts/                 # Matplotlib PNGs
│   └── sessions/               # Multi-user session isolation
├── proto/
│   └── processor.proto         # gRPC service definition (optional)
└── go_processor/               # Optional Go processor
    ├── go.mod
    └── main.go
```
- Set `OPENROUTER_API_KEY` in environment (never commit to git)
- Configure `OPENROUTER_MODEL` for cost/quality tradeoff
- Mount `/app/data` volume for CSV files (read-only)
- Mount `/app/output` volume for persistence (read-write)
- Set memory limits: 2GB max, 512MB reserved
- Enable health checks (HTTP GET `/_stcore/health`)
- Configure logging to external sink (e.g., CloudWatch, ELK)
- Set up TLS termination (reverse proxy: nginx, Caddy, Traefik)
- Implement authentication (Streamlit doesn't provide built-in auth)
```bash
# 1. Create .env with production secrets
cat > .env <<EOF
OPENROUTER_API_KEY=sk-or-v1-xxx...
OPENROUTER_MODEL=qwen/qwen3-235b-a22b
APP_PORT=8501
LOG_LEVEL=INFO
EOF

# 2. Deploy with docker compose
docker compose up -d

# 3. Verify health
curl http://localhost:8501/_stcore/health

# 4. View logs
docker compose logs -f app

# 5. Scale horizontally (if needed)
docker compose up -d --scale app=3
```

Example manifest (abbreviated):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: profit-detective
spec:
  replicas: 2
  selector:
    matchLabels:
      app: profit-detective
  template:
    metadata:
      labels:
        app: profit-detective
    spec:
      containers:
        - name: app
          image: profit-detective:latest
          ports:
            - containerPort: 8501
          env:
            - name: OPENROUTER_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openrouter-secret
                  key: api-key
          resources:
            limits:
              memory: "2Gi"
              cpu: "2000m"
            requests:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /_stcore/health
              port: 8501
            initialDelaySeconds: 90
            periodSeconds: 30
```

Tested on: Intel i7-10700K (8C/16T), 32GB RAM, NVMe SSD
| Phase | Time | Notes |
|---|---|---|
| CSV Ingestion (9 files) | 3.2s | Concurrent ThreadPoolExecutor |
| Data Cleaning & Join | 4.1s | 112,650 rows × 43 cols |
| SQLite Write | 0.8s | Indexed on order_id, customer_unique_id |
| Agent A (Logistics) | 22.3s | 6 SQL queries, LLM synthesis |
| Agent B (Retention) | 24.7s | RFM scoring, segmentation logic |
| Agent C (SQL Expert) | 19.8s | 7 SQL queries, LLM synthesis |
| Cross-Validation | 0.4s | 4 consistency checks |
| Synthesis (LLM) | 8.9s | Executive summary generation |
| PDF Generation | 6.2s | 7 charts + 5-page PDF |
| Total Pipeline | 72.3s | (Agents run in parallel) |
Scaling Characteristics:
- Linear scaling with row count (up to ~500K rows tested)
- LLM latency dominates for small datasets (<10K rows)
- Go processor 2.3× faster for ingestion (cold start: 1.4s for 9 files)
1. `OPENROUTER_API_KEY` not set warning

   Solution:

   ```bash
   # Add to .env file
   echo "OPENROUTER_API_KEY=sk-or-v1-your-key-here" >> .env
   ```

2. Failed to connect to OpenRouter error

   Causes:
   - Network connectivity issues
   - Invalid API key
   - Rate limiting (429 status)

   Solution:

   ```bash
   # Check API key validity
   curl https://openrouter.ai/api/v1/auth/key \
     -H "Authorization: Bearer $OPENROUTER_API_KEY"

   # Use free model for testing
   export OPENROUTER_MODEL=google/gemma-2-9b-it:free
   ```

3. `sqlite3.OperationalError: database is locked`

   Cause: Multiple processes writing to the same DB

   Solution:

   ```bash
   # Use session-based isolation (automatic in dashboard)
   # Or delete output/ecommerce.db and re-run
   rm output/ecommerce.db
   python run.py
   ```

4. Memory errors during data processing

   Cause: Insufficient RAM for pandas joins

   Solution:

   ```yaml
   # Increase the Docker memory limit in docker-compose.yml:
   deploy:
     resources:
       limits:
         memory: 4g  # Increase from 2g
   ```

5. Charts not appearing in PDF

   Cause: Missing `output/charts/` directory or permissions

   Solution:

   ```bash
   mkdir -p output/charts
   chmod 755 output/charts
   ```

`POST /api/run-analysis`

Start a new analysis session.
Response:
```json
{
  "session_id": "20260326_143052_a3f118bc",
  "status": "running",
  "message": "Analysis started"
}
```

`GET /api/sessions/{session_id}/status`

Check analysis progress.
Response:
```json
{
  "session_id": "20260326_143052_a3f118bc",
  "status": "running",
  "progress": 45,
  "current_step": "Agent B: Retention analysis running...",
  "elapsed_seconds": 32.1
}
```

`GET /api/sessions/{session_id}/result`

Retrieve final analysis results.
Response:
```json
{
  "session_id": "20260326_143052_a3f118bc",
  "status": "complete",
  "elapsed_seconds": 72.3,
  "logistics_report": {...},
  "retention_report": {...},
  "profitability_report": {...},
  "executive_summary": "...",
  "business_anomalies": [...],
  "pdf_path": "output/sessions/20260326_143052_a3f118bc/report.pdf"
}
```

This is a production system. Contributions should maintain the following standards:
1. Code Quality:
   - Type hints for all function signatures
   - Docstrings for classes and public methods
   - Unit tests for business logic (target: 80% coverage)

2. Performance:
   - No unnecessary synchronous operations in agent code
   - SQL queries must be index-optimized
   - LLM calls cached where deterministic

3. Backward Compatibility:
   - Database schema changes require migration scripts
   - API endpoint changes require versioning (`/api/v2/...`)

4. Documentation:
   - Update README for feature additions
   - Inline comments for complex logic
   - Changelog entries for all PRs
Proprietary — All Rights Reserved.
This software is confidential and proprietary. Unauthorized copying, distribution, or use is strictly prohibited.
- Dataset: Olist Brazilian E-Commerce Public Dataset (Creative Commons)
- LangGraph: State machine framework by LangChain
- OpenRouter: Unified API for LLM access
For technical issues or questions:
- Create an issue in the repository issue tracker
Version: 1.0 | Last Updated: March 2026 | Build Status: Production Ready ✓