An intelligent conversational AI system that enables natural language querying of SQL databases with multi-format output generation. Built using LangGraph for workflow orchestration and Azure OpenAI for natural language processing.
- Natural Language to SQL: Convert plain English queries to optimized SQL
- Multi-format Output: Automatic generation of summaries, tables, and visualizations
- Intelligent Routing: Smart detection of query intent and optimal response format
- Production Ready: Robust error handling, logging, and performance optimization
- Natural Language Processing: Convert plain English to optimized SQL queries
- Multi-format Output: Automatic generation of summaries, tables, and visualizations
- Intelligent Routing: Smart detection of query intent and optimal output format
- Database Optimization: Single connection reuse for improved performance
- Comprehensive Analysis: Support for complex multi-part analytical queries
- Data Visualization: Automatic chart generation with matplotlib/seaborn
- Robust Error Handling: Timeout management and query validation
- Extensive Logging: Complete execution tracking and debugging support
- Your own CSV or Excel data file with business/sales data
- Recommended columns: date, customer_id, order_value, category, product_name, etc.
- Azure OpenAI API access
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Or install in development mode:

pip install -e .

Copy the example environment file and configure your settings:

cp .env.example .env

Edit .env with your Azure OpenAI credentials:
AZURE_OPENAI_API_KEY=your-azure-openai-api-key-here
AZURE_OPENAI_BASE_URL=https://YOUR-RESOURCE-NAME.openai.azure.com/
AZURE_OPENAI_MODEL=gpt-5-preview

Run the setup validation script:

python validate_setup.py

This will check your environment configuration and API connectivity.
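As a rough illustration of the kind of check a validation script performs, the sketch below verifies that the required Azure OpenAI settings are present (the variable names match the .env example above; the function name is illustrative, not the actual validate_setup.py API):

```python
import os

# The three settings the .env example above requires.
REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_BASE_URL",
    "AZURE_OPENAI_MODEL",
]

def missing_settings(env=os.environ):
    """Return the names of required settings that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("Environment looks complete.")
```

The real script additionally tests API connectivity, which requires a live call to the configured endpoint.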
Prepare Your Data:
- Place your CSV or Excel file in the project root directory
- Ensure your data has columns like: date, customer_id, order_value, category, etc.
- Update the file path in import_data.py if needed
Import Your Data:
python import_data.py

This script will:
- Read your CSV/Excel file
- Create a SQLite database with appropriate schema
- Import and structure your data for querying
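The steps above can be sketched with just the standard library (the real import_data.py may use pandas and infer richer column types; here every column is stored as TEXT and the table name `sales` is an assumption):

```python
import csv
import sqlite3

def import_csv(csv_file, conn, table="sales"):
    """Load CSV rows into a SQLite table, deriving the schema from the header row.

    `csv_file` is any open text file object; `conn` is a sqlite3 connection.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # Create one TEXT column per CSV header field.
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    # Insert all remaining rows with positional placeholders.
    placeholders = ", ".join("?" for _ in header)
    rows = [row for row in reader if row]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)
```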
├── langgraph_sql_agent/           # Core system implementation
│   ├── core/                      # Workflow orchestration
│   ├── database/                  # Database management
│   ├── llm/                       # AI model integration
│   ├── nodes/                     # Processing components
│   ├── output/                    # Generated visualizations
│   └── utils/                     # Configuration and utilities
├── main.py                        # Testing and demonstration script
├── interactive_query.py           # Interactive CLI interface
├── import_data.py                 # Database setup script
├── validate_setup.py              # Environment validation
├── requirements.txt               # Python dependencies
├── your_data.csv                  # Your CSV/Excel data (user provided)
├── database.db                    # Generated SQLite database
├── README.md                      # Project documentation
├── PORTFOLIO.md                   # Portfolio overview
└── PROJECT_REFERENCE_GUIDE.md     # Technical reference
Run the main application:
python main.py

For interactive querying:

python interactive_query.py

Execute the comprehensive test suite:

python test_optimized_6_prompts.py

The system supports various types of natural language queries (adapt to your data):
Simple Summaries:
"What was the total sales for this year?"
"How many records are in the database?"
Data Tables:
"Show me the top 10 customers by value"
"List all items by category"
Visualizations:
"Generate a trend plot over time"
"Create a bar chart by category"
Complex Multi-format Analysis:
"Generate a comprehensive analysis with charts and tables"
"Analyze patterns including visualizations and summaries"
The system automatically detects the best output format:
- Summary: Text-based analysis and insights
- Table: Structured data in tabular format
- Plot: Visual charts and graphs (PNG files saved to langgraph_sql_agent/output/)
- Multi: Combination of summary, table, and visualization
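As a simplified stand-in for the system's actual intent-based router (which uses the LLM), format detection can be approximated with keyword heuristics; the cue words below are illustrative:

```python
def route_output(query: str) -> str:
    """Pick an output format from simple keyword cues in the user's question."""
    q = query.lower()
    wants_plot = any(w in q for w in ("plot", "chart", "graph", "visualiz", "trend"))
    wants_table = any(w in q for w in ("table", "list", "top", "show me"))
    wants_summary = any(w in q for w in ("total", "how many", "summary"))
    # More than one signal means a combined, multi-format response.
    if sum([wants_summary, wants_table, wants_plot]) > 1:
        return "multi"
    if wants_plot:
        return "plot"
    if wants_table:
        return "table"
    return "summary"
```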
- Python: 3.9+ (tested with 3.10)
- Database: SQLite (default) or PostgreSQL
- API Access: Azure OpenAI GPT-5 (configured in .env)
- Memory: Minimum 4GB RAM recommended
- Storage: ~100MB for dependencies + data
Key configuration options in .env:
# Azure OpenAI (Required)
AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_BASE_URL=https://your-resource.openai.azure.com/
AZURE_OPENAI_MODEL=gpt-5-preview
# Performance Tuning
MAX_QUERY_TIMEOUT=300 # Increased for complex queries
MAX_RESULT_ROWS=10000
# Database
DATABASE_URL=sqlite:///database.db

The system includes several optimizations:
- Connection Reuse: Single database connection for all queries
- Timeout Management: Extended timeouts for complex analysis
- Quote Normalization: Handles Unicode smart quotes from GPT-5
- SQL Validation: Prevents ORDER BY syntax errors in UNION queries
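The quote-normalization step can be sketched as a simple character translation (a minimal illustration, not the system's exact implementation): curly Unicode quotes occasionally emitted by the model are mapped back to the ASCII quotes SQLite expects.

```python
# Map Unicode smart quotes to their ASCII equivalents, since SQLite only
# recognizes plain ' and " as string/identifier delimiters.
SMART_QUOTES = {
    0x2018: "'", 0x2019: "'",   # left/right single quotes
    0x201C: '"', 0x201D: '"',   # left/right double quotes
}

def normalize_sql(sql: str) -> str:
    """Replace smart quotes in generated SQL with plain ASCII quotes."""
    return sql.translate(SMART_QUOTES)
```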
Run the comprehensive test suite:
python main.py

This will execute various test queries and generate:
- Performance metrics and success rates
- Sample outputs in multiple formats (text, tables, charts)
- Detailed execution logs and metadata
All test outputs are saved to the test_results/ directory.
The system uses a modular LangGraph workflow:
- Intent Parser: Analyzes query intent and requirements
- SQL Generator: Creates optimized SQL queries
- Database Executor: Executes queries with connection reuse
- Output Router: Determines optimal output format(s)
- Format Generators: Creates summaries, tables, and visualizations
- Multi-output Coordinator: Manages complex multi-format responses
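The real graph is wired with LangGraph, but the node contract can be illustrated with plain functions: each node reads the shared state and returns a dict of updates to merge in. The node bodies and SQL below are simplified placeholders, not the system's actual logic:

```python
def parse_intent(state):
    """Toy intent parser: classify the query as summary vs. table."""
    wants_summary = "total" in state["query"].lower()
    return {"intent": "summary" if wants_summary else "table"}

def generate_sql(state):
    """Toy SQL generator: emit a query matching the detected intent."""
    if state["intent"] == "summary":
        return {"sql": "SELECT SUM(order_value) FROM sales"}
    return {"sql": "SELECT * FROM sales LIMIT 10"}

def run_pipeline(query, nodes):
    """Thread a shared state dict through each node in order."""
    state = {"query": query}
    for node in nodes:
        state.update(node(state))
    return state
```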
To extend functionality:
- Add new nodes in langgraph_sql_agent/nodes/
- Update the workflow in langgraph_sql_agent/core/workflow.py
- Add tests in test_optimized_6_prompts.py
- Update configuration in langgraph_sql_agent/utils/config.py
The system is optimized for production use with:
- Database Connection Reuse: 90%+ performance improvement
- Query Optimization: Intelligent SQL generation and validation
- Memory Efficient: Streaming results for large datasets
- Error Recovery: Graceful handling of edge cases and timeouts
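Streaming results for large datasets can be sketched with sqlite3's `fetchmany` (a minimal illustration of the idea, not the system's exact executor): rows are yielded in batches rather than materialized all at once.

```python
import sqlite3

def stream_rows(conn, sql, batch_size=1000):
    """Yield query results in fixed-size batches to bound memory use."""
    cur = conn.execute(sql)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        yield from batch
```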