Merged
63 changes: 58 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -22,6 +22,9 @@ Beautiful theme with smooth transitions, localStorage persistence, and optimized
### 🧠 **Smart Memory**
Context-aware conversations that remember previous Q&A, instant answers from cache, 60-70% accuracy in complex queries.

### 🔑 **Bring Your Own API Key**
Use your own OpenAI API key for complete control over costs and usage. Keys are AES-encrypted at rest in your browser and never stored on the server.

</td>
<td width="50%">

@@ -68,6 +71,15 @@ docker-compose up --build

## ✨ Core Features

### 🔑 User API Key Management
**Complete control over your OpenAI costs:**
- **Bring Your Own Key**: Use your personal OpenAI API key
- **Encrypted at Rest**: AES encryption with device-specific fingerprinting
- **Never Server-Stored**: Keys are stored only in your browser's localStorage
- **Zero Trust**: The server never persists your key; it uses it only to make requests
- **Easy Setup**: Configure via the Settings modal in the UI
- **Fallback Support**: The system falls back to the environment `OPENAI_API_KEY` if no user key is provided

### 🔍 Multi-Tool Search Agent
4 intelligent search tools that auto-select or run in parallel:
- **Vector Search** - Semantic understanding (5x/3x retrieval multipliers)
@@ -304,7 +316,7 @@ docker-compose up --build

```env
# API & Credentials
OPENAI_API_KEY=sk-proj-your-key
OPENAI_API_KEY=sk-proj-your-key # Optional: Can be provided by users via UI
POSTGRES_PASSWORD=your_password
NEO4J_AUTH=neo4j/your_password

@@ -318,28 +330,64 @@ CHUNK_OVERLAP=500
TOP_K_RESULTS=20
```

### User API Key Setup

Users can provide their own OpenAI API key through the UI:

1. **Click Settings Icon** (⚙️) in the top-right corner
2. **Enter Your OpenAI API Key** in the modal
3. **Save** - Key is encrypted and stored in browser localStorage
4. **Use the System** - All API calls use your key automatically

**Security Features:**
- 🔐 **AES Encryption** with browser fingerprint-based key derivation
- 🏠 **Client-Side Storage** - Keys are stored only in your browser
- 🔒 **Zero Server Persistence** - Backend receives keys via request headers only and never stores them
- 🔄 **Easy Management** - Clear/update key anytime via Settings
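
The fingerprint-based key derivation mentioned above can be sketched as follows. This is only an illustration of the idea, not the project's implementation: the README does not say which key-derivation function is used, so PBKDF2-HMAC-SHA256 is an assumption here, and `derive_key`, the fingerprint fields, and the iteration count are hypothetical. The AES step itself is left out because it needs a third-party library.

```python
import hashlib
import os


def derive_key(fingerprint: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte (AES-256-sized) key from a device fingerprint string.

    Assumption: PBKDF2-HMAC-SHA256; the project may use a different KDF.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", fingerprint.encode("utf-8"), salt, iterations, dklen=32
    )


# Hypothetical fingerprint: a few stable browser attributes joined together.
fingerprint = "|".join(["Mozilla/5.0", "en-US", "1920x1080", "UTC+0"])
salt = os.urandom(16)  # random per key; stored next to the ciphertext, not secret
key = derive_key(fingerprint, salt)
```

The same fingerprint and salt always produce the same key, which is what lets the browser decrypt the stored value later. Because the fingerprint is built from values the page can read, a key derived this way ties the ciphertext to the device rather than to a secret.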

---

## 📝 API Examples

### With User-Provided API Key

```bash
# Upload document
# Upload document with your API key
curl -X POST http://localhost:8000/api/rag/upload \
-H "X-OpenAI-API-Key: sk-proj-your-key-here" \
-F "file=@document.pdf"

# Query with memory
# Query with your API key
curl -X POST http://localhost:8000/api/rag/query/stream \
-H "Content-Type: application/json" \
-H "X-OpenAI-API-Key: sk-proj-your-key-here" \
-d '{"query": "What is this about?", "document_id": "your-id"}'

# Update graph (natural language)
# Update graph with your API key
curl -X POST http://localhost:8000/api/rag/query/stream \
-H "Content-Type: application/json" \
-H "X-OpenAI-API-Key: sk-proj-your-key-here" \
-d '{"query": "Create AI node and connect to Python, ML", "document_id": "your-id"}'
```

### Without User API Key (uses system default)

```bash
# Upload document (uses env OPENAI_API_KEY)
curl -X POST http://localhost:8000/api/rag/upload \
-F "file=@document.pdf"

# Query without custom key
curl -X POST http://localhost:8000/api/rag/query/stream \
-H "Content-Type: application/json" \
-d '{"query": "What is this about?", "document_id": "your-id"}'

# Clear memory
curl -X DELETE http://localhost:8000/api/memory/clear
```

**Note:** The `X-OpenAI-API-Key` header is optional. If not provided, the system falls back to the `OPENAI_API_KEY` from environment variables.
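
For a Python client, the optional header can be handled as in the sketch below. It uses only the standard library and the endpoint and payload from the curl examples above; `build_query_request` is a hypothetical helper name, not part of this project.

```python
import json
import urllib.request
from typing import Optional


def build_query_request(
    query: str, document_id: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST request for /api/rag/query/stream.

    The X-OpenAI-API-Key header is added only when api_key is given;
    without it the server is expected to use its own OPENAI_API_KEY.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-OpenAI-API-Key"] = api_key
    body = json.dumps({"query": query, "document_id": document_id}).encode("utf-8")
    return urllib.request.Request(
        "http://localhost:8000/api/rag/query/stream",
        data=body,
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Leaving out `api_key` gives the same request as the "Without User API Key" curl example.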

Full API documentation: http://localhost:8000/docs

---
@@ -397,7 +445,12 @@ LYZR-Hackathon/
## 📜 Version History

### v2.0.0 (Current) - October 15, 2025
Complete dark mode • Smart memory system • Memory-first strategy • Enhanced UI with white text
**Major Features:**
- 🔑 **Bring Your Own API Key** - User-provided OpenAI keys with AES encryption
- 🌙 **Complete Dark Mode** - Beautiful theme with localStorage persistence
- 🧠 **Smart Memory System** - Memory-first strategy with instant cached responses
- 🎨 **Enhanced UI** - Pure white text in dark mode, improved markdown rendering
- 🔒 **Zero-Trust Security** - API keys encrypted at rest, never stored on server

### v1.2.0 - October 13, 2025
Natural language graph updates • 9 operations • Batch connections • Real-time refresh
56 changes: 45 additions & 11 deletions backend/app/api/rag_routes.py
@@ -4,7 +4,7 @@
"""
import logging
import os
from fastapi import APIRouter, UploadFile, File, HTTPException, Depends
from fastapi import APIRouter, UploadFile, File, HTTPException, Depends, Header
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from typing import Optional, List, Literal
@@ -21,6 +21,7 @@
from app.services.graphrag_pipeline import GraphRAGPipeline
from app.services.elasticsearch_service import elasticsearch_service
from app.utils.document_helpers import process_document_pipeline
from app.config import settings

logger = logging.getLogger(__name__)

@@ -29,12 +30,34 @@
router = APIRouter(prefix="/api/rag", tags=["RAG System"])

# Initialize services
search_agent = SearchAgent()
vector_store = VectorStoreService()
doc_processor = DocumentProcessingService()
bm25_service = BM25SearchService()


# ==== Helper Functions ====
def get_openai_api_key(x_openai_api_key: Optional[str] = Header(None)) -> str:
"""
Get OpenAI API key from header or fallback to settings.

Args:
x_openai_api_key: Optional API key from X-OpenAI-API-Key header

Returns:
API key to use (user-provided or system default)
"""
# Use user-provided key if available, otherwise fallback to settings
api_key = x_openai_api_key or settings.openai_api_key

if not api_key:
raise HTTPException(
status_code=400,
detail="OpenAI API key not provided. Please set X-OpenAI-API-Key header or configure OPENAI_API_KEY in environment."
)

return api_key


# ==== Pydantic Models ====
class QueryRequest(BaseModel):
query: str
@@ -59,7 +82,8 @@ class DocumentResponse(BaseModel):
async def upload_document(
file: UploadFile = File(...),
method: ProcessingMethod = "pymupdf",
db: Session = Depends(get_postgres_session)
db: Session = Depends(get_postgres_session),
openai_api_key: str = Depends(get_openai_api_key)
):
"""
Upload a PDF document and process it
@@ -103,7 +127,7 @@ async def upload_document(

# Process document immediately with selected method
try:
await process_document_pipeline(document, db, method=method)
await process_document_pipeline(document, db, method=method, openai_api_key=openai_api_key)
logger.info(f"Document processed successfully with {method}: {document.id}")
except Exception as e:
logger.error(f"Processing failed: {e}")
@@ -136,16 +160,23 @@ async def upload_document(
@router.post("/query/stream")
async def query_stream(
request: QueryRequest,
db: Session = Depends(get_postgres_session)
db: Session = Depends(get_postgres_session),
openai_api_key: str = Depends(get_openai_api_key)
):
"""
Process query with streaming response

- Generates embeddings for query
- Performs vector search
- Streams LLM response with reasoning

Headers:
X-OpenAI-API-Key: Optional user-provided OpenAI API key
"""
try:
# Create SearchAgent with user-provided API key
search_agent = SearchAgent(openai_api_key=openai_api_key)

# Ensure graph is processed for all documents (happens once on first query)
await search_agent.ensure_graph_processed(db)

@@ -158,7 +189,7 @@ async def query_stream(

# Process document if not already processed
if not document.is_processed:
await process_document_pipeline(document, db)
await process_document_pipeline(document, db, openai_api_key=openai_api_key)

document_id = request.document_id

@@ -235,7 +266,8 @@ async def get_document(
@router.post("/documents/{document_id}/process")
async def process_document(
document_id: str,
db: Session = Depends(get_postgres_session)
db: Session = Depends(get_postgres_session),
openai_api_key: str = Depends(get_openai_api_key)
):
"""Manually trigger document processing"""
try:
@@ -251,7 +283,7 @@ async def process_document(
"document_id": str(document.id)
}

await process_document_pipeline(document, db)
await process_document_pipeline(document, db, openai_api_key=openai_api_key)

return {
"success": True,
@@ -326,14 +358,16 @@ async def delete_document(
@router.post("/documents/{document_id}/process-graph")
async def process_document_graph(
document_id: str,
db: Session = Depends(get_postgres_session)
db: Session = Depends(get_postgres_session),
openai_api_key: str = Depends(get_openai_api_key)
):
"""
Process document with GraphRAG to extract knowledge graph

Args:
document_id: Document ID to process
db: Database session
openai_api_key: User-provided OpenAI API key

Returns:
Graph processing status and statistics
@@ -356,8 +390,8 @@ async def process_document_graph(
with open(document.text_filepath, 'r', encoding='utf-8') as f:
text_content = f.read()

# Initialize GraphRAG pipeline for this request
graphrag_pipeline = GraphRAGPipeline()
# Initialize GraphRAG pipeline for this request with user-provided API key
graphrag_pipeline = GraphRAGPipeline(openai_api_key=openai_api_key)

# Process with GraphRAG (chunks text and extracts entities/relationships from each chunk)
graph_result = await graphrag_pipeline.process_document(
13 changes: 10 additions & 3 deletions backend/app/services/embedding_service.py
@@ -3,7 +3,7 @@
Generates embeddings for text chunks
"""
import logging
from typing import List
from typing import List, Optional
import asyncio

from openai import AsyncOpenAI
@@ -15,8 +15,15 @@
class EmbeddingService:
"""Service for generating vector embeddings using OpenAI"""

def __init__(self):
self.client = AsyncOpenAI(api_key=settings.openai_api_key)
def __init__(self, openai_api_key: Optional[str] = None):
"""
Initialize EmbeddingService with optional user-provided OpenAI API key.

Args:
openai_api_key: Optional OpenAI API key. If not provided, uses settings.openai_api_key
"""
api_key = openai_api_key or settings.openai_api_key
self.client = AsyncOpenAI(api_key=api_key)
self.model = settings.openai_embedding_model
self.embedding_dimension = 1536 # for text-embedding-3-small

5 changes: 3 additions & 2 deletions backend/app/services/graphrag_pipeline.py
@@ -22,21 +22,22 @@
class GraphRAGPipeline:
"""Pipeline for extracting knowledge graph from text using GraphRAG"""

def __init__(self, chunk_size: int = 1200, chunk_overlap: int = 400):
def __init__(self, chunk_size: int = 1200, chunk_overlap: int = 400, openai_api_key: str = None):
"""
Initialize GraphRAG Pipeline

Args:
chunk_size: Size of text chunks for entity extraction (default: 1200)
chunk_overlap: Overlap between chunks to capture relationships (default: 400)
openai_api_key: Optional user-provided OpenAI API key
"""
self.chunk_size = chunk_size
self.chunk_overlap = chunk_overlap

self.llm = ChatOpenAI(
model=settings.graphrag_llm_model,
temperature=0,
openai_api_key=settings.openai_api_key
openai_api_key=openai_api_key or settings.openai_api_key
)

# Initialize text splitter for chunking large documents
23 changes: 16 additions & 7 deletions backend/app/services/search_agent.py
@@ -183,19 +183,28 @@ async def on_agent_finish(self, finish: AgentFinish, **kwargs):
class SearchAgent:
"""Advanced Search Agent with 3 tools: Vector Search, Graph Search, Filter Search"""

def __init__(self):
def __init__(self, openai_api_key: Optional[str] = None):
"""
Initialize SearchAgent with optional user-provided OpenAI API key.

Args:
openai_api_key: Optional OpenAI API key. If not provided, uses settings.openai_api_key
"""
# Use provided key or fallback to settings
api_key = openai_api_key or settings.openai_api_key

# Initialize LLM with streaming support
self.llm = ChatOpenAI(
model=settings.openai_model,
temperature=0, # Zero temperature for more deterministic, factual responses
openai_api_key=settings.openai_api_key,
openai_api_key=api_key,
streaming=True,
timeout=300, # 5 minute timeout for OpenAI API calls
request_timeout=300 # 5 minute request timeout
)

# Initialize search services
self.embedding_service = EmbeddingService()
# Initialize search services with the same API key
self.embedding_service = EmbeddingService(openai_api_key=api_key)
self.vector_store = VectorStoreService()
self.bm25_search = BM25SearchService()
self.reranker = RerankerService()
@@ -208,8 +217,8 @@ def __init__(self):
self.agent_executor = None
self.tools = []

# GraphRAG pipeline for processing documents
self.graphrag_pipeline = GraphRAGPipeline()
# GraphRAG pipeline for processing documents with user-provided API key
self.graphrag_pipeline = GraphRAGPipeline(openai_api_key=api_key)
self.graph_processing_started = False

# Initialize Memory Manager
@@ -223,7 +232,7 @@ def __init__(self):
db_user=settings.memory_db_user,
db_password=settings.memory_db_password,
model_name=settings.memory_model,
openai_api_key=settings.openai_api_key,
openai_api_key=api_key,
session_id=settings.memory_session_id
)
logger.info("Memory Manager initialized successfully")