Skip to content

dev-pratap-singh/LYZR-Hackathon

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

6 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Advanced RAG System with GraphRAG & Multi-Tool Search

An intelligent RAG system combining vector search, knowledge graphs, and smart memory for comprehensive document analysis and conversational AI.

โœ… Production Ready โ€ข Version 2.0.0 โ€ข 94.32% F1 Score


๐ŸŽ‰ What's New in v2.0

๐ŸŒ™ Dark Mode

Beautiful theme with smooth transitions, localStorage persistence, and optimized color palette for comfortable viewing.

๐Ÿง  Smart Memory

Context-aware conversations that remember previous Q&A, instant answers from cache, 60-70% accuracy in complex queries.

๐Ÿ”‘ Bring Your Own API Key

Use your own OpenAI API key for complete control over costs and usage. Keys are encrypted at rest with AES encryption and never stored on the server.

โœ๏ธ Graph Updates

9 natural language operations to create, delete, merge, and connect nodes - no Cypher needed!

๐ŸŽจ Enhanced UI

Pure white text in dark mode, markdown rendering with code blocks, smooth animations throughout.


๐Ÿ“ธ Demo & Screenshots

๐Ÿ”‘ Bring Your Own API Key

User API Key Configuration
Secure API key management - Encrypted at rest, never stored on server

๐Ÿ’ฌ Chat Interface & Document Upload

Search Box and Document Upload
Intuitive UI with drag-and-drop PDF upload and real-time search

๐Ÿค– Multi-Tool Search Agent in Action

Search Agent in Action
Agent intelligently selects tools and streams reasoning steps

๐Ÿง  Smart Memory Management & Token Tracking

Memory State and Token Tracking
Real-time memory state visualization with cost tracking and context utilization

๐Ÿ’ญ Reasoning with Memory Integration

Reasoning with Smart Memory
Transparent reasoning process with memory-first strategy for instant answers

๐Ÿ•ธ๏ธ Knowledge Graph Visualization

Interactive Knowledge Graph
Interactive D3.js graph with clickable nodes and relationship exploration

๐Ÿ“Š Entities & Relationships Data

Entities and Relationships
Detailed entity and relationship tables extracted from documents

๐ŸŒ Multi-Document Knowledge Graph

Multi-Document Graph
Unified knowledge graph spanning multiple documents with cross-document connections

๐Ÿ”„ 3-Pass GraphRAG Enrichment

3-Pass Graph Creation
Multi-pass enrichment: Initial extraction โ†’ Missing entities โ†’ Indirect relationships

๐Ÿš€ Quick Start

# 1. Clone repository with submodules
git clone --recurse-submodules <repository-url>
cd LYZR-Hackathon

# If already cloned without submodules:
git submodule update --init --recursive

# 2. Setup environment
cp .env-example .env
# Edit .env with your OpenAI API key

# 3. Start services
docker-compose up --build

# 4. Access the system
# Frontend:  http://localhost:3000
# API Docs:  http://localhost:8000/docs
# Neo4j:     http://localhost:7474

Prerequisites: Docker, OpenAI API Key, 8GB+ RAM

โš ๏ธ Important: This project uses a git submodule for the memory system from https://github.com/dev-pratap-singh/memory. Make sure to clone with --recurse-submodules flag or run git submodule update --init --recursive after cloning.


โœจ Core Features

๐Ÿ”‘ User API Key Management

Complete control over your OpenAI costs:

  • Bring Your Own Key: Use your personal OpenAI API key
  • Encrypted at Rest: AES encryption with device-specific fingerprinting
  • Never Server-Stored: Keys stay in your browser's localStorage
  • Zero Trust: Server never persists your key, only uses it for requests
  • Easy Setup: Configure via Settings modal in the UI
  • Fallback Support: System can use environment key if user key not provided

๐Ÿ” Multi-Tool Search Agent

4 intelligent search tools that auto-select or run in parallel:

  • Vector Search - Semantic understanding (5x/3x retrieval multipliers)
  • Graph Search - Multi-hop relationship traversal (1-hop, 2-hop)
  • Filter Search - Metadata and date filtering via Elasticsearch
  • Graph Update - Natural language graph modifications

๐Ÿ•ธ๏ธ GraphRAG with 3-Pass Enrichment

  • Pass 1: Broad entity extraction
  • Pass 2: Find missing referenced entities
  • Pass 3: Discover indirect relationships
  • Result: 30-50% richer knowledge graphs, 25 concurrent chunks

๐Ÿง  Smart Memory Management

  • Memory-First: Checks history before searching documents
  • Cost Savings: Instant cached responses
  • Performance: 60-70% accuracy in complex needle-in-haystack tests
  • Control: One-click memory clearing

๐Ÿ“Š Exceptional Performance

Metric Score
Context Precision 99.99%
Context Recall 94.32%
F1 Score 94.32%
Memory Speed Instant
Query Speed <2s

๐Ÿ—๏ธ Architecture

System Overview

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                      React Frontend                            โ”‚
โ”‚  โ€ข Dark Mode UI  โ€ข Real-time Streaming  โ€ข Graph Visualization  โ”‚
โ”‚  โ€ข Memory State Display  โ€ข Token Usage Tracking                โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ”‚ HTTP/SSE
                               โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    FastAPI Backend (v2.0)                      โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”      โ”‚
โ”‚  โ”‚              ๐Ÿง  Memory Manager                       โ”‚      โ”‚
โ”‚  โ”‚  โ€ข Conversation History  โ€ข Token Tracking            โ”‚      โ”‚
โ”‚  โ”‚  โ€ข Context Compression   โ€ข Memory-First Strategy     โ”‚      โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜      โ”‚
โ”‚                             โ”‚                                  โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”      โ”‚
โ”‚  โ”‚           ๐Ÿค– Multi-Tool Search Agent                 โ”‚      โ”‚
โ”‚  โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚      โ”‚
โ”‚  โ”‚  โ”‚  Vector    โ”‚ โ”‚  Graph   โ”‚ โ”‚  Filter  โ”‚ โ”‚ Graph โ”‚  โ”‚      โ”‚
โ”‚  โ”‚  โ”‚  Search    โ”‚ โ”‚  Search  โ”‚ โ”‚  Search  โ”‚ โ”‚Update โ”‚  โ”‚      โ”‚
โ”‚  โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚      โ”‚
โ”‚  โ”‚  โ€ข Smart Tool Selection  โ€ข MAX_PERFORMANCE Mode      โ”‚      โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜      โ”‚
โ”‚                             โ”‚                                  โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”      โ”‚
โ”‚  โ”‚         ๐Ÿ•ธ๏ธ GraphRAG Pipeline (3-Pass)                โ”‚      โ”‚
โ”‚  โ”‚  Pass 1: Entity Extraction                           โ”‚      โ”‚
โ”‚  โ”‚  Pass 2: Missing Entities                            โ”‚      โ”‚
โ”‚  โ”‚  Pass 3: Indirect Relationships                      โ”‚      โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜      โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€-โ”˜
                             โ”‚                                โ”‚
         โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”
         โ”‚                   โ”‚                                โ”‚      โ”‚
         โ–ผ                   โ–ผ                                โ–ผ      โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€---โ”
โ”‚   PostgreSQL    โ”‚  โ”‚   PGVector   โ”‚  โ”‚    Neo4j    โ”‚  โ”‚Elasticsearchโ”‚
โ”‚   (Metadata +   โ”‚  โ”‚  (Embeddings โ”‚  โ”‚  (Knowledge โ”‚  โ”‚  (Metadata  โ”‚
โ”‚    Memory)      โ”‚  โ”‚   1536-dim)  โ”‚  โ”‚    Graph)   โ”‚  โ”‚   Search)   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€---โ”€โ”˜

Agent Tool Choice Workflow

                      โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                      โ”‚   User Query        โ”‚
                      โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                 โ”‚
                                 โ–ผ
                      โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                      โ”‚  ๐Ÿง  Memory Check    โ”‚
                      โ”‚  (Memory-First)     โ”‚
                      โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                 โ”‚
                โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                โ”‚                                 โ”‚
                โ–ผ Found                           โ–ผ Not Found
    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”              โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
    โ”‚  Return Cached     โ”‚              โ”‚  Document Search    โ”‚
    โ”‚  Answer (Instant)  โ”‚              โ”‚  Required           โ”‚
    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜              โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                                   โ”‚
                                     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                                     โ”‚                           โ”‚
                                     โ–ผ MAX_PERFORMANCE=true      โ–ผ Standard Mode
                        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                        โ”‚  ๐Ÿš€ Run All Tools        โ”‚   โ”‚  ๐Ÿค– Agent Selects     โ”‚
                        โ”‚  in Parallel:            โ”‚   โ”‚  Best Tool(s):       โ”‚
                        โ”‚  โ€ข Vector Search         โ”‚   โ”‚                      โ”‚
                        โ”‚  โ€ข Graph Search          โ”‚   โ”‚  Decision Logic:     โ”‚
                        โ”‚  โ€ข Filter Search         โ”‚   โ”‚                      โ”‚
                        โ”‚  Then synthesize results โ”‚   โ”‚  โœ๏ธ  "Create/Delete" โ”‚
                        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚     โ†’ graph_update   โ”‚
                                                       โ”‚                      โ”‚
                                                       โ”‚  ๐Ÿ•ธ๏ธ  "Who is X?"     โ”‚
                                                       โ”‚     "How X relates Y"โ”‚
                                                       โ”‚     โ†’ graph_search   โ”‚
                                                       โ”‚                      โ”‚
                                                       โ”‚  ๐Ÿ“š  "What is X?"    โ”‚
                                                       โ”‚     "Explain..."     โ”‚
                                                       โ”‚     โ†’ vector_search  โ”‚
                                                       โ”‚                      โ”‚
                                                       โ”‚  ๐Ÿ”  "Docs from 2023"โ”‚
                                                       โ”‚     โ†’ filter_search  โ”‚
                                                       โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                                                โ”‚
                                                                โ–ผ
                                                       โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                                                       โ”‚  Synthesize Results  โ”‚
                                                       โ”‚  Store in Memory     โ”‚
                                                       โ”‚  Stream to Frontend  โ”‚
                                                       โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Data Flow: Document Upload & Processing

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Upload PDF  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
       โ”‚
       โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Backend: Document Processing                               โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  1. Extract Text (PyMuPDF/Docling)                          โ”‚
โ”‚  2. Chunk Text (1200 chars, 500 overlap)                    โ”‚
โ”‚     โ†“                                                       โ”‚
โ”‚  3. Generate Embeddings (OpenAI text-embedding-3-large)     โ”‚
โ”‚     โ†“                                                       โ”‚
โ”‚  4. GraphRAG 3-Pass Enrichment                              โ”‚
โ”‚     โ€ข Pass 1: Extract entities/relationships                โ”‚
โ”‚     โ€ข Pass 2: Find referenced entities                      โ”‚
โ”‚     โ€ข Pass 3: Discover indirect connections                 โ”‚
โ””โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”˜
     โ”‚                โ”‚                  โ”‚              โ”‚
     โ–ผ                โ–ผ                  โ–ผ              โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚PostgreSQLโ”‚  โ”‚   PGVector   โ”‚  โ”‚    Neo4j    โ”‚  โ”‚Elasticsearch โ”‚
โ”‚          โ”‚  โ”‚              โ”‚  โ”‚             โ”‚  โ”‚              โ”‚
โ”‚โ€ข Metadataโ”‚  โ”‚โ€ข Embeddings  โ”‚  โ”‚โ€ข Entities   โ”‚  โ”‚โ€ข Text Index  โ”‚
โ”‚โ€ข Filenameโ”‚  โ”‚โ€ข Chunks      โ”‚  โ”‚โ€ข Relations  โ”‚  โ”‚โ€ข Metadata    โ”‚
โ”‚โ€ข Status  โ”‚  โ”‚โ€ข Vectors     โ”‚  โ”‚โ€ข Properties โ”‚  โ”‚โ€ข Highlights  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Data Flow: Query Processing

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  User Query  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
       โ”‚
       โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Step 1: Memory Check (PostgreSQL)                         โ”‚
โ”‚  โ€ข Search conversation history                             โ”‚
โ”‚  โ€ข Semantic keyword matching                               โ”‚
โ”‚  โ€ข If found โ†’ Return cached answer (FAST PATH)             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
         โ”‚ Not in memory
         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Step 2: Tool Execution                                    โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿ“š Vector Search (PGVector + BM25)                        โ”‚
โ”‚  โ€ข Query embedding โ†’ Similarity search                     โ”‚
โ”‚  โ€ข Retrieve top-kร—5 chunks                                 โ”‚
โ”‚  โ€ข Rerank with cross-encoder โ†’ top-kร—3                     โ”‚
โ”‚  โ€ข Expand context (ยฑ2 adjacent chunks)                     โ”‚
โ”‚                                                            โ”‚
โ”‚  ๐Ÿ•ธ๏ธ  Graph Search (Neo4j)                                  โ”‚
โ”‚  โ€ข Entity extraction from query                            โ”‚
โ”‚  โ€ข 1-hop traversal (direct connections)                    โ”‚
โ”‚  โ€ข 2-hop traversal (indirect connections)                  โ”‚
โ”‚  โ€ข Return entity network with relationships                โ”‚
โ”‚                                                            โ”‚
โ”‚  ๐Ÿ” Filter Search (Elasticsearch)                          โ”‚
โ”‚  โ€ข Extract filters (date, author, category)                โ”‚
โ”‚  โ€ข Metadata-based search                                   โ”‚
โ”‚  โ€ข Return matching documents with highlights               โ”‚
โ”‚                                                            โ”‚
โ”‚  โœ๏ธ  Graph Update (Neo4j)                                  โ”‚
โ”‚  โ€ข Parse update command                                    โ”‚
โ”‚  โ€ข Execute CRUD operations on graph                        โ”‚
โ”‚  โ€ข Return success/failure status                           โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
         โ”‚
         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Step 3: LLM Synthesis                                     โ”‚
โ”‚  โ€ข Combine results from tools                              โ”‚
โ”‚  โ€ข Generate comprehensive answer                           โ”‚
โ”‚  โ€ข Format with proper markdown                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
         โ”‚
         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Step 4: Memory Storage (PostgreSQL)                       โ”‚
โ”‚  โ€ข Store query + response                                  โ”‚
โ”‚  โ€ข Track token usage                                       โ”‚
โ”‚  โ€ข Update memory state                                     โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
         โ”‚
         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Step 5: Stream to Frontend                                โ”‚
โ”‚  โ€ข SSE events (thinking, tool_start, tool_end)             โ”‚
โ”‚  โ€ข Final answer with formatting                            โ”‚
โ”‚  โ€ข Memory state + token usage                              โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

โš™๏ธ Configuration

# API & Credentials
OPENAI_API_KEY=sk-proj-your-key    # Optional: Can be provided by users via UI
POSTGRES_PASSWORD=your_password
NEO4J_AUTH=neo4j/your_password

# Performance Features
MAX_PERFORMANCE=false              # Run all tools in parallel
GRAPHRAG_ENABLE_MULTIPASS=true     # 3-pass enrichment

# Optimal RAG Settings
CHUNK_SIZE=1200
CHUNK_OVERLAP=500
TOP_K_RESULTS=20

User API Key Setup

Users can provide their own OpenAI API key through the UI:

  1. Click Settings Icon (โš™๏ธ) in the top-right corner
  2. Enter Your OpenAI API Key in the modal
  3. Save - Key is encrypted and stored in browser localStorage
  4. Use the System - All API calls use your key automatically

Security Features:

  • ๐Ÿ” AES Encryption with browser fingerprint-based key derivation
  • ๐Ÿ  Client-Side Storage - Keys never leave your browser
  • ๐Ÿ”’ Zero Server Persistence - Backend receives keys via headers only
  • ๐Ÿ”„ Easy Management - Clear/update key anytime via Settings

๐Ÿ“ API Examples

With User-Provided API Key

# Upload document with your API key
curl -X POST http://localhost:8000/api/rag/upload \
  -H "X-OpenAI-API-Key: sk-proj-your-key-here" \
  -F "file=@document.pdf"

# Query with your API key
curl -X POST http://localhost:8000/api/rag/query/stream \
  -H "Content-Type: application/json" \
  -H "X-OpenAI-API-Key: sk-proj-your-key-here" \
  -d '{"query": "What is this about?", "document_id": "your-id"}'

# Update graph with your API key
curl -X POST http://localhost:8000/api/rag/query/stream \
  -H "Content-Type: application/json" \
  -H "X-OpenAI-API-Key: sk-proj-your-key-here" \
  -d '{"query": "Create AI node and connect to Python, ML", "document_id": "your-id"}'

Without User API Key (uses system default)

# Upload document (uses env OPENAI_API_KEY)
curl -X POST http://localhost:8000/api/rag/upload \
  -F "file=@document.pdf"

# Query without custom key
curl -X POST http://localhost:8000/api/rag/query/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "What is this about?", "document_id": "your-id"}'

# Clear memory
curl -X DELETE http://localhost:8000/api/memory/clear

Note: The X-OpenAI-API-Key header is optional. If not provided, the system falls back to the OPENAI_API_KEY from environment variables.

Full API documentation: http://localhost:8000/docs


๐Ÿงช Testing

# Run unit tests (27% coverage)
docker exec lyzr-hackathon-backend-1 pytest test/unit_tests/ -v

# Run RAGAS evaluation
docker exec lyzr-hackathon-backend-1 pytest test/integration_tests/ -v

See test/README.md for detailed testing documentation.


๐Ÿ“‚ Project Structure

LYZR-Hackathon/
โ”œโ”€โ”€ backend/          # FastAPI + Search Agent + Memory + GraphRAG
โ”œโ”€โ”€ frontend/         # React + Dark Mode + Graph Visualization
โ”œโ”€โ”€ memory/           # Git submodule: Long-term memory system (Lyzr)
โ”‚                     # Source: https://github.com/dev-pratap-singh/memory
โ”œโ”€โ”€ test/             # Unit tests + RAGAS evaluation
โ”œโ”€โ”€ docker-compose.yml
โ””โ”€โ”€ .env-example

Memory Submodule: The memory/ directory is a git submodule containing the Lyzr long-term memory implementation. It provides:

  • Conversation tracking and history management
  • Memory facts and user preferences storage
  • Training history for model fine-tuning
  • Vector-based semantic search for memory retrieval

"Graph traversal timeout":

  • Multi-hop traversal can be slow on very large graphs
  • Check Neo4j performance
  • Consider limiting 2-hop traversal depth

๐Ÿ”ฎ Future Enhancements

  • SLM for Graph Creation: Use Gemma-3-8B to reduce costs
  • Microsoft GraphRAG: Full hierarchical clustering implementation
  • Visual Image RAG: Late interaction models for image retrieval
  • Embedding-based Memory: True semantic search vs keyword matching
  • Multi-Document Evolution: Stress test with 100+ documents

๐Ÿ“œ Version History

v2.0.0 (Current) - October 15, 2025

Major Features:

  • ๐Ÿ”‘ Bring Your Own API Key - User-provided OpenAI keys with AES encryption
  • ๐ŸŒ™ Complete Dark Mode - Beautiful theme with localStorage persistence
  • ๐Ÿง  Smart Memory System - Memory-first strategy with instant cached responses
  • ๐ŸŽจ Enhanced UI - Pure white text in dark mode, improved markdown rendering
  • ๐Ÿ”’ Zero-Trust Security - API keys encrypted at rest, never stored on server

v1.2.0 - October 13, 2025

Natural language graph updates โ€ข 9 operations โ€ข Batch connections โ€ข Real-time refresh

v1.1.0 - October 13, 2025

Multi-hop traversal โ€ข 3-pass enrichment โ€ข MAX_PERFORMANCE mode โ€ข F1 Score 94.32%

v1.0.0 - October 12, 2025

Initial release โ€ข Multi-tool agent โ€ข Hybrid search โ€ข RAGAS evaluation


๐Ÿ‘ค Author

Dev Pratap Singh โ€ข Senior AI Engineer โ€ข IIT Goa

LinkedIn


๐ŸŽฏ Acknowledgments

Special thanks to the team for organizing this hackathon. If I don't win, I'd love to meet the team in Bangalore for coffee! โœŒ๏ธ


Last Updated: October 15, 2025 โ€ข Status: โœ… Production Ready โ€ข Version: 2.0.0

About

LYZR-Hackathon

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors