Intelligent query-routing agent for multi-document RAG retrieval systems

abhishekgit03/Multi-Document-RAG-Agent

Multi-Document RAG Agent Architecture

The Multi-Document RAG Agent is an intelligent query routing system built to retrieve and process information from large sets of documents efficiently. It uses a Retrieval-Augmented Generation (RAG) framework to select the most relevant documents and passages, significantly improving query response accuracy and efficiency. By generating concise summaries for each document, the system creates a robust mapping mechanism that aligns user queries with the most relevant documents. These summaries encapsulate the core content of documents, streamlining the retrieval process and enabling precise targeting of relevant data.

Diagram 1

This diagram illustrates the passage and summary embedding creation process.

Features

  • Smart Document Selection: Utilizes vector similarity search combined with document summaries to effectively map queries to the most relevant documents.
  • Granular Retrieval: Breaks documents into passages for fine-tuned retrieval, further narrowing down the search space.
  • Adaptive Query Processing: Differentiates between general and document-specific queries for optimal handling.
  • Efficient Token Usage: Leverages document summaries to minimize unnecessary resource consumption, focusing only on high-relevance data.

Technologies Used

  • LLM: Google Gemini
  • Framework: LangChain
  • Vector Database: Pinecone
  • Language: Python

How It Works

Diagram 2

This diagram illustrates the document selection and retrieval process.

Data Preparation

  1. Document Summarization:

    • Summarize documents using LangChain.
    • Generate embeddings for summaries and store them in a Pinecone namespace summary_embeddings with a unique document ID.
  2. Passage Embedding:

    • Split documents into 1000-word chunks.
    • Convert these passages into embeddings and store them in the passages_embeddings namespace, linked with document IDs.
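The data-preparation steps above can be sketched in runnable form. Note that this is a simplified illustration, not the repository's actual code: a toy bag-of-words embedding and plain dictionaries stand in for the Gemini embeddings and the Pinecone summary_embeddings / passages_embeddings namespaces, and the function names (chunk_document, embed, prepare) are hypothetical.

```python
# Sketch of the data-preparation stage (assumptions: toy embedding,
# in-memory dicts standing in for Pinecone namespaces).
from collections import Counter

CHUNK_WORDS = 1000  # passage size used by the project


def chunk_document(text, size=CHUNK_WORDS):
    """Split a document into fixed-size word chunks (passages)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Stand-in embedding: lower-cased bag-of-words term counts."""
    return Counter(text.lower().split())


def prepare(documents, summarize):
    """Build both 'namespaces': summary and passage embeddings keyed by doc ID.

    `documents` maps document ID -> full text; `summarize` is any callable
    (in the real system, a LangChain summarization chain over Gemini).
    """
    summary_embeddings = {}
    passage_embeddings = {}
    for doc_id, text in documents.items():
        summary_embeddings[doc_id] = embed(summarize(text))
        passage_embeddings[doc_id] = [embed(p) for p in chunk_document(text)]
    return summary_embeddings, passage_embeddings
```

Keeping summaries and passages in separate namespaces, both keyed by document ID, is what lets query time first match on the small summary index and only then search passages from the selected documents.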

Query Processing

  1. Query Classification:

    • Use a one-shot classifier to determine if a query is general or requires document retrieval.
  2. Summary Matching:

    • For document queries, generate embeddings and perform similarity searches on summary_embeddings.
    • Retrieve the top 3 document summaries.
  3. Document Selection:

    • Use the retrieved summaries to identify relevant document IDs via a document_selection_prompt.
  4. Passage Matching:

    • Extract the top 10 relevant passages from the selected documents using similarity search on the user query and passage embeddings.
  5. Final Response Generation:

    • Use the retrieved passages and the query to generate the final response.
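The query-time flow (summary matching, document selection, passage matching) can be sketched as one routing function. Again this is an illustrative stand-in, not the repository's code: cosine similarity over toy bag-of-words vectors replaces Pinecone similarity search, and a simple top-k cutoff replaces the LLM-driven document_selection_prompt; all names here are hypothetical.

```python
# Sketch of query-time routing: summary match -> document selection ->
# passage match. Assumes the toy bag-of-words embedding from data prep.
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: lower-cased bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def route_query(query, summary_embeddings, passage_store,
                k_docs=3, k_passages=10):
    """Return the top passages for a document-specific query.

    `summary_embeddings`: doc ID -> summary embedding.
    `passage_store`: doc ID -> list of (passage_text, embedding) pairs.
    """
    q = embed(query)
    # Summary matching: rank documents by summary similarity, keep top k_docs.
    ranked = sorted(summary_embeddings,
                    key=lambda d: cosine(q, summary_embeddings[d]),
                    reverse=True)
    selected = ranked[:k_docs]  # LLM selection prompt in the real system
    # Passage matching: score passages only within the selected documents.
    scored = [(cosine(q, emb), doc_id, text)
              for doc_id in selected
              for text, emb in passage_store[doc_id]]
    scored.sort(reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored[:k_passages]]
```

The returned passages would then be stuffed into the final generation prompt alongside the original query. Restricting the passage search to the selected documents is what keeps token usage low relative to searching every passage in the corpus.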

Key Advantages

  1. Efficiency: Reduces unnecessary token consumption by narrowing down search results.
  2. Flexibility: Handles multi-document queries and retrieves answers spanning multiple documents.
  3. Scalability: Optimized for large document sets with vector search.

Limitations

  • Edge Cases: Struggles with large-scale queries requiring data from many documents (e.g., insights from 40+ documents simultaneously).
  • Potential Information Loss: When multiple documents are retrieved, combining passages may omit some details.

Demo Video: Watch

Note: This project was developed for an assignment provided by RaccoonAI, a Bangalore-based startup, for an SDE (AI) role.
