Welcome to the Simple RAG System repository! This project provides two tools to build and interact with a Retrieval-Augmented Generation (RAG) system for PDF documents using LlamaIndex. Explore and query your PDFs with ease using either a Jupyter Notebook for development or a user-friendly Streamlit app for real-time interaction.
- ✅ Upload and Parse PDFs: Process any PDF document for indexing and querying.
- ✅ Ollama Embeddings: Leverage powerful embeddings for accurate content retrieval.
- ✅ Natural Language Queries: Ask questions in plain language and get AI-generated answers.
- ✅ Live Indexing Status: Streamlit app displays real-time indexing progress.
- ✅ LlamaIndex-Powered: Utilizes LlamaIndex for efficient vector storage and retrieval.
To get started, install the required dependencies using pip:
pip install streamlit llama-index llama-index-llms-ollama llama-index-embeddings-ollama
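With the dependencies installed, the core pipeline can be sketched in a few lines of Python. This is a minimal illustration rather than the repository's exact code; it assumes an Ollama server is running locally and that models such as `llama3` and `nomic-embed-text` have already been pulled (substitute whichever models and PDF path you use):

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Point LlamaIndex at local Ollama models. The model names below are
# example assumptions -- use whatever you have pulled via `ollama pull`.
Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Load a PDF, build a vector index over its chunks, and ask a question.
documents = SimpleDirectoryReader(input_files=["your-document.pdf"]).load_data()
index = VectorStoreIndex.from_documents(documents)
response = index.as_query_engine().query("What is this document about?")
print(response)
```

Both the notebook and the Streamlit app below follow this same load → index → query flow.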
This repository includes two tools for interacting with the RAG system:
The Jupyter notebook (`simple-rag.ipynb`) is ideal for development, experimentation, and step-by-step exploration.
- Launch the notebook: `jupyter notebook simple-rag.ipynb`
- Update the PDF file path in the notebook to point to your desired document.
- Run the cells to index the PDF and interact with the RAG system via queries.
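Under the hood, indexing and querying come down to embedding text chunks and retrieving the ones closest to the query. The toy sketch below illustrates that retrieval step with a bag-of-words stand-in for real embeddings; in this project, LlamaIndex and Ollama handle the actual embedding and answer generation:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a term-frequency vector over lowercase words.
    # A real RAG system would call an embedding model here instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    # Rank chunks by similarity to the query and return the best matches.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

chunks = [
    "LlamaIndex builds vector indexes over document chunks.",
    "Streamlit renders interactive web apps from Python scripts.",
    "Ollama serves local large language models.",
]
print(retrieve("Which tool serves local language models?", chunks))
```

The retrieved chunks are then passed to the LLM as context so the answer is grounded in the document rather than the model's memory alone.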
The Streamlit app (`simple-rag-streamlit.py`) provides a user-friendly interface for real-time PDF processing and querying.
- Run the Streamlit app: `streamlit run simple-rag-streamlit.py`
- Upload a PDF file through the web interface.
- Monitor the indexing progress in real time.
- Enter a query in natural language and receive an AI-generated response.
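The steps above map onto a short app skeleton. This is an illustrative sketch, not the repository's actual `simple-rag-streamlit.py`; the widget labels and the elided indexing/query steps are placeholders:

```python
import streamlit as st

st.title("Simple RAG")

# Step 1: upload a PDF through the web interface.
uploaded = st.file_uploader("Upload a PDF", type="pdf")
if uploaded is not None:
    # Step 2: show live indexing status while the document is processed.
    with st.status("Indexing PDF...", expanded=True) as status:
        # ... parse the PDF and build the LlamaIndex vector index here ...
        status.update(label="Indexing complete", state="complete")

    # Step 3: accept a natural-language query and show the answer.
    query = st.text_input("Ask a question about the document")
    if query:
        # ... run the query through the index's query engine ...
        st.write("AI-generated answer appears here.")
```

Streamlit reruns the script on each interaction, which is why the upload and query widgets are simply checked top to bottom on every pass.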
This project is built using the following technologies:
- LlamaIndex: For vector storage and retrieval.
- Streamlit: For the interactive web interface.
- Ollama: For the LLM and embedding models.