A powerful Retrieval-Augmented Generation (RAG) system for PDF documents using LlamaParse, LangChain, and Groq. This API allows you to upload PDF files, process them into a searchable knowledge base, and query them using natural language.
- **Multi-PDF Processing**: Upload and process multiple PDF files simultaneously
- **Intelligent Document Parsing**: Uses LlamaParse for high-quality PDF text extraction
- **Smart Chunking**: Automatically splits documents into optimal chunks for retrieval
- **Semantic Search**: Vector-based similarity search using HuggingFace embeddings
- **Natural Language Queries**: Ask questions in plain English and get relevant answers
- **Source Attribution**: Get references to the source documents for each answer
- **Persistent Storage**: Vector database persists between sessions using ChromaDB
- **Fast Inference**: Powered by Groq for quick response times
```
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  PDF Upload   │ ──▶ │  LlamaParse   │ ──▶ │ Text Chunking │
└───────────────┘     └───────────────┘     └───────────────┘
                                                    │
                                                    ▼
┌───────────────┐     ┌───────────────┐     ┌─────────────────┐
│ User Queries  │ ◀── │   Groq LLM    │ ◀── │ Vector Database │
└───────────────┘     └───────────────┘     │   (ChromaDB)    │
                                            └─────────────────┘
                                                    ▲
                                                    │
                                            ┌───────────────┐
                                            │  HuggingFace  │
                                            │  Embeddings   │
                                            └───────────────┘
```
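The retrieval path in the diagram — embed the chunks, rank them against an embedded query, and hand the best matches to the LLM — can be sketched with a deliberately tiny stand-in. The bag-of-words `embed` and the `retrieve` helper below are illustrative only; the actual system uses HuggingFace sentence-transformer embeddings stored in ChromaDB:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for a real
    sentence-transformers vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    qv = embed(query)
    return sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)[:k]
```

In the real pipeline, ChromaDB performs this nearest-neighbor step over persisted vectors, and the top-k chunks become the context passed to the Groq LLM.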
- Python 3.8+
- LlamaCloud API Key (from LlamaIndex)
- Groq API Key (from Groq)
1. **Clone the repository**
   ```bash
   git clone https://github.com/anujgawde/rag-engine.git
   cd rag-engine
   ```
2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```
3. **Set up environment variables**: create a `.env` file in the root directory:
   ```
   LLAMA_CLOUD_API_KEY=your_llama_cloud_api_key
   GROQ_API_KEY=your_groq_api_key
   EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
   PERSIST_DIR=./chroma_db
   ```
4. **Run the application**
   ```bash
   python main.py
   ```
The API will be available at http://localhost:8000
Upload and process PDF files into the knowledge base.
Endpoint: POST /upload-pdfs
Request:

```bash
curl -X POST "http://localhost:8000/upload-pdfs" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "[email protected]" \
  -F "[email protected]"
```

Response:

```json
{
  "status": "success",
  "message": "Successfully ingested 2 PDF files",
  "files_processed": 2,
  "file_names": ["document1.pdf", "document2.pdf"]
}
```

Ask questions about your uploaded documents.
Endpoint: POST /query
Request:
curl -X POST "http://localhost:8000/query" \
-H "accept: application/json" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "question=What are the main topics discussed in the documents?"Response:
{
"status": "success",
"message": "Query processed successfully",
"query": "What are the main topics discussed in the documents?",
"answer": "Based on the documents, the main topics include...",
"sources": [
{
"content": "Excerpt from the relevant document...",
"metadata": {
"source": "document1.pdf",
"page": 1,
"file_name": "document1.pdf"
}
}
]
}Once the server is running, you can access the interactive API documentation at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
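The same `/query` call can be issued from Python with nothing but the standard library. This is a minimal sketch — the `build_query_request` and `ask` helpers are illustrative, not part of the API:

```python
import json
import urllib.parse
import urllib.request

def build_query_request(question: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the form-encoded POST request for the /query endpoint."""
    data = urllib.parse.urlencode({"question": question}).encode()
    return urllib.request.Request(
        f"{base_url}/query",
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

def ask(question: str, base_url: str = "http://localhost:8000") -> dict:
    """Send the query to a running server and decode the JSON response."""
    with urllib.request.urlopen(build_query_request(question, base_url)) as resp:
        return json.loads(resp.read())
```

With the server running locally, `ask("What are the main topics discussed in the documents?")` returns the same JSON shown in the curl example above.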
| Variable | Description | Default |
|---|---|---|
| `LLAMA_CLOUD_API_KEY` | API key for LlamaParse service | Required |
| `GROQ_API_KEY` | API key for Groq LLM service | Required |
| `EMBEDDING_MODEL` | HuggingFace embedding model | `sentence-transformers/all-MiniLM-L6-v2` |
| `PERSIST_DIR` | Directory to store the vector database | `./chroma_db` |
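One way the table above could be enforced at startup is a small loader that fails fast on the required keys and applies the documented defaults. `load_settings` is a hypothetical helper, not code from this repository:

```python
import os

REQUIRED_KEYS = ("LLAMA_CLOUD_API_KEY", "GROQ_API_KEY")

def load_settings(env=os.environ) -> dict:
    """Read configuration from the environment, raising early if a
    required key is absent and filling in the documented defaults."""
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {
        "llama_cloud_api_key": env["LLAMA_CLOUD_API_KEY"],
        "groq_api_key": env["GROQ_API_KEY"],
        "embedding_model": env.get("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
        "persist_dir": env.get("PERSIST_DIR", "./chroma_db"),
    }
```

Failing at startup, rather than on the first request, makes a missing API key obvious immediately.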
You can customize various parameters in the `PDFRAG` class:
- **Chunk Size**: Modify `chunk_size` in `RecursiveCharacterTextSplitter`
- **Chunk Overlap**: Adjust `chunk_overlap` for better context preservation
- **Retrieval Count**: Change the `k` parameter in the retriever setup
- **LLM Model**: Switch between different Groq models
- **Temperature**: Adjust creativity vs. consistency in responses
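The interplay of `chunk_size` and `chunk_overlap` is easiest to see on a plain character-based sliding window. This toy `chunk_text` mimics the shape of what `RecursiveCharacterTextSplitter` produces, without its recursive separator logic:

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows, each sharing chunk_overlap
    characters with its predecessor so context survives the cut."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

Larger overlap preserves more context across chunk boundaries but produces more chunks (more embeddings to compute and store); larger chunks reduce the chunk count but make each retrieved passage less focused.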
```python
# Upload research papers
files = ["paper1.pdf", "paper2.pdf", "paper3.pdf"]
# Query: "What are the main findings across these research papers?"
```

```python
# Upload company manuals, policies, and procedures
files = ["employee_handbook.pdf", "safety_procedures.pdf", "it_policies.pdf"]
# Query: "What is the company's remote work policy?"
```

```python
# Upload contracts and legal documents
files = ["contract1.pdf", "terms_of_service.pdf", "privacy_policy.pdf"]
# Query: "What are the termination clauses in these contracts?"
```
- **"No documents have been ingested yet"**
  - Make sure you've uploaded PDF files using the `/upload-pdfs` endpoint
  - Check that the vector database directory exists and contains data
- **"Error parsing PDF"**
  - Ensure your PDF files are not corrupted
  - Check that your LlamaCloud API key is valid and has sufficient credits
- **Slow response times**
  - Consider using a smaller embedding model for faster processing
  - Reduce the chunk size or retrieval count for quicker responses
- **Memory issues**
  - For large documents, consider increasing the chunk size to reduce the total number of chunks
  - Monitor system memory usage during processing
- **API key errors**
  - Ensure your API keys are correctly set in the `.env` file
  - Check API key validity and quota limits
  - Verify network connectivity to external services
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
- LlamaIndex for excellent PDF parsing capabilities
- LangChain for the RAG framework
- Groq for fast LLM inference
- ChromaDB for vector storage
- HuggingFace for embedding models
If you encounter any issues or have questions, please:
- Check the troubleshooting section
- Search existing GitHub issues
- Create a new issue with detailed information about your problem
⭐ If you find this project helpful, please consider giving it a star!