A Flask-based web application for neutral document retrieval-augmented generation (RAG), built on Pathway's document store.
- User authentication and registration
- Secure document upload and storage
- Document isolation by user
- Interactive Q&A based on user's documents
- API endpoints for integration with other systems
graph TD
User[User/Client] --> |Interacts with| WebUI[Web Interface]
WebUI --> |Authenticates via| Auth[Authentication System]
WebUI --> |Uploads| Docs[Document Manager]
WebUI --> |Queries| QA[Q&A Engine]
Docs --> |Stores| DocStore[(Document Storage)]
Docs --> |Indexes| IndexSys[Indexing System]
IndexSys --> |Uses| Pathway[Pathway RAG]
QA --> |Retrieves from| Pathway
Pathway --> |Reads| DocStore
Auth --> |Validates against| UserDB[(User Database)]
sequenceDiagram
participant U as User
participant A as Flask App
participant Auth as Authentication
participant DM as Document Manager
participant R as RAG Engine
participant DB as Database
U->>A: Login Request
A->>Auth: Validate Credentials
Auth->>DB: Query User
DB-->>Auth: Return User Data
Auth-->>A: Authentication Result
A-->>U: Session Token
U->>A: Upload Document
A->>Auth: Verify Session
Auth-->>A: Session Valid
A->>DM: Store Document
DM->>DB: Record Metadata
DM->>R: Index Document
A-->>U: Upload Confirmation
U->>A: Query Documents
A->>Auth: Verify Session
Auth-->>A: Session Valid
A->>R: Process Query with User Context
R->>DB: Get User Document Paths
DB-->>R: User's Document Paths
R-->>A: Generated Answer
A-->>U: Display Answer
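The query leg of the sequence above (look up the user's document paths, then hand only those paths to the RAG engine) can be sketched as follows. This is a minimal illustration, not the project's actual code: the `documents` table layout, the function names, and the `fake_rag_query` stand-in for the Pathway call are all assumptions.

```python
import sqlite3
from typing import Callable, List


def get_user_document_paths(conn: sqlite3.Connection, user_id: int) -> List[str]:
    """Return only the paths recorded for this user (hypothetical schema)."""
    rows = conn.execute(
        "SELECT path FROM documents WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [row[0] for row in rows]


def answer_query(
    conn: sqlite3.Connection,
    user_id: int,
    question: str,
    rag_query: Callable[[str, List[str]], str],
) -> str:
    """Restrict retrieval to the caller's documents before querying."""
    paths = get_user_document_paths(conn, user_id)
    if not paths:
        return "No documents uploaded yet."
    return rag_query(question, paths)


def fake_rag_query(question: str, paths: List[str]) -> str:
    """Stand-in for the real RAG call; it only reports what it was given."""
    return f"Answer to {question!r} from {len(paths)} document(s)"


def build_demo_db() -> sqlite3.Connection:
    """In-memory database with two users, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, path TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO documents (user_id, path) VALUES (?, ?)",
        [(1, "data/1/report.pdf"), (1, "data/1/notes.txt"), (2, "data/2/secret.pdf")],
    )
    return conn
```

Passing the retriever in as a callable keeps the isolation logic testable without a running Pathway pipeline.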
classDiagram
class FlaskApp {
+run()
+handle_routes()
+serve_static_files()
}
class AuthSystem {
+login()
+register()
+logout()
+hash_password()
+check_password()
}
class DocumentManager {
+upload_document()
+list_documents()
+get_document_paths()
}
class QueryEngine {
+process_query()
+format_response()
}
class DatabaseManager {
+init_db()
+query_db()
+insert_record()
}
class PathwayRAG {
+configure()
+query()
+index_documents()
}
FlaskApp --> AuthSystem
FlaskApp --> DocumentManager
FlaskApp --> QueryEngine
DocumentManager --> DatabaseManager
QueryEngine --> PathwayRAG
AuthSystem --> DatabaseManager
PathwayRAG --> DocumentManager
- Clone the repository:

      git clone <repository-url>
      cd neutral_doc_rag

- Create and activate a virtual environment:

      python3 -m venv .venv
      source .venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Create a .env file based on the provided .env.example:

      cp .env.example .env
      # Edit .env with your secret settings

- Create necessary directories:

      mkdir -p instance data/
      mkdir -p static/js

- Initialize the database:

      flask init-db

- Run the application:

      flask run --host=0.0.0.0 --port=8000
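For orientation, here is a rough sketch of what a database initialization step such as `flask init-db` might do with SQLite. The table and column names and the default `instance/app.db` path are assumptions for illustration; the project's real schema may differ.

```python
import sqlite3

# Hypothetical schema: one table for accounts, one for per-user document metadata.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS documents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL REFERENCES users(id),
    filename TEXT NOT NULL,
    path TEXT NOT NULL
);
"""


def init_db(db_path: str = "instance/app.db") -> sqlite3.Connection:
    """Create the tables if they do not exist yet and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    conn.commit()
    return conn
```

Because every statement uses `IF NOT EXISTS`, running the initialization a second time leaves existing data untouched.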
- Clone the repository and navigate to the project directory:

      git clone <repository-url>
      cd neutral_doc_rag

- Create and activate a virtual environment:

      python3 -m venv .venv
      source .venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Configure environment variables. Copy the example file:

      cp .env.example .env

  Then edit .env to set your secret values (do not commit your API key to the repository).

- Run the application:

      python app.py
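One way the application could read the values from .env at startup is sketched below. The variable names (`SECRET_KEY`, `DATABASE_PATH`, `UPLOAD_DIR`) and their defaults are hypothetical; check .env.example for the names this project actually uses. The sketch also assumes the variables are already in the process environment. The `flask` command loads .env automatically only when python-dotenv is installed, and `python app.py` does not load it unless the app does so itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    secret_key: str
    database_path: str
    upload_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from the environment, failing fast on a missing secret."""
    secret_key = env.get("SECRET_KEY", "")  # hypothetical variable name
    if not secret_key:
        raise RuntimeError(
            "SECRET_KEY is not set; copy .env.example to .env and fill it in"
        )
    return Settings(
        secret_key=secret_key,
        database_path=env.get("DATABASE_PATH", "instance/app.db"),  # assumed default
        upload_dir=env.get("UPLOAD_DIR", "data"),  # assumed default
    )
```

Failing at startup when the secret is missing is usually preferable to silently falling back to a hard-coded key.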
    POST /login
    Content-Type: application/x-www-form-urlencoded

    username=user&password=pass

    POST /register
    Content-Type: application/x-www-form-urlencoded

    username=newuser&password=newpass

    POST /upload
    Content-Type: multipart/form-data

    <file-field>=@<path-to-document>

    POST /api/query
    Content-Type: application/json

    {
      "question": "What does my document cover?"
    }

Response:

    {
      "answer": "Based on your documents, your document covers..."
    }
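As a rough client-side illustration of the `/api/query` exchange above, the sketch below builds the JSON request and parses the response using only the standard library. The helper names are made up for this example, and it assumes the session obtained at login is sent as a cookie; the exact cookie name depends on the app's configuration.

```python
import json
import urllib.request


def build_query_request(
    base_url: str, question: str, session_cookie: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/query request with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/query",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Cookie": session_cookie,  # assumption: session is cookie-based
        },
        method="POST",
    )


def parse_answer(raw: bytes) -> str:
    """Extract the "answer" field from the JSON response body."""
    return json.loads(raw.decode("utf-8"))["answer"]
```

To send the request against a running instance, pass it to `urllib.request.urlopen(req)` and feed the bytes from `.read()` to `parse_answer`.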
- Backend: Flask
- Database: SQLite
- Authentication: Bcrypt
- RAG System: Pathway
- Frontend: HTML, Tailwind CSS, JavaScript
- User passwords are securely hashed using bcrypt
- Document isolation ensures users can only access their own documents
- Environment variables for sensitive configuration
- CORS protection for API endpoints
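Document isolation at the storage level could look something like the sketch below, which gives each user a separate directory and rejects filenames that would escape it. The `data/<user_id>/` layout and the function name are assumptions for illustration, not a description of how this project stores files.

```python
from pathlib import Path


def user_upload_path(base_dir: str, user_id: int, filename: str) -> Path:
    """Return a storage path for an upload that stays inside the user's directory."""
    user_dir = (Path(base_dir) / str(user_id)).resolve()

    # Drop any directory components the client supplied (e.g. "../../etc/passwd").
    safe_name = Path(filename).name
    if safe_name in ("", ".", ".."):
        raise ValueError("invalid filename")

    target = (user_dir / safe_name).resolve()
    # Defensive check: the resolved path must still live under the user's directory.
    if user_dir not in target.parents:
        raise ValueError("path escapes the user's directory")
    return target
```

Combined with a per-user filter on the metadata table, this keeps both the files and their index entries scoped to the owning account.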
This project is licensed under the MIT License - see the LICENSE file for details.
Check out ROADMAP.md for upcoming features and development plans.
Check out CONTRIBUTING.md for contribution guidelines.