Transform your documents into intelligent, queryable knowledge bases with AI-powered podcasts and story generation.
```mermaid
flowchart TB
    subgraph Client["🖥️ Client Layer"]
        UI[Next.js Frontend]
        Upload[Document Upload]
        Chat[Chat Interface]
        Podcast[Podcast Studio]
    end
    subgraph Processing["⚙️ Processing Layer"]
        PDF[PDF Pipeline]
        OCR[Tesseract.js OCR]
        Chunk[Text Chunking]
        Embed[Embedding Generation]
    end
    subgraph Storage["💾 Storage Layer"]
        Supabase[(Supabase)]
        Vector[(pgvector)]
        Auth[Auth]
    end
    subgraph AI["🤖 AI Layer"]
        Groq[Groq LLM]
        Cartesia[Cartesia TTS]
        GTE[GTE-Small Embeddings]
    end
    UI --> Upload
    UI --> Chat
    UI --> Podcast
    Upload --> PDF
    PDF --> OCR
    PDF --> Chunk
    Chunk --> Embed
    Embed --> GTE
    GTE --> Vector
    Chat --> Groq
    Groq --> Vector
    Podcast --> Cartesia
```
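As the diagram shows, uploaded documents are chunked and embedded with GTE-Small (384 dimensions) before landing in pgvector. The sketch below illustrates that step under some assumptions: it uses Transformers.js with the `Supabase/gte-small` model and a naive fixed-size chunker. The real logic lives in `lib/vectorize-pipeline.ts` and may differ in detail.

```ts
// Hypothetical sketch of the chunk -> embed -> store flow.
import { pipeline } from "@xenova/transformers";
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Naive fixed-size chunking with overlap (illustrative only).
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size - overlap) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

export async function vectorizeDocument(documentId: string, text: string) {
  // GTE-Small produces 384-dimensional embeddings, matching VECTOR(384) below.
  const embed = await pipeline("feature-extraction", "Supabase/gte-small");

  for (const content of chunkText(text)) {
    const output = await embed(content, { pooling: "mean", normalize: true });
    const embedding = Array.from(output.data as Float32Array);
    await supabase
      .from("chunks")
      .insert({ document_id: documentId, content, embedding });
  }
}
```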
| Category | Technologies |
|---|---|
| Frontend | Next.js 16.1.1, React 19.2.3, TypeScript 5, Tailwind CSS 4 |
| Backend | Next.js Server Actions, Supabase Edge Functions |
| Database | Supabase PostgreSQL, pgvector extension |
| AI/ML | Groq SDK (LLM), Cartesia TTS, GTE-Small Embeddings |
| Document Processing | pdf2json, pdfjs-dist, Mammoth.js, PapaParse, Tesseract.js |
| Protocols | MCP (Model Context Protocol), REST API |
| Authentication | Supabase Auth, Google OAuth 2.0 |
| Styling | Tailwind CSS, Lucide Icons, CVA |
- Node.js >= 18.0.0
- pnpm or npm package manager
- Supabase account (for database & auth)
- Groq API key (for LLM)
- Cartesia API key (for TTS; optional)
```bash
# Clone the repository
git clone https://github.com/yourusername/rag-sandbox.git
cd rag-sandbox

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env.local
```

Create a `.env.local` file with the following variables:

```bash
# Supabase Configuration
NEXT_PUBLIC_SUPABASE_URL=your_supabase_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key
SUPABASE_SERVICE_ROLE_KEY=your_service_role_key
# Groq API (for LLM)
GROQ_API_KEY=your_groq_api_key
# Cartesia API (for TTS)
CARTESIA_API_KEY=your_cartesia_api_key
# App URL
NEXT_PUBLIC_APP_URL=http://localhost:3000
```

Run the following SQL in your Supabase SQL editor:

```sql
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;
-- Documents table
CREATE TABLE documents (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
user_id UUID REFERENCES auth.users(id) NOT NULL DEFAULT auth.uid(),
name TEXT NOT NULL,
type TEXT NOT NULL,
url TEXT, -- Optional: for file storage path
metadata JSONB,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable RLS on documents
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
CREATE POLICY "Users can view own documents" ON documents
FOR SELECT TO authenticated USING (auth.uid() = user_id);
CREATE POLICY "Users can insert own documents" ON documents
FOR INSERT TO authenticated WITH CHECK (auth.uid() = user_id);
CREATE POLICY "Users can delete own documents" ON documents
FOR DELETE TO authenticated USING (auth.uid() = user_id);
-- Chunks table with vector embeddings
CREATE TABLE chunks (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
content TEXT,
metadata JSONB,
embedding VECTOR(384), -- Matches GTE-Small's 384-dimensional output
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable RLS on chunks
ALTER TABLE chunks ENABLE ROW LEVEL SECURITY;
CREATE POLICY "Users can view own chunks" ON chunks
FOR SELECT TO authenticated USING (
EXISTS (
SELECT 1 FROM documents
WHERE documents.id = chunks.document_id
AND documents.user_id = auth.uid()
)
);
-- Function to search for documents
CREATE OR REPLACE FUNCTION match_documents (
query_embedding VECTOR(384),
match_threshold FLOAT,
match_count INT
)
RETURNS TABLE (
id UUID,
content TEXT,
similarity FLOAT
)
LANGUAGE plpgsql
STABLE
AS $$
BEGIN
RETURN QUERY
SELECT
chunks.id,
chunks.content,
1 - (chunks.embedding <=> query_embedding) AS similarity
FROM chunks
JOIN documents ON documents.id = chunks.document_id
WHERE 1 - (chunks.embedding <=> query_embedding) > match_threshold
AND documents.user_id = auth.uid()
ORDER BY chunks.embedding <=> query_embedding
LIMIT match_count;
END;
$$;
```
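With `match_documents` in place, retrieval from the app is a single RPC call. A minimal supabase-js sketch, assuming a `supabase` client and a 384-dimensional `queryEmbedding` for the user's question are already in scope (threshold and count values are illustrative):

```ts
// Hypothetical query-side call to the match_documents function above.
const { data: matches, error } = await supabase.rpc("match_documents", {
  query_embedding: queryEmbedding, // 384-dim embedding of the question
  match_threshold: 0.78,           // illustrative similarity cutoff
  match_count: 5,                  // top-k chunks to return
});
if (error) throw error;
// `matches` rows carry { id, content, similarity } per the RETURNS TABLE above.
```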
```bash
# Development mode
npm run dev

# Production build
npm run build
npm run start
```

The application will be available at http://localhost:3000.
```
POST /api/upload
Content-Type: multipart/form-data
```

| Parameter | Type | Description |
|---|---|---|
| `file` | File | Document file (PDF, DOCX, CSV, JSON, TXT) |
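A browser-side call might look like the following sketch (the file input selector and the response shape are assumptions):

```ts
// Illustrative upload call; the "file" field name matches the table above.
const fileInput = document.querySelector<HTMLInputElement>('input[type="file"]')!;
const form = new FormData();
form.append("file", fileInput.files![0]);

const res = await fetch("/api/upload", { method: "POST", body: form });
if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
const uploaded = await res.json(); // response shape is an assumption
```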
```
POST /api/chat
Content-Type: application/json

{
  "message": "What are the key points in my document?",
  "history": []
}
```
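The chat endpoint can be exercised the same way; a hedged sketch (the response shape is an assumption, not a documented contract):

```ts
// Illustrative chat request mirroring the JSON body shown above.
const res = await fetch("/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    message: "What are the key points in my document?",
    history: [], // prior conversation turns, if any
  }),
});
const reply = await res.json(); // response shape is an assumption
```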
```
POST /api/podcast
Content-Type: application/json

{
  "documentIds": ["uuid-1", "uuid-2"],
  "mode": "discussion" // or "story"
}
```

```
rag-sandbox/
├── app/
│   ├── actions.ts              # Server actions (core logic)
│   ├── api/                    # API routes
│   ├── auth/                   # OAuth callback handlers
│   ├── login/                  # Login page
│   ├── sandbox/                # Main application
│   └── layout.tsx              # Root layout
├── components/
│   ├── ChatInterface.tsx       # RAG chat component
│   ├── ClientOCRProcessor.tsx  # Browser-based OCR
│   ├── GmailConnect.tsx        # Gmail integration
│   ├── PodcastPlayer.tsx       # Audio player
│   ├── PodcastStudio.tsx       # Podcast generation UI
│   └── landing/                # Landing page components
├── lib/
│   ├── cartesia.ts             # TTS integration
│   ├── gmail.ts                # Gmail API helpers
│   ├── mcp.ts                  # MCP protocol
│   ├── ocr-pipeline.ts         # OCR processing
│   ├── pdf-pipeline.ts         # PDF extraction
│   ├── podcast.ts              # Podcast generation
│   └── vectorize-pipeline.ts   # Chunking & embedding
├── supabase/
│   ├── functions/              # Edge functions
│   ├── schema.sql              # Database schema
│   └── rpc.sql                 # Stored procedures
└── public/                     # Static assets
```
- Navigate to the Sandbox page
- Click Upload Document or drag-and-drop files
- Supported formats: PDF, DOCX, CSV, JSON, TXT
- For scanned PDFs, OCR processing runs automatically
- After uploading, use the Chat Interface
- Ask questions about your documents
- Get AI-powered responses with context citations
- View conversation history
- Select documents in the Podcast Studio
- Choose a mode:
  - Discussion: interview-style dialogue with two voices
  - Story: first-person narration
- Click Generate and wait for the audio
- Play or download the generated podcast
- Click Connect Gmail button
- Authorize Google account access
- Import emails directly into your knowledge base
- Row-Level Security (RLS) - All data is tenant-isolated
- Secure Authentication - Supabase Auth with OAuth support
- Environment Variables - Sensitive keys never exposed client-side
- API Key Protection - Server-side API calls only
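To make the last two points concrete, here is a hedged sketch of a server action calling Groq: the key is read from `process.env` on the server and never reaches the client bundle. The actual implementation lives in `app/actions.ts` and the model name is an assumption.

```ts
"use server";

// Sketch only: GROQ_API_KEY stays server-side, per the points above.
import Groq from "groq-sdk";

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

export async function askLLM(prompt: string): Promise<string> {
  const completion = await groq.chat.completions.create({
    model: "llama-3.3-70b-versatile", // model name is an assumption
    messages: [{ role: "user", content: prompt }],
  });
  return completion.choices[0]?.message?.content ?? "";
}
```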
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Supabase - Backend infrastructure
- Groq - Ultra-fast LLM inference
- Cartesia - Premium text-to-speech
- LangChain - Text processing utilities
- Vercel - Deployment platform
Built with ❤️ using Next.js and Supabase

⭐ Star this repo if you find it helpful!