Releases: dannwaneri/vectorize-mcp-worker
V3: Multimodal Search with Vision
🎉 V3: Multimodal Search is Here!
Your RAG system can now "see" - upload images, search by visual content, and extract text automatically.
✨ What's New
Multimodal Features:
- 📸 Image ingestion with Llama 4 Scout vision
- 🔍 OCR text extraction (1,000+ characters from a single image in testing)
- 🖼️ Reverse image search
- 📊 Search screenshots, receipts, diagrams
Performance Optimizations:
- ⚡ 60-second cache (0ms cached searches)
- 🚀 Batch embeddings (3x faster ingestion)
- 📄 Pagination support
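To illustrate the 60-second cache idea, here is a minimal sketch of a time-to-live (TTL) cache in TypeScript. It is not the code shipped in this release: `TtlCache`, `cachedSearch`, and the injectable `now` clock are hypothetical names chosen for illustration, and the sketch assumes a simple in-memory map keyed by the query string.

```typescript
// Hypothetical sketch of a TTL cache; not the worker's actual implementation.
type Entry<V> = { value: V; expiresAt: number };

class TtlCache<V> {
  private entries: Map<string, Entry<V>> = new Map();
  private ttlMs: number;
  private now: () => number;

  // ttlMs defaults to 60 seconds; `now` is injectable so expiry can be tested.
  constructor(ttlMs: number = 60_000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that expires ttlMs after the moment it was set.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Wraps any async search function: serve from cache on a hit, otherwise
// run the search and remember the result for the next ttlMs milliseconds.
async function cachedSearch<V>(
  cache: TtlCache<V>,
  query: string,
  search: (q: string) => Promise<V>,
): Promise<V> {
  const hit = cache.get(query);
  if (hit !== undefined) return hit;
  const result = await search(query);
  cache.set(query, result);
  return result;
}
```

A repeated query within the TTL window skips the embedding and vector lookup entirely, which is consistent with the near-zero latency reported for cached searches. Note that in-memory state in a Worker lives per isolate, so a cache like this one would only help requests that land on the same warm instance.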
Real-World Tested:
- ✅ Financial receipts (Access Bank: 1,043 chars extracted)
- ✅ Dashboard screenshots (semantic + OCR matching)
- ✅ Technical diagrams (architecture patterns)
📊 Performance
- Image ingestion: ~7.9s (vision + OCR + embedding)
- First search: ~900ms
- Cached search: 0ms ✨
- Cost: Still $5/month
🚀 Try It
Live Demo: vectorize-mcp-worker.fpl-test.workers.dev/dashboard
Deploy:
git clone https://github.com/dannwaneri/vectorize-mcp-worker.git
cd vectorize-mcp-worker
npm install
wrangler deploy
📖 Read More
Full article: https://medium.com/@danielnwaneri41/i-added-image-search-to-my-5-ai-system-openai-charges-100-f9d51549875f
🙏 Credits
Built with:
- Cloudflare Workers AI
- Llama 4 Scout (Meta)
- BGE embeddings (BAAI)
- Vectorize + D1
⭐ Star the repo if this helps your project!