Production-ready RAG pipeline #2

Merged: hrshl4codes merged 6 commits into main from production-revamp on May 11, 2026
@hrshl4codes
Owner

Summary

  • Wired real RAG pipeline (was serving stub demo responses)
  • Fixed PDF extraction bug: missing dot in extension caused raw binary to be chunked (176 junk chunks → 3 real chunks)
  • Switched embeddings to OpenAI text-embedding-3-small, chat to gpt-4o-mini
  • Added BM25 hybrid search so keyword matches that vector similarity alone ranks low are still retrieved
  • Non-blocking Pinecone upsert/query via thread executor
  • CI/CD, Docker multi-stage builds, tests, UI redesign, security hardening
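
The PDF extraction bug above came down to comparing "pdf" against ".pdf". One way to guard against that class of mismatch is to normalize the extension before choosing an extractor. The sketch below is illustrative only: `normalize_ext`, `pick_extractor`, and the extractor table are hypothetical names, not the code in text_extract.py, which this PR page does not show.

```python
# Hypothetical helpers; the PR's actual fix lives in text_extract.py.

def normalize_ext(ext_or_name: str) -> str:
    """Return a lowercase extension with exactly one leading dot, e.g. '.pdf'.

    Accepts a bare extension ('pdf', '.PDF') or a filename that has one
    ('report.pdf'). A filename without any extension would be misread.
    """
    value = ext_or_name.strip().lower()
    # Take whatever follows the last dot; a bare "pdf" has no dot and is used whole.
    tail = value.rsplit(".", 1)[-1]
    return f".{tail}" if tail else ""


def pick_extractor(ext_or_name: str) -> str:
    """Map an extension to an extractor label.

    Unknown types raise instead of silently falling through to a
    raw-bytes path, which is how binary data ended up being chunked.
    """
    extractors = {".pdf": "pdf", ".docx": "docx", ".txt": "text"}  # assumed set
    key = normalize_ext(ext_or_name)
    if key not in extractors:
        raise ValueError(f"unsupported file type: {ext_or_name!r}")
    return extractors[key]
```

With this shape, `pick_extractor("pdf")` and `pick_extractor(".PDF")` resolve to the same extractor, and an unrecognized type fails loudly rather than producing junk chunks.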

Test plan

  • Upload a PDF and verify chunk count is reasonable (not hundreds)
  • Query the document and verify real answer (not "Demo response")
  • Check /health returns 200
  • CI passes on merge
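
The /health check in the plan can be scripted with the Python standard library alone, in the same spirit as the curl-free docker-compose healthcheck in this PR. This is a minimal sketch, not the PR's actual healthcheck command; `is_healthy` is a hypothetical helper and any host or port passed to it is an assumption.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False
```

A container healthcheck could call this with something like `http://localhost:8000/health` (port assumed) and exit 0 or 1 based on the result.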

🤖 Generated with Claude Code

hrshl4codes and others added 6 commits May 11, 2026 03:54
- remove hackrx-key.pem from filesystem; already in .gitignore
- fix CORS: drop allow_credentials=True with wildcard origins (invalid per spec)
- fix route ordering bug: API/health routes now registered before SPA catch-all
- add real pytest suite (10 tests covering health, upload, query endpoints)
- rewrite CI: add lint (ruff/eslint), pytest, jest steps with dep caching
- fix docker-compose healthcheck: curl → python stdlib (curl absent in slim image)
- convert Dockerfile.backend to multi-stage build
- remove minimal_main.py; consolidate on documind_main.py as sole entry point
- fix routes.py encapsulation: delete_document delegates to api_service method
- delegate file parsing in routes.py to text_extract.process_file_content
- add delete_document() method to DocuMindAPIService
- add requirements-dev.txt with pytest/ruff/httpx
- redesign UploadView and QueryView: Syne + JetBrains Mono, amber accent,
  corner bracket animations, entry fade-ins, scoped component CSS
- update README project structure section

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
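
The route-ordering fix in the commit above depends on first-match-wins routing: Starlette, which FastAPI builds on, tries routes in the order they were registered, so a catch-all registered first shadows everything after it. The framework-free sketch below reproduces that failure mode. The `Router` class and the handler names are illustrative only and do not appear in the PR.

```python
import re


class Router:
    """Toy first-match router: routes are tried in registration order."""

    def __init__(self):
        self._routes = []  # list of (compiled pattern, handler) pairs

    def add(self, pattern, handler):
        self._routes.append((re.compile(pattern), handler))

    def dispatch(self, path):
        for regex, handler in self._routes:
            if regex.fullmatch(path):
                return handler()
        return None


def health():
    return "ok"


def spa_index():
    return "<index.html>"


# Buggy order: the SPA catch-all is registered first and swallows /health.
buggy = Router()
buggy.add(r"/.*", spa_index)
buggy.add(r"/health", health)

# Fixed order: specific API/health routes first, catch-all last.
fixed = Router()
fixed.add(r"/health", health)
fixed.add(r"/.*", spa_index)
```

Dispatching `/health` on `buggy` returns the SPA page, while on `fixed` it reaches the health handler and unknown paths still fall through to the SPA.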
- add frontend/vercel.json (CRA build config + SPA rewrites)
- strip frontend build step from render.yaml (Vercel owns it now)
- remove NODE_VERSION and REACT_APP_API_URL from render.yaml env vars
- remove unused imports (os, asyncio, Tuple, json, uuid, etc.) across services/
- fix 4 bare except → except Exception in text_extract.py
- prefix unused local variables with _ in cloud_vector_service.py
- add noqa: F401 for conditional imports inside try blocks (pinecone, qdrant)
- Mount services/routes.py router in documind_main.py (was serving stub
  endpoints that returned demo responses)
- Switch embeddings from Gemini to OpenAI text-embedding-3-small (1536d)
- Route chat through OpenAI gpt-4o-mini directly, drop broken Gemini path
- Fix PDF extraction: ext passed without dot ("pdf" vs ".pdf") caused
  raw binary to be chunked instead of extracted text (176 chunks → 3)
- Add BM25 hybrid search over in-memory chunks so keyword matches like
  "Python" are never missed by vector similarity alone
- Run Pinecone upsert and query in thread executor to avoid blocking
  the async event loop
- Expand retrieval candidate pool: max(top_k*4, min(chunks, 60))
- Add per-step timing logs to upload pipeline
- Restore pinecone>=3.0.0 to requirements.txt, clean up render.yaml build
- Update embedding dimension from 3072 → 1536 throughout
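
The hybrid search and candidate pool items above can be sketched in plain Python. Only the pool formula `max(top_k*4, min(chunks, 60))` is taken from the commit message. The PR does not say how keyword and vector rankings are combined, so reciprocal rank fusion is used here as a stand-in, and all function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Okapi BM25 score of every chunk for the query."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    if n == 0:
        return []
    avgdl = (sum(len(d) for d in docs) / n) or 1.0
    df = Counter(term for d in docs for term in set(d))  # document frequency
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def candidate_pool_size(top_k, n_chunks):
    # Formula quoted in the commit message.
    return max(top_k * 4, min(n_chunks, 60))


def rrf_merge(rankings, k=60):
    """Reciprocal rank fusion over ranked lists of chunk indices (assumed merge step)."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk index for determinism.
    return [i for i, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


def hybrid_search(query, chunks, vector_ranking, top_k=3):
    """Merge a vector ranking (chunk indices, best first) with BM25 keyword hits."""
    pool = candidate_pool_size(top_k, len(chunks))
    scores = bm25_scores(query, chunks)
    by_score = sorted(range(len(chunks)), key=lambda i: -scores[i])
    keyword_ranking = [i for i in by_score if scores[i] > 0][:pool]
    return rrf_merge([vector_ranking[:pool], keyword_ranking])[:top_k]
```

For example, if the vector ranking omits the only chunk that literally contains "Python", the BM25 ranking still contributes that chunk to the fused result. With `top_k=5`, the pool is 20 for a 10-chunk document and 60 for a 200-chunk document.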

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
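
Running Pinecone upsert and query in a thread executor, as the commit above describes, follows a common asyncio pattern: hand the synchronous call to `loop.run_in_executor` so the event loop keeps serving other coroutines. The sketch below uses a stand-in `blocking_upsert` function that only sleeps. It is not the Pinecone client API, and every name in it is hypothetical.

```python
import asyncio
import time
from functools import partial


def blocking_upsert(vectors, namespace="default"):
    """Stand-in for a synchronous vector-store call (not the Pinecone API)."""
    time.sleep(0.05)  # simulate network latency
    return len(vectors)


async def upsert_async(vectors, namespace="default"):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    call = partial(blocking_upsert, vectors, namespace=namespace)
    return await loop.run_in_executor(None, call)


async def main():
    ticks = 0

    async def heartbeat():
        # Keeps counting only if the event loop is not blocked.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    hb = asyncio.create_task(heartbeat())
    count = await upsert_async([[0.1, 0.2], [0.3, 0.4]])
    hb.cancel()
    return count, ticks
```

Calling `asyncio.run(main())` returns the upserted count together with a tick counter that advances during the upsert, which shows the loop stayed responsive. Calling `blocking_upsert` directly inside the coroutine would freeze the heartbeat for the duration of the call.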
@hrshl4codes hrshl4codes merged commit 423b19c into main May 11, 2026
2 of 4 checks passed
@hrshl4codes hrshl4codes deleted the production-revamp branch May 11, 2026 06:12