A voice-based interview preparation assistant built with LiveKit. Talk through algorithms, get practice problems, and have real discussions about optimization strategies.
I've been solving DSA problems for years (1200+ on LeetCode) - not just for interviews, but because I genuinely enjoy them. There's something satisfying about finding the optimal solution to a hard problem. It's like solving puzzles.
But here's the thing: when you're stuck on an optimization or want to talk through an approach, you usually have to type it all out in ChatGPT or search through forums. I wanted something more natural - like having a conversation with a senior engineer friend who can discuss time complexity, walk through approaches, and suggest practice problems.
That's CodeCoach. I can literally just say "hey, what's a better way to solve this?" and have a back-and-forth discussion instead of typing everything out.
Persona: Supportive senior engineer friend - encouraging, practical, keeps it concise for voice.
Try it: Coming soon
Video Demo: Coming soon
┌─────────────────────────────────────────────────┐
│             React Frontend (Vercel)             │
│  [Start Call]   [Live Transcript]   [End Call]  │
└─────────────────────────────────────────────────┘
                         │ WebSocket
                         ▼
┌─────────────────────────────────────────────────┐
│                  LiveKit Cloud                  │
└─────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│           Python Voice Agent (Render)           │
│  Deepgram(STT) → GPT-4o-mini → ElevenLabs(TTS)  │
│                        ↓                        │
│             RAG: FAISS + LangChain              │
└─────────────────────────────────────────────────┘
| Component | Technology | Why I chose it |
|---|---|---|
| Voice | LiveKit Cloud | Required for real-time audio |
| STT | Deepgram | Fast and accurate transcription |
| LLM | GPT-4o-mini | Quick responses needed for voice |
| TTS | ElevenLabs (OpenAI fallback) | Natural sounding voice |
| RAG | LangChain + FAISS | In-memory, no DB setup needed |
| Embeddings | OpenAI text-embedding-3-small | Fast API calls, no local model to load |
| Frontend | React + TypeScript + Tailwind | Clean, modern stack |
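If you're curious how those pieces plug together, here's a rough sketch of the agent wiring using the LiveKit Agents framework with its Deepgram/OpenAI/ElevenLabs plugins. The real agent.py adds the RAG hook, the CodeCoach prompt, and tool registration, so treat this as a minimal skeleton rather than the actual code:

```python
# Minimal voice-pipeline sketch, assuming the LiveKit Agents 1.x API.
# agent.py layers RAG, prompts, and tools on top of this.
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import deepgram, elevenlabs, openai, silero

async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        vad=silero.VAD.load(),                # voice activity detection
        stt=deepgram.STT(),                   # speech-to-text
        llm=openai.LLM(model="gpt-4o-mini"),  # fast LLM keeps voice latency low
        tts=elevenlabs.TTS(),                 # natural-sounding voice
    )
    await session.start(room=ctx.room, agent=Agent(instructions="You are CodeCoach."))
    await ctx.connect()

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```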
- Voice conversation: Talk naturally about coding interview topics
- Live transcript: See the conversation in real-time as you speak
- RAG-powered answers: Retrieves relevant info from a CTCI chapter
- Practice problems: Say "give me a medium array problem" and it fetches one via a function tool (sketch after this list)
- Observability: Logs every RAG retrieval with chunk IDs and scores
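The practice-problem feature is just an LLM tool call. A stripped-down sketch of what tools.py might look like, assuming the LiveKit Agents function-tool decorator; the real data source and signature may differ:

```python
# Hypothetical practice-problem tool; the actual tools.py may differ.
from livekit.agents import function_tool

# Illustrative in-memory problem bank
PROBLEMS = {
    ("array", "medium"): ["Product of Array Except Self", "3Sum"],
    ("hash table", "easy"): ["Two Sum", "Valid Anagram"],
}

@function_tool
async def get_practice_problems(topic: str, difficulty: str) -> str:
    """Fetch a practice problem for the given topic and difficulty."""
    options = PROBLEMS.get((topic.lower(), difficulty.lower()))
    if not options:
        return f"I don't have a {difficulty} {topic} problem handy."
    return f"Try this one: {options[0]}"
```

The tool then gets handed to the agent (e.g. via `Agent(..., tools=[get_practice_problems])`) so GPT-4o-mini can call it when you ask for a problem.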
Why OpenAI embeddings instead of HuggingFace?
I actually started with HuggingFace all-MiniLM-L6-v2 since it's free and runs locally. But I kept hitting issues:
- The model download (~90MB) was blocking the agent startup
- LiveKit's prewarm phase has a timeout, and loading the model took too long
- Tried pre-building the FAISS index separately, but the download kept hanging
Ended up switching to OpenAI text-embedding-3-small. It's API-based so there's no model to download - just makes a quick API call. Initialization went from 60+ seconds to about 2 seconds. Small cost trade-off but totally worth it for the UX.
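For reference, the swap in rag.py boils down to one line (file paths here are illustrative):

```python
# Sketch of the embeddings swap, assuming LangChain's FAISS integration.
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Before: local model, ~90MB download during prewarm
# from langchain_huggingface import HuggingFaceEmbeddings
# embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# After: API-based, nothing to download at startup
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Load the pre-built index shipped in backend/data/ (folder name is illustrative)
vectorstore = FAISS.load_local(
    "data/faiss_index", embeddings, allow_dangerous_deserialization=True
)
```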
macOS FAISS fix
Ran into a weird crash on macOS - libomp.dylib already initialized. Turns out FAISS and some other libs both try to load OpenMP. Fixed it by setting KMP_DUPLICATE_LIB_OK=TRUE at the top of the agent. Took a bit of googling to figure that one out.
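If you hit the same crash, the workaround is two lines before anything that pulls in OpenMP:

```python
# Put this at the very top of agent.py, before importing faiss or numpy-heavy libs.
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"  # tolerate duplicate OpenMP runtimes on macOS
```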
Pre-built FAISS index
Initially had the FAISS index in .gitignore. Realized anyone cloning would have to rebuild it before running. Removed it from gitignore and committed the pre-built index - now it's clone and run.
Why FAISS over ChromaDB?
FAISS loads from a single file and runs in-memory. ChromaDB's on-disk persistence can get flaky on cloud platforms with ephemeral disks. Simpler is better here.
Why single CTCI chapter?
One chapter with precise retrieval beats the whole book with noisy results.
Why chunk size 500?
Smaller chunks = more precise retrieval. Voice responses should be concise anyway.
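If you want to rebuild the index yourself (the repo ships a pre-built one in backend/data/), the build is roughly this; the file names and overlap value are illustrative:

```python
# Rough sketch of the index build: load the CTCI chapter, chunk it small,
# embed with OpenAI, and save the FAISS index next to the PDF.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFLoader("data/ctci_chapter.pdf").load()  # path is illustrative
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50  # small chunks -> precise retrieval
).split_documents(docs)

index = FAISS.from_documents(chunks, OpenAIEmbeddings(model="text-embedding-3-small"))
index.save_local("data/faiss_index")
```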
Why observability logging?
Good observability matters when debugging AI systems:
Query: 'what is the time complexity of hash table lookup'
#1 chunk=chunk_12 page=5 score=0.823
#2 chunk=chunk_15 page=6 score=0.756
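The logging itself is nothing fancy; something along these lines (the metadata keys and score semantics depend on how the chunks and index were built):

```python
# Sketch of the retrieval logging. Note that with a raw FAISS index the score
# returned is a distance by default; metadata keys here are assumptions.
import logging

logger = logging.getLogger("codecoach.rag")

def retrieve(vectorstore, query: str, k: int = 2):
    results = vectorstore.similarity_search_with_score(query, k=k)
    logger.info("Query: %r", query)
    for rank, (doc, score) in enumerate(results, start=1):
        logger.info(
            "#%d chunk=%s page=%s score=%.3f",
            rank,
            doc.metadata.get("chunk_id", "?"),
            doc.metadata.get("page", "?"),
            score,
        )
    return [doc for doc, _ in results]
```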
The FAISS index is pre-built, so you just need to add your API keys and run.
cd backend
pip install -r requirements.txt
# Copy and fill in your API keys
cp .env.example .env
# Edit .env with your LiveKit, OpenAI, Deepgram keys
# Terminal 1 - token server
python token_server.py
# Terminal 2 - voice agent
python agent.py dev

cd frontend
npm install
# Copy and fill in env
cp .env.example .env
npm run dev

Go to http://localhost:5173 and click "Start Call".
Backend .env:
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
OPENAI_API_KEY=...
DEEPGRAM_API_KEY=...
# Optional
ELEVENLABS_API_KEY=...
USE_ELEVENLABS=true
ENABLE_RAG=true
Frontend .env:
VITE_LIVEKIT_URL=wss://your-project.livekit.cloud
VITE_TOKEN_SERVER_URL=http://localhost:8080
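The token server behind VITE_TOKEN_SERVER_URL just mints short-lived LiveKit access tokens for the frontend. A minimal sketch, assuming FastAPI and the livekit-api package; the actual token_server.py may use a different framework, route, and port:

```python
# Hypothetical minimal token endpoint; the real token_server.py may differ.
import os
from fastapi import FastAPI
from livekit import api

app = FastAPI()

@app.get("/token")
def get_token(identity: str = "guest", room: str = "codecoach"):
    token = (
        api.AccessToken(os.environ["LIVEKIT_API_KEY"], os.environ["LIVEKIT_API_SECRET"])
        .with_identity(identity)
        .with_grants(api.VideoGrants(room_join=True, room=room))
        .to_jwt()
    )
    return {"token": token, "url": os.environ["LIVEKIT_URL"]}
```

If you go the FastAPI route, something like `uvicorn token_server:app --port 8080` lines it up with the frontend default.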
├── backend/
│ ├── agent.py # Voice agent entry point
│ ├── rag.py # FAISS + LangChain RAG pipeline
│ ├── tools.py # get_practice_problems tool
│ ├── prompts.py # CodeCoach system prompt
│ ├── token_server.py # JWT token endpoint
│ ├── start.sh # Combined startup script for deployment
│ ├── .env.example # Environment variables template
│ └── data/ # CTCI PDF + FAISS index
├── frontend/
│ ├── src/App.tsx # React UI
│ └── .env.example # Environment variables template
└── README.md
Deployed on Render (backend) + Vercel (frontend). Start script: backend/start.sh.