A real-time speech recognition and search suggestion system built with FastAPI, OpenAI Whisper, and WebSockets. It is deployed on AWS ECS for scalability and reliability.
- Real-time speech recognition using OpenAI Whisper
- WebSocket support for continuous audio streaming
- Smart search suggestions with AI-based ranking
- Noise-resilient audio processing
- Containerized deployment on AWS ECS
- Auto-scaling and high availability
- Chose Whisper because:
  - Better accuracy on noisy inputs
  - Smaller model size (tiny model: 39M parameters)
  - Faster inference time
  - Multi-language support out of the box
- Trade-offs:
  - DeepSpeech offers better offline support
  - Whisper requires more RAM (mitigated by using the tiny model)
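
To make the Whisper choice above concrete, here is a minimal sketch of transcribing a single file with the open-source `openai-whisper` package. The helper name `transcribe_file`, the `KNOWN_MODELS` tuple, and the default of `tiny` are assumptions for illustration; the project's actual model-loading code is not shown in this README.

```python
"""Minimal sketch: transcribe one audio file with a small Whisper model."""

# A subset of the checkpoint names shipped with the `openai-whisper` package.
KNOWN_MODELS = ("tiny", "base", "small", "medium", "large")


def transcribe_file(path: str, model_name: str = "tiny") -> str:
    """Return the transcript of `path` (hypothetical helper, not project code)."""
    if model_name not in KNOWN_MODELS:
        raise ValueError(f"unknown Whisper model: {model_name!r}")

    # Imported lazily so the heavy dependency is loaded only when needed.
    import whisper  # pip install openai-whisper (ffmpeg must be on PATH)

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"].strip()
```

In a long-running server it would likely make more sense to call `whisper.load_model` once at startup and reuse the model, rather than reloading it per request as this sketch does.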
- Chose In-Memory Storage because:
  - Simpler deployment architecture
  - Sufficient for demonstration purposes
  - Lower latency for small datasets
- Trade-offs:
  - Redis would be better for production scale
  - Missing persistence across container restarts
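
As an illustration of what in-memory storage for suggestions can look like, the sketch below keeps terms in a sorted Python list and answers prefix queries with `bisect`. The class name `SuggestionStore` and its methods are hypothetical, and results come back in alphabetical order; the project's own data structure and its AI-based ranking are not shown here.

```python
"""Minimal sketch: in-memory prefix lookup for search suggestions."""
import bisect


class SuggestionStore:
    """Keeps lowercase terms in a sorted list so prefixes can be bisected."""

    def __init__(self, terms=()):
        self._terms = sorted({t.lower() for t in terms})

    def add(self, term: str) -> None:
        """Insert a term at its sorted position, ignoring duplicates."""
        term = term.lower()
        i = bisect.bisect_left(self._terms, term)
        if i == len(self._terms) or self._terms[i] != term:
            self._terms.insert(i, term)

    def suggest(self, prefix: str, limit: int = 5) -> list[str]:
        """Return up to `limit` terms that start with `prefix`."""
        prefix = prefix.lower()
        start = bisect.bisect_left(self._terms, prefix)
        matches = []
        # Terms sharing a prefix are contiguous in a sorted list.
        for term in self._terms[start:start + limit]:
            if not term.startswith(prefix):
                break
            matches.append(term)
        return matches
```

Because the list lives in process memory, its contents disappear when the container restarts, which is the persistence trade-off noted above.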
- Chose ECS because:
  - WebSocket support required
  - Better for long-running connections
  - More cost-effective for continuous workloads
- Trade-offs:
  - Lambda would be cheaper for sporadic usage
  - ECS requires more configuration
- Speech recognition accuracy: 95% (clean audio)
- Average response time: <500ms
- WebSocket latency: ~100ms
- Memory usage: ~800MB
- Implemented FastAPI endpoint
- Achieved 95% accuracy on clean audio
- Response time under 500ms
- Implemented noise reduction
- Improved accuracy from 75% to 92% on noisy audio
- Processing time: 800ms
- Implemented AI-based ranking
- Response time: 200ms
- Top suggestions match user intent
- Real-time audio streaming
- Continuous transcription
- Dynamic suggestions
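
Real-time streaming usually means sending audio in small pieces rather than as one file. The sketch below splits raw PCM bytes into fixed-duration chunks. The function name `chunk_pcm` and the defaults (16 kHz, 16-bit mono, 100 ms per chunk) are assumptions for illustration; this README does not document the frame format the WebSocket endpoint expects.

```python
"""Minimal sketch: split raw PCM audio into fixed-duration chunks."""
from typing import Iterator


def chunk_pcm(data: bytes, sample_rate: int = 16000,
              sample_width: int = 2, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive chunks of mono PCM, each covering `chunk_ms`.

    The final chunk may be shorter. All defaults are assumptions.
    """
    bytes_per_chunk = sample_rate * sample_width * chunk_ms // 1000
    if bytes_per_chunk <= 0:
        raise ValueError("chunk size must be positive")
    for offset in range(0, len(data), bytes_per_chunk):
        yield data[offset:offset + bytes_per_chunk]
```

A client could then send each chunk as one binary WebSocket message and read transcripts and suggestions as they arrive, assuming the server accepts raw PCM frames.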
# REST Endpoints
POST /api/voice-to-text
GET /api/autocomplete?q={query}
# WebSocket Endpoint
ws://speech-search-alb-607098999.eu-north-1.elb.amazonaws.com:8000/ws/speech-to-search

- Region: eu-north-1 (Stockholm)
- Container Registry: Amazon ECR
- Compute: AWS ECS Fargate
- Load Balancer: Application Load Balancer
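
For reference, the `GET /api/autocomplete` endpoint could be queried from Python roughly as follows, using only the standard library. `BASE_URL`, `autocomplete_url`, and `fetch_suggestions` are hypothetical names, the local host is a placeholder for the deployed load balancer address, and the JSON response body is an assumption since the response schema is not documented here.

```python
"""Minimal sketch: call the autocomplete endpoint with the standard library."""
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a locally running instance; substitute the deployed host.
BASE_URL = "http://localhost:8000"


def autocomplete_url(query: str, base_url: str = BASE_URL) -> str:
    """Build the GET /api/autocomplete URL with the query safely encoded."""
    return f"{base_url}/api/autocomplete?{urlencode({'q': query})}"


def fetch_suggestions(query: str, base_url: str = BASE_URL,
                      timeout: float = 5.0):
    """Return the decoded response body (assumed to be JSON)."""
    with urlopen(autocomplete_url(query, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Encoding the query with `urlencode` keeps spaces and characters such as `&` from corrupting the query string.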
# Health check
curl http://speech-search-alb-607098999.eu-north-1.elb.amazonaws.com:8000/health
# WebSocket test
wscat -c ws://speech-search-alb-607098999.eu-north-1.elb.amazonaws.com:8000/ws/speech-to-search

- Implement Redis for persistent storage
- Add user authentication
- Implement SSL/TLS for secure WebSocket
- Add custom domain and CDN
- Implement rate limiting