Skip to content

Deepam02/Real-time-WebRTC-VLM-Multi-Object-Detection

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

7 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐ŸŽฏ๐Ÿค– Real-time WebRTC VLM Multi-Object Detection

๐ŸŒ Live Demo

๐Ÿš€ Try it now: https://real-time-web-rtc-vlm-multi-object-pi.vercel.app/


An intelligent real-time video streaming and object detection system that combines WebRTC technology with Vision Language Models (VLM) for advanced multi-object detection and analysis.

๐ŸŽฏ How It Works

  1. Laptop (Viewer) creates a session and displays a QR code
  2. Phone scans the QR code and grants camera access
  3. WebRTC establishes peer-to-peer connection for real-time streaming
  4. AI Detection Service analyzes video frames for object detection
  5. Real-time overlay displays detected objects with bounding boxes and labels
  6. Live analytics provides detection statistics and performance metrics

๐Ÿ—๏ธ Architecture

  • Frontend: Next.js (TypeScript) - Real-time video streaming interface
  • Backend: Node.js with Socket.IO - WebRTC signaling server
  • Detection Service: Python FastAPI - YOLOv5 object detection engine
  • WebRTC: Peer-to-peer video streaming with frame extraction
  • AI Models: YOLOv5 for real-time multi-object detection

๐Ÿš€ Quick Start

1. Backend Setup (WebRTC Signaling Server)

cd backend
npm install
npm run dev

2. Detection Service Setup (AI Object Detection)

cd detection-service
pip install -r requirements.txt
python detection_server.py

3. Frontend Setup (Web Interface)

cd frontend
npm install
npm run dev

Visit http://localhost:3000 to start intelligent video streaming with object detection!

๐Ÿ“ฆ Project Structure

Real-time-WebRTC-VLM-Multi-Object-Detection/
โ”œโ”€โ”€ backend/              # Node.js WebRTC signaling server
โ”‚   โ”œโ”€โ”€ server.js        # Main signaling server
โ”‚   โ”œโ”€โ”€ package.json     # Backend dependencies
โ”‚   โ””โ”€โ”€ test-backend.js  # Server testing utilities
โ”œโ”€โ”€ frontend/            # Next.js TypeScript application
โ”‚   โ”œโ”€โ”€ src/
โ”‚   โ”‚   โ”œโ”€โ”€ app/        # Next.js App Router pages
โ”‚   โ”‚   โ”œโ”€โ”€ components/ # React components for detection overlay
โ”‚   โ”‚   โ””โ”€โ”€ utils/      # WebRTC and detection client utilities
โ”‚   โ”œโ”€โ”€ package.json    # Frontend dependencies
โ”‚   โ””โ”€โ”€ next.config.js  # Next.js configuration
โ”œโ”€โ”€ detection-service/   # Python AI detection service
โ”‚   โ”œโ”€โ”€ detection_server.py    # FastAPI detection server
โ”‚   โ”œโ”€โ”€ yolo_detector.py      # YOLOv5 detection engine
โ”‚   โ”œโ”€โ”€ requirements.txt      # Python dependencies
โ”‚   โ”œโ”€โ”€ yolov5n.pt           # Pre-trained YOLO model
โ”‚   โ””โ”€โ”€ test_detector.py     # Detection testing utilities
โ””โ”€โ”€ README.md           # Project documentation

๐ŸŒ Deployment

Backend Deployment (WebRTC Signaling)

  1. Connect your GitHub repo to your preferred platform (Render/Heroku)
  2. Create a new Web Service
  3. Set build command: npm install
  4. Set start command: npm start
  5. Add environment variable: FRONTEND_URL=https://your-deployment-url.com

Frontend Deployment (Web Interface)

  1. Connect your GitHub repo to Vercel/Netlify
  2. Set root directory to frontend/
  3. Add environment variables:
    • NEXT_PUBLIC_SIGNALING_SERVER_URL=https://your-backend-url.com
    • NEXT_PUBLIC_DETECTION_SERVER_URL=https://your-detection-service-url.com
  4. Deploy!

Detection Service Deployment (AI Service)

  1. Deploy to cloud platform supporting Python (Google Cloud Run/AWS Lambda/Heroku)
  2. Install dependencies: pip install -r requirements.txt
  3. Start service: python detection_server.py
  4. Ensure service is accessible via HTTP/HTTPS

๐Ÿ”ง Environment Variables

Frontend (.env)

NEXT_PUBLIC_SIGNALING_SERVER_URL=http://localhost:3001
NEXT_PUBLIC_DETECTION_SERVER_URL=http://localhost:5000

Backend (.env)

PORT=3001
FRONTEND_URL=http://localhost:3000
NODE_ENV=development

Detection Service (.env)

PORT=5000
MODEL_PATH=./yolov5n.pt
DETECTION_THRESHOLD=0.5
MAX_DETECTIONS=100

๐Ÿ“ฑ Usage

  1. Create Session: Visit the web app and click "Start New Detection Session"
  2. Scan QR Code: Use your phone to scan the displayed QR code
  3. Grant Permissions: Allow camera and microphone access on your phone
  4. Enable AI Detection: Toggle object detection to start AI analysis
  5. Start Streaming: Watch live video with real-time object detection overlays
  6. Analyze Results: View detection statistics, confidence scores, and FPS metrics

๐ŸŽฎ AI-Powered Features

  • โœ… Real-time Object Detection - YOLOv5-powered multi-object recognition
  • โœ… Live Video Streaming - WebRTC peer-to-peer video transmission
  • โœ… Detection Overlays - Bounding boxes with confidence scores
  • โœ… QR Code Session Joining - Easy mobile device connection
  • โœ… Performance Metrics - Real-time FPS and detection statistics
  • โœ… Mobile-Optimized Interface - Responsive design for all devices
  • โœ… Camera Switching - Front/back camera toggle support
  • โœ… Automatic Reconnection - Robust connection handling
  • โœ… Session Management - Secure temporary session handling
  • โœ… Multi-Object Support - Detect multiple objects simultaneously
  • โœ… Configurable Thresholds - Adjustable detection confidence levels
  • โœ… Export Detection Results - Save detection data and statistics

๐Ÿ”’ Security & Privacy

  • HTTPS Required: Camera access requires secure connection (except localhost)
  • Peer-to-Peer: Video streams directly between devices (not through server)
  • AI Processing: Detection runs on dedicated service, no data retention
  • Temporary Sessions: Sessions are automatically cleaned up
  • No Recording: No video data is stored on servers
  • Secure Detection: Object detection data is processed in real-time only

๐Ÿ› ๏ธ Development

Prerequisites

  • Node.js 18+
  • Python 3.8+
  • Modern browser with WebRTC support
  • HTTPS for production (camera access requirement)

Local Development

  1. Start detection service: cd detection-service && python detection_server.py
  2. Start backend: cd backend && npm run dev
  3. Start frontend: cd frontend && npm run dev
  4. Visit http://localhost:3000

๐Ÿ“‹ Browser Support

  • โœ… Chrome (Desktop & Mobile) - Full WebRTC + Detection support
  • โœ… Firefox (Desktop & Mobile) - Full WebRTC + Detection support
  • โœ… Safari (Desktop & Mobile) - WebRTC support with detection
  • โœ… Edge (Desktop) - Full feature support
  • โŒ Internet Explorer (not supported)

๐Ÿ› Troubleshooting

Object Detection not working?

  • Ensure detection service is running on port 5000
  • Check detection service health endpoint
  • Verify model file (yolov5n.pt) is present
  • Check detection service logs for errors

Camera not working?

  • Ensure HTTPS connection in production
  • Check browser permissions
  • Try refreshing the page

Connection issues?

  • Check network connectivity
  • Verify environment variables are set correctly
  • Check browser console for WebRTC errors

QR code not scanning?

  • Ensure good lighting conditions
  • Try manual URL entry
  • Check if QR scanner app is working properly

Poor detection performance?

  • Adjust detection threshold settings
  • Check lighting conditions
  • Ensure stable network connection
  • Monitor detection service CPU/memory usage

๐Ÿค Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

๐Ÿ“„ License

This project is open source and available under the MIT License.