feat: Add beautiful web UI with real-time progress and i18n support (Feature/web UI) #60

yourwanghao · 2025-10-22T03:27:40Z

## 🎯 Overview

This PR adds a modern, user-friendly web interface for DeepSeek-OCR with comprehensive bilingual support and real-time progress tracking.

## ✨ Features

### 🎨 Modern Web UI
- Beautiful gradient design with smooth animations
- Drag-and-drop file upload
- Responsive design for desktop and mobile
- Zero external frontend dependencies

### 📊 Real-time Progress Tracking
- WebSocket-based live progress updates
- Streaming log display during processing
- Async task processing for concurrent requests
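The log streaming described above can be sketched with the standard library alone. This is a minimal illustration, not the code in `server/app.py`: the function name `stream_process_output` and the `on_line` callback are hypothetical, and the real application would presumably forward each line over its WebSocket rather than collect it in a list.

```python
import asyncio
import sys
from typing import Awaitable, Callable, List, Tuple


async def stream_process_output(
    cmd: List[str],
    on_line: Callable[[str], Awaitable[None]],
) -> int:
    """Run cmd, pass each output line to on_line, and return the exit code."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr so log lines stay in order
    )
    assert proc.stdout is not None
    # Reading asynchronously keeps the event loop free for other requests.
    async for raw in proc.stdout:
        await on_line(raw.decode("utf-8", errors="replace").rstrip())
    return await proc.wait()


async def demo() -> Tuple[int, List[str]]:
    """Stand-in for an OCR run: a child Python process that prints two lines."""
    lines: List[str] = []

    async def collect(line: str) -> None:
        # Hypothetical sink; a server might send the line to a WebSocket here.
        lines.append(line)

    code = await stream_process_output(
        [sys.executable, "-c", "print('page 1'); print('page 2')"], collect
    )
    return code, lines
```

Calling `asyncio.run(demo())` runs the child process and returns its exit code together with the collected lines. Tools that redraw a progress bar with carriage returns instead of newlines would need extra splitting, which this sketch does not attempt.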

### 📥 Multiple Download Options
- Markdown file (cleaned text)
- Full annotation file (with detection markers)
- Visualization PDF (with bounding boxes)
- Extracted images (ZIP archive)
- Complete package (all files in ZIP)
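The ZIP downloads above are described in this PR as being built in memory. One way to do that with the standard library is sketched below; `build_zip` and the file names are hypothetical and chosen for illustration only.

```python
import io
import zipfile
from typing import Dict


def build_zip(files: Dict[str, bytes]) -> bytes:
    """Pack name -> content pairs into a ZIP archive without touching disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    # The archive is only complete after the context manager closes it.
    return buffer.getvalue()


# Hypothetical result files from one OCR run.
payload = build_zip({
    "result.md": b"# Page 1\n",
    "images/0.jpg": b"\xff\xd8\xff",
})
```

The returned bytes can be sent directly as a response body. The trade-off is that the whole archive sits in memory, which may matter for very large documents.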

### 🌍 Internationalization
- Auto-detect browser language (Chinese/English)
- One-click language toggle
- Persist user language preference in localStorage
- Full translation of all UI elements
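The PR performs language selection in the browser (via `navigator.language` and `localStorage`). The decision logic can be sketched in Python for clarity. The precedence shown here (stored preference first, then browser language, then English) is an assumption, and `resolve_language` is a hypothetical name.

```python
from typing import Optional

SUPPORTED = ("en", "zh")


def resolve_language(
    stored: Optional[str],
    browser_language: Optional[str],
    default: str = "en",
) -> str:
    """Pick the UI language: saved preference, then browser setting, then default."""
    if stored in SUPPORTED:
        return stored
    if browser_language:
        # Reduce tags such as "zh-CN" or "en-US" to their primary subtag.
        primary = browser_language.lower().split("-")[0]
        if primary in SUPPORTED:
            return primary
    return default
```

For example, `resolve_language(None, "zh-CN")` selects Chinese, while a saved `"en"` preference overrides a Chinese browser setting.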

## 🔧 Technical Implementation

- **Backend**: FastAPI with async/await
- **Real-time Communication**: WebSocket for progress updates
- **Task Processing**: Subprocess monitoring with intelligent parsing
- **File Downloads**: In-memory ZIP creation for efficiency
- **I18n**: Data attributes + localStorage for seamless language switching
- **Zero Breaking Changes**: Completely optional, doesn't affect existing CLI
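The parsing rules behind the subprocess monitoring are not spelled out in this description. As a rough sketch, assuming the OCR process prints tqdm-style lines that contain a percentage, progress could be extracted as follows; `parse_progress` and the sample log line are illustrative assumptions.

```python
import re
from typing import Optional

# Assumed format: a line containing something like "45%".
PERCENT_RE = re.compile(r"(\d{1,3})%")


def parse_progress(line: str) -> Optional[int]:
    """Return the first percentage found in a log line, or None if absent."""
    match = PERCENT_RE.search(line)
    if match is None:
        return None
    # Clamp so a malformed value never exceeds 100.
    return min(int(match.group(1)), 100)


sample = " 45%|####5     | 9/20 [00:12<00:15]"
progress = parse_progress(sample)
```

Lines without a percentage return `None`, so ordinary log output can still be shown in the log view without moving the progress bar.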

## 📦 Changes

```
5 files changed, 947 insertions(+)
- .gitignore: Standard Python ignore patterns
- requirements.txt: +4 lines (fastapi, uvicorn, python-multipart, tqdm)
- server/app.py: 752 lines (main web application)
- server/README.md: 60 lines (Chinese documentation)
- server/FEATURES.md: 104 lines (feature details)
```

## 🚀 Usage

```bash
# Install dependencies
pip install -r requirements.txt

# Start server
uvicorn server.app:app --host 0.0.0.0 --port 8000

# Then open http://localhost:8000 in a browser
```

## ✅ Testing

- ✅ Tested on Ubuntu 22.04 + Python 3.10
- ✅ Tested with 20+ page PDFs
- ✅ Verified all download options
- ✅ Tested language switching
- ✅ Tested on Chrome

## 🎯 Benefits

1. **Accessibility**: Use from any device with a browser
2. **Team Collaboration**: Share OCR service across network
3. **User-friendly**: No CLI knowledge required
4. **International**: Supports Chinese and English users
5. **Real-time Feedback**: See progress instead of waiting in the dark

## 🔒 Compatibility

- ✅ No breaking changes
- ✅ Existing CLI remains unchanged
- ✅ Web service is completely optional
- ✅ Minimal new dependencies

## 📚 Documentation

- Added comprehensive README (Chinese + English)
- Added feature documentation
- Added usage examples
- Inline code comments

## 🔮 Future Enhancements

Potential follow-ups (not in this PR):
- Batch file upload queue
- User authentication for public deployment
- Progress persistence across sessions
- OpenAPI/Swagger documentation

---

This PR significantly enhances DeepSeek-OCR's usability while maintaining backward compatibility. Looking forward to your feedback! 🙏

Add a modern, user-friendly web interface for DeepSeek-OCR with the following features:

- Beautiful gradient UI with drag-and-drop file upload
- Real-time progress tracking via WebSocket
- Live log streaming during OCR processing
- Multiple download options:
  - Markdown file (cleaned text)
  - Full annotation file (with detection markers)
  - Visualization PDF (with bounding boxes)
  - Extracted images (ZIP archive)
  - All files (complete ZIP package)
- Responsive design for desktop and mobile
- Async processing to handle concurrent requests
- In-memory ZIP creation for efficient downloads

Technical implementation:

- FastAPI backend with async/await
- WebSocket for real-time updates
- Subprocess monitoring with progress parsing
- Modern CSS with animations and transitions
- Zero external frontend dependencies

This makes DeepSeek-OCR accessible via web browser from any device on the local network, perfect for team collaboration and remote access.

Add comprehensive internationalization support:

- Auto-detect browser language (navigator.language)
- Toggle between English and Chinese
- Persist language preference in localStorage
- Translate all UI elements dynamically
- Support for:
  - Page title and subtitle
  - Upload instructions
  - Form labels and placeholders
  - Button text
  - Progress messages
  - Error messages
  - Download links

Technical implementation:

- Data attributes (data-en, data-zh) for all text elements
- Translation dictionary for dynamic messages
- Language toggle button in top-right corner
- Smooth transitions when switching languages
- Fully accessible for international users

This makes DeepSeek-OCR accessible to both Chinese and international users, improving adoption and usability.

Hawk Wang added 2 commits October 22, 2025 11:11
