feat: image upload in chat with OCR text extraction #588

Merged
ShaerWare merged 1 commit into main from
feat/chat-image-upload-ocr
Mar 16, 2026

@ShaerWare
Owner

Summary

  • Image upload: JPEG/PNG/WebP/GIF up to 10MB, stored in data/chat_images/{session_id}/
  • OCR: pytesseract extracts text (Russian + English), injected into the LLM message as a [Текст с изображения] ("Text from image") block
  • Lifecycle: images deleted when session is deleted (via ChatService.delete_session hooks)
  • Frontend: upload button, clipboard paste, thumbnail preview, fullscreen viewer, OCR badge
  • DB: extra_data TEXT column on chat_messages (JSON metadata for images)
  • Migration: scripts/migrate_add_extra_data_to_chat_messages.py
  • Dependency: pytesseract>=0.3.10 (optional — requires tesseract-ocr system package)
  • i18n: uploadImage + removeImage keys in ru/en/kk
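
The OCR step in the summary above could look roughly like the sketch below. It is an illustration, not the code from this PR: the function names `extract_text` and `build_llm_message` and the exact layout of the injected block are assumptions. Only the `[Текст с изображения]` label, the Russian + English languages, and the optional nature of pytesseract come from the description.

```python
from pathlib import Path

# Block label named in the PR summary; the surrounding layout is an assumption.
OCR_BLOCK_HEADER = "[Текст с изображения]"


def extract_text(image_path: Path, languages: str = "rus+eng") -> str:
    """Return OCR text for an image, or "" when pytesseract is not installed."""
    try:
        import pytesseract  # optional dependency, needs the tesseract-ocr package
        from PIL import Image
    except ImportError:
        return ""
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=languages).strip()


def build_llm_message(user_text: str, ocr_texts: list[str]) -> str:
    """Append one labelled block per non-empty OCR result to the user's text."""
    parts = [user_text.strip()] if user_text.strip() else []
    for text in ocr_texts:
        if text.strip():
            parts.append(f"{OCR_BLOCK_HEADER}\n{text.strip()}")
    return "\n\n".join(parts)
```

Keeping the message builder separate from the OCR call means an image-only message (no user text) and a missing pytesseract install both fall out of the same code path.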

NEWS

🖼️ Photo upload in chat with text recognition

You can now upload images to the chat: photos of documents, screenshots,
and scans. The system automatically recognizes the text (OCR) and sends it
to the assistant together with your question. Images can be pasted with
Ctrl+V or added via the upload button. Photos are kept exactly as long as
the chat exists: deleting a chat also deletes all of its attachments.

Test plan

  • Upload JPEG via button → thumbnail preview appears, send → image in message bubble
  • Paste image from clipboard (Ctrl+V) → same flow
  • Upload image with text → OCR badge appears, LLM receives extracted text
  • Delete chat session → data/chat_images/{session_id}/ directory removed
  • Bulk delete sessions → images cleaned up for all
  • Click image thumbnail in message → fullscreen overlay
  • Upload >10MB file → error message
  • Upload non-image file → rejected
  • Send with only image (no text) → works
  • Migration script: python scripts/migrate_add_extra_data_to_chat_messages.py
  • Build: cd admin && npm run build
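
The two rejection cases in the test plan (oversized file, non-image file) suggest a server-side check along the following lines. This is a minimal sketch under stated assumptions: `validate_upload` and `UploadRejected` are hypothetical names, and "10MB" is read here as 10 MiB, which the PR does not specify.

```python
# Content types listed in the PR summary: JPEG, PNG, WebP, GIF.
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp", "image/gif"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


class UploadRejected(ValueError):
    """Raised when an upload fails validation (hypothetical error type)."""


def validate_upload(content_type: str, size_bytes: int) -> None:
    """Raise UploadRejected unless the upload is an allowed image within the limit."""
    if content_type.lower() not in ALLOWED_TYPES:
        raise UploadRejected(f"unsupported content type: {content_type}")
    if size_bytes > MAX_BYTES:
        raise UploadRejected(f"file too large: {size_bytes} bytes")
```

A real endpoint would probably also inspect the file contents, since the declared content type is supplied by the client.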

🤖 Generated with Claude Code

Upload images (JPEG, PNG, WebP, GIF, max 10MB) directly in the admin chat.
Images are stored in data/chat_images/{session_id}/, with thumbnails.
OCR via pytesseract (an optional dependency) extracts text and injects it
into the message sent to the LLM as a "[Текст с изображения]" block.

The image lifecycle is tied to chat sessions: when a session is deleted,
its images are cleaned up from disk. Supports pasting images from the
clipboard (Ctrl+V), drag-and-drop via file input, and multiple uploads.

Backend:
- ChatMessage.extra_data column (JSON) stores image metadata
- modules/chat/image_service.py: upload, OCR, thumbnail, cleanup
- Upload endpoint: POST /admin/chat/sessions/{id}/upload-image
- Serve endpoint: GET /admin/chat/images/{session_id}/{filename}
- Image IDs passed in SendMessageRequest.image_ids
- Cleanup hooks in ChatService.delete_session/delete_sessions_bulk
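
The cleanup hooks listed above might reduce to something like this sketch. The helper names and the path-escape guard are assumptions made for illustration; only the data/chat_images/{session_id}/ layout and the single/bulk delete split come from the commit message.

```python
import shutil
from pathlib import Path

IMAGES_ROOT = Path("data/chat_images")  # storage root named in the PR


def delete_session_images(session_id: str, root: Path = IMAGES_ROOT) -> bool:
    """Remove one session's image directory; return True if something was deleted."""
    base = root.resolve()
    session_dir = (root / session_id).resolve()
    # Refuse IDs such as "" or "../x" that would resolve outside the root.
    if base not in session_dir.parents:
        raise ValueError(f"invalid session id: {session_id!r}")
    if not session_dir.is_dir():
        return False
    shutil.rmtree(session_dir)
    return True


def delete_sessions_images_bulk(session_ids: list[str], root: Path = IMAGES_ROOT) -> int:
    """Remove image directories for several sessions; return how many existed."""
    return sum(delete_session_images(sid, root) for sid in session_ids)
```

Returning a flag instead of raising for a missing directory keeps the hook safe to call for sessions that never had images.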

Frontend:
- ChatImage interface, uploadImage API method
- Image upload button (ImagePlus icon) next to mic in chat input
- Pending image thumbnails preview above textarea
- Clipboard paste intercepts image/* types
- Image attachments rendered in message bubbles with click-to-fullscreen
- OCR badge indicator on images with extracted text
- i18n: uploadImage + removeImage in ru/en/kk

Migration: scripts/migrate_add_extra_data_to_chat_messages.py
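
The migration script itself is not shown in this PR page. As a rough idea of what adding the column involves, here is an idempotent sketch that assumes a SQLite database; the database engine and the function name `add_extra_data_column` are assumptions, while the chat_messages table and the extra_data TEXT column come from the summary.

```python
import sqlite3


def add_extra_data_column(conn: sqlite3.Connection) -> bool:
    """Add chat_messages.extra_data if missing; return True when the column was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(chat_messages)")}
    if "extra_data" in columns:
        return False
    conn.execute("ALTER TABLE chat_messages ADD COLUMN extra_data TEXT")
    conn.commit()
    return True
```

Checking for the column first makes the script safe to run more than once, which matters if it is executed manually as the test plan describes.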

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ShaerWare ShaerWare merged commit 0329342 into main Mar 16, 2026
3 checks passed
@ShaerWare ShaerWare deleted the feat/chat-image-upload-ocr branch March 16, 2026 15:49