A mini screenshot annotation and desktop annotation tool based on PyQt5
# Currently only recommended for Python 3.8 and Python 3.9
## Source installation
#cd src/
#python setup.py install
# pip installation
pip install ScreenPinKit -i https://pypi.org/simple/
ScreenPinKit
Warning: This application uses the third-party library system_hotkey to register global hotkeys. Because that package has not been maintained for over 3 years, installing and running the application on Python 3.8 is recommended.
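Given the version constraints noted here, a startup check along the following lines could warn users early. This is only a sketch: `RECOMMENDED_VERSIONS` and `is_recommended_python` are hypothetical names, not part of ScreenPinKit.

```python
import sys

# Hypothetical constant: the interpreter versions this README recommends.
RECOMMENDED_VERSIONS = {(3, 8), (3, 9)}


def is_recommended_python(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one of the recommended versions."""
    major, minor = version_info[0], version_info[1]
    return (major, minor) in RECOMMENDED_VERSIONS


if __name__ == "__main__":
    if not is_recommended_python():
        print("Warning: global hotkeys may not work on this Python version.")
```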
conda create -n pyqt5_env python=3.9
conda activate pyqt5_env
git clone https://github.com/YaoXuanZhi/ScreenPinKit ScreenPinKit
cd ScreenPinKit
pip install -r requirements.txt
git submodule update --init
cd src
python main.py
After installing the ScreenPinKit package via pip and downloading this repository's code, you can run any example program in the src directory, such as:
cd src
python main.py
python ./canvas_editor/demos/canvas_editor_demo_full.py
python ./canvas_item/demos/canvas_arrow_demo.py
# Windows Defender might report it as a virus - just ignore it to complete packaging
cd src
# Explicitly package the OCR environment, requires explicitly importing related dependency modules in ocr_loader_manager.py
pyinstaller --icon=../images/logo.png --add-data "internal_deps:internal_deps" --windowed main.py -n ScreenPinKit
# Implicitly include built-in OCR environment
# pyinstaller --onefile --hidden-import=cv2 --hidden-import=onnxruntime --hidden-import=pyclipper --hidden-import=shapely --icon=../images/logo.png --add-data "internal_deps:internal_deps" --windowed main.py -n ScreenPinKit
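Before packaging, it may help to confirm that the OCR dependencies named in the `--hidden-import` flags can be found in the current environment. The sketch below uses only the standard library; `OCR_MODULES` and `missing_modules` are hypothetical names for illustration and are not taken from ocr_loader_manager.py.

```python
import importlib.util

# Module names taken from the --hidden-import flags in the packaging command.
OCR_MODULES = ("cv2", "onnxruntime", "pyclipper", "shapely")


def missing_modules(names=OCR_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Install these before packaging:", ", ".join(missing))
    else:
        print("All OCR modules were found.")
```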
# Use the ruff package for syntax checking and automatic code formatting
pip install ruff
# Run as a linter
ruff check
# Run as a formatter
ruff format
| Scope | Hotkey | Function |
|---|---|---|
| Global | F7 | Take a screenshot |
| Global | Shift+F7 | Repeat the last screenshot |
| Global | F4 | Open screen annotation |
| Global | F2 | Display the clipboard image at the mouse position |
| Global | Esc | Step out of the current window's editing state |
| Screenshot Window | Ctrl+T | Convert the screenshot selection to a screen pin |
| Screenshot Window | Shift | Toggle the color format on the magnifier (rgb/hex) |
| Screenshot Window | C | Copy the picked color in the current format |
| Pin Window | Ctrl+A | OCR recognition |
| Pin Window | Alt+F | Toggle mouse click-through state |
| Pin Window | Ctrl+C | Copy the current pin to the clipboard |
| Pin Window | Ctrl+S | Save the current pin to disk |
| Pin Window | Ctrl+W | Complete drawing |
| Pin Window | Ctrl+Z | Undo |
| Pin Window | Ctrl+Y | Redo |
| Pin Window | 3x Space | Clear drawing |
| Screen Annotation Window | Alt+L | Hide/show screen annotation content |
| Screen Annotation Window | Ctrl+W | Complete drawing |
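The rgb/hex toggle on the magnifier comes down to a simple conversion between two text forms of the same color. The sketch below illustrates that conversion; the function names and the exact output strings are assumptions and do not reflect how ScreenPinKit formats colors internally.

```python
def rgb_to_hex(r, g, b):
    """Convert 0-255 channel values to a lowercase '#rrggbb' string."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel out of range: %r" % (channel,))
    return "#{:02x}{:02x}{:02x}".format(r, g, b)


def hex_to_rgb(text):
    """Convert '#rrggbb' (the leading '#' is optional) to an (r, g, b) tuple."""
    digits = text.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits: %r" % (text,))
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def format_color(rgb, as_hex):
    """Return the color text in whichever format is currently toggled on."""
    if as_hex:
        return rgb_to_hex(*rgb)
    return "rgb({}, {}, {})".format(*rgb)
```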
Please see YouTube - ScreenPinKit
- Snipaste: Snipaste is a simple but powerful snipping tool that also lets you pin screenshots back onto the screen
- excalidraw: Virtual whiteboard for sketching hand-drawn like diagrams
- PyQt-Fluent-Widgets: A fluent design widgets library based on C++ Qt/PyQt/PySide. Make Qt Great Again.
- ShareX: Screen capture, file sharing and productivity tool
- ppInk: An easy to use on-screen annotation software inspired by Epic Pen.
- pyqtgraph: Fast data visualization and GUI tools for scientific / engineering applications
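- Jamscreenshot: Source code for a Python screenshot tool similar to the WeChat/QQ screenshot feature, extracted from the author's own toolset Jamtools
- EasyCanvas: A simple drawing program based on Qt QGraphicsView
- PixPin: A powerful, easy-to-use screenshot/pinning tool that helps you work more efficiently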
TodoList
Testing shows that system_hotkey throws exceptions under Python 3.10, and even on Python 3.8 its exceptions can't be properly caught. Since it hasn't been maintained for nearly 3 years, comprehensive compatibility handling is needed.
- ✔ Add plugin system
- ✔ Add plugin marketplace UI
Currently using QWebEngineView to implement the OCR text layer, but this solution has high resource usage. Also, the text selection effect isn't ideal and needs further iteration.
- ☐ Reference PDF4QT (the PDFSelectTextTool class) to implement a lighter-weight OCR text layer than the current QWebEngineView one.
Essentially, this requires rewriting PDFTextLayout and its supporting classes, which is non-trivial work: PDFCharacterPointer.py, PDFTextBlock.py, PDFTextLayout.py, PDFTextLine.py, PDFTextSelection.py, PDFTextSelectionColoredItem.py, TextCharacter.py
- ✔ Build text labels based on recognized paragraphs. Current paragraph selection effect is poor.
Similar to Japanese manga translation effects: erase text on images and fill back with translated text. Consider providing this as a plugin.
- https://ocr.wdku.net/index_pictranslation
- https://www.basiccat.org/zh/imagetrans/
- https://www.basiccat.org/zh/tagged/#imagetrans
- https://www.appinn.com/cotrans-manga-image-translator-regular-edition/#google_vignette
- https://github.com/KUR-creative/SickZil-Machine
- https://www.bilibili.com/read/cv7181027/
- https://github.com/zyddnys/manga-image-translator
- https://github.com/jtl1207/comic-translation
Add color preset functionality for tools like arrows and rectangles. Consider pressing Alt to directly pop up a floating wheel menu for quick selection of presets or custom colors.
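One way such a wheel menu could map the cursor position to a preset is to divide the circle into equal sectors and pick the sector containing the cursor's angle. The sketch below is a hypothetical illustration (`pick_preset` and the preset list are assumptions); sector 0 starts at the positive x-axis, and because screen y grows downward, the sectors run clockwise on screen.

```python
import math


def pick_preset(dx, dy, presets):
    """Return the preset whose sector contains the offset (dx, dy) from the wheel center."""
    if not presets:
        raise ValueError("at least one preset is required")
    # atan2 returns an angle in (-pi, pi]; shift it into [0, 2*pi).
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector_size = 2 * math.pi / len(presets)
    # The final modulo guards against float rounding at the 2*pi boundary.
    index = int(angle // sector_size) % len(presets)
    return presets[index]
```

For example, with four presets each sector spans 90 degrees, so an offset slightly below the positive x-axis on screen (positive `dy`) falls in the first sector.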
The initial implementation hardcoded many things because the requirements weren't clear during development. Now that the features are stable, this functionality can be reorganized, potentially split into DrawToolProvider, DrawToolScheduler, and DrawToolFactory modules, or even extracted as a plugin for a more flexible and extensible drawing tool implementation.
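As a rough illustration of the factory part of that split, a registry-based DrawToolFactory might look like the sketch below. Only the class name comes from the text above; its methods, the `ArrowTool` class, and the option names are hypothetical and are not the project's actual design.

```python
class DrawToolFactory:
    """Creates drawing tools by name, so tool types are no longer hardcoded."""

    def __init__(self):
        self._creators = {}

    def register(self, name, creator):
        """Associate a tool name with a callable that builds the tool."""
        self._creators[name] = creator

    def create(self, name, **options):
        """Build the tool registered under `name`, passing options through."""
        try:
            creator = self._creators[name]
        except KeyError:
            raise ValueError("unknown draw tool: %r" % (name,)) from None
        return creator(**options)


class ArrowTool:
    """Hypothetical tool used only to demonstrate the factory."""

    def __init__(self, color="#ff0000", width=2):
        self.color = color
        self.width = width

    def describe(self):
        return "arrow color={} width={}".format(self.color, self.width)


factory = DrawToolFactory()
factory.register("arrow", ArrowTool)
```

A plugin could then call `factory.register` with its own tool class, which is the kind of extensibility the reorganization aims for.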
Testing suggests another approach: embed a web drawing app such as Excalidraw or TlDraw in PinEditorWindow through a WebEngineView control, then modify the drawing layer background, disable the view's zoom/scroll mechanisms, and add a demo mode for a near-native experience, similar to how many web apps display echarts. The performance cost would be higher, but it is feasible on modern machines.
https://tldraw.dev/examples/use-cases/image-annotator
Further, the drawing layer module could be repackaged as NativeDrawTool, TlDrawEmbedTool, ExcalidrawEmbedTool to save development effort.
TlDrawEmbedTool is currently the first recommendation because it supports inserting and previewing media/GIF files, which makes it more useful.
Considering OCR also uses WebEngineView for text selection layer, combining both approaches might be better and more convenient.
Allow users to customize quick workflows through node-based drag-and-drop, like certain automation tasks. Reference projects:
Since Qt is cross-platform, it should theoretically support Linux Desktop, but requires adaptations like hotkey registration adjustments.
# Ubuntu doesn't install openssh-server by default, which prevents using VS Code Remote-SSH
# sudo apt-get install openssh-server
# Install Qt dependency libraries
sudo apt install 'libxcb-*'
# Ubuntu needs xpyb - Python version of XCB
pip install xpybutil
# Package application
sudo apt install binutils
pip install pyinstaller
- Fix the black screen seen by some screenshot and remote-control software on Linux
Some Linux distros default to the Wayland display protocol, which results in black screenshots. Add `WaylandEnable=false` under the `[daemon]` section of `/etc/gdm3/custom.conf`, then reboot.
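To tell whether the current session is affected, an application could inspect the standard `XDG_SESSION_TYPE` and `WAYLAND_DISPLAY` environment variables. The sketch below is a hypothetical helper (`is_wayland_session` is not part of ScreenPinKit) and assumes the desktop sets these variables, as most do.

```python
import os


def is_wayland_session(environ=None):
    """Return True if the environment looks like a Wayland session."""
    env = os.environ if environ is None else environ
    if env.get("XDG_SESSION_TYPE", "").lower() == "wayland":
        return True
    # WAYLAND_DISPLAY is normally set only when a Wayland compositor is running.
    return bool(env.get("WAYLAND_DISPLAY"))


if __name__ == "__main__":
    if is_wayland_session():
        print("Wayland session detected: screenshots may come out black.")
```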