Air Canvas is a computer vision application that turns your webcam into a touch-free digital drawing tablet. It uses MediaPipe for hand tracking and Depth-Anything V2 for depth estimation, so it can tell whether your hand is hovering over the canvas or actually touching it, just like a real pen on paper. Whether you're sketching, prototyping, or just having fun, Air Canvas lets you make digital art without touching any device.
| Feature | Description |
|---|---|
| AI Hand Tracking | Responsive finger detection using MediaPipe, with Kalman filtering for smooth cursor movement |
| Depth Perception | Depth-Anything V2 integration enables hover vs. draw distinction: lift to pause, lower to draw! |
| Free Draw | Smooth strokes with Catmull-Rom spline interpolation for natural, flowing lines |
| Shape Tools | Instant geometric primitives: Lines, Rectangles, and Circles |
| Eraser | Multiple eraser sizes for precise corrections |
| Dynamic Palette | Real-time color and brush size selection via virtual UI |
| OCR Integration | Convert handwritten air-drawings to digital text using Tesseract |
| Screenshot Export | Save your artwork instantly with a single keystroke |
| Live Config Reload | Modify `config.yaml` and apply changes without restarting |

flowchart TD
subgraph Input["Input Layer"]
A[Webcam Feed]
end
subgraph Processing["AI Processing"]
B[MediaPipe Hands]
C[Depth-Anything V2]
D[Kalman Filter]
end
subgraph Logic["Application Logic"]
E[Gesture Detection]
F[Drawing Engine]
G[UI Manager]
end
subgraph Output["Output Layer"]
H[Canvas Window]
I[Depth Visualization]
J[Webcam Preview]
end
A --> B --> D --> E
A --> C --> E
E --> F --> H
E --> G --> J
C --> I
style Input fill:#4f46e5,color:#fff
style Processing fill:#7c3aed,color:#fff
style Logic fill:#a855f7,color:#fff
style Output fill:#c084fc,color:#fff

> [!IMPORTANT]
> Make sure you have the following installed before proceeding:

- Python 3.8+ with pip
- CUDA-capable GPU (recommended for depth estimation)
- Webcam connected to your system

    git clone https://github.com/Rcidshacker/Air-canvas-new.git
    cd Air-canvas-new

Windows:

    python -m venv .depthv2
    .\.depthv2\Scripts\activate

Linux / macOS:

    python3 -m venv .depthv2
    source .depthv2/bin/activate

Then install the Python dependencies:

    pip install -r requirements.txt

Tesseract installation instructions:

| Platform | Installation Command |
|---|---|
| Windows | Download from UB-Mannheim/tesseract |
| Linux | sudo apt-get install tesseract-ocr |
| macOS | brew install tesseract |

> [!NOTE]
> Windows users: ensure the path in `depthtrack.py` (line 18) matches your Tesseract installation path. Default: `C:\Program Files\Tesseract-OCR\tesseract.exe`

> [!CAUTION]
> This step is mandatory! The depth model weights are too large to include in the repository.

- Visit the Depth-Anything V2 Repository
- Download `depth_anything_v2_vits.pth`
- Create a `model_weights/` folder in the project root
- Place the `.pth` file inside:

      Air-canvas-new/
      └── model_weights/
          └── depth_anything_v2_vits.pth   ← Place here!

    python depthtrack.py

| Gesture | Action |
|---|---|
| Point | Move cursor (hover mode) |
| Pinch | Draw / Select (thumb + index finger close) |
| Release | Stop drawing (fingers apart) |
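
The pinch test in the table above can be sketched as a distance check between two fingertips. This is a minimal illustration, not the project's actual code: it assumes landmarks arrive as 21 normalized `(x, y)` pairs (MediaPipe Hands uses index 4 for the thumb tip and 8 for the index fingertip), and the helper names and the 60-pixel default are hypothetical.

```python
import math

THUMB_TIP = 4         # MediaPipe Hands landmark index for the thumb tip
INDEX_FINGER_TIP = 8  # MediaPipe Hands landmark index for the index fingertip


def pinch_distance(landmarks, frame_width, frame_height):
    """Pixel distance between the thumb tip and the index fingertip.

    `landmarks` is a sequence of 21 (x, y) pairs normalized to [0, 1].
    """
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_FINGER_TIP]
    return math.hypot((tx - ix) * frame_width, (ty - iy) * frame_height)


def classify_gesture(landmarks, frame_width, frame_height, pinch_px=60.0):
    """Return "pinch" when the fingertips are closer than pinch_px, else "point".

    pinch_px is an illustrative value; the real threshold and its units
    are whatever the application reads from its configuration.
    """
    distance = pinch_distance(landmarks, frame_width, frame_height)
    return "pinch" if distance < pinch_px else "point"
```

A production detector would likely add hysteresis (separate thresholds for starting and ending a pinch) so the state does not flicker near the boundary.
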
| Key | Function |
|---|---|
| R | Reset: clear the entire canvas |
| S | Screenshot: save the current canvas as PNG |
| O | OCR: read text from the canvas |
| U | Update: reload config without restarting |
| B / V | Brightness: increase / decrease |
| C / X | Contrast: increase / decrease |
| ESC | Quit: exit the application |
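
One way to wire up these shortcuts is a lookup table from key codes to action names. The sketch below assumes key codes come from OpenCV's `cv2.waitKey`, which returns -1 when no key is pressed; the action names and `action_for_key` are hypothetical and only mirror the table above.

```python
ESC_KEY = 27  # ASCII code for Escape

# Hypothetical action names, one per shortcut in the table above.
KEY_ACTIONS = {
    ord("r"): "reset",
    ord("s"): "screenshot",
    ord("o"): "ocr",
    ord("u"): "reload_config",
    ord("b"): "brightness_up",
    ord("v"): "brightness_down",
    ord("c"): "contrast_up",
    ord("x"): "contrast_down",
    ESC_KEY: "quit",
}


def action_for_key(key_code):
    """Map a raw key code to an action name, or None if the key is unbound."""
    if key_code < 0:  # no key was pressed during the wait
        return None
    code = key_code & 0xFF  # keep only the low byte, a common OpenCV idiom
    if ord("A") <= code <= ord("Z"):
        code += 32  # fold upper-case letters to lower-case
    return KEY_ACTIONS.get(code)
```

In a main loop this would be called as `action_for_key(cv2.waitKey(1))` once per frame.
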
Customize Air Canvas by editing `config.yaml`:

    # Display Settings
    window_width: 1280
    window_height: 800

    # AI Settings
    use_depth: true           # Toggle depth estimation
    depth_threshold: 0.6      # Sensitivity for "touch" detection
    device: 'cuda'            # Use 'cpu' if no GPU available

    # Drawing Tools
    brush_sizes: [5, 10, 15]
    eraser_sizes: [20, 40]
    colors:
      - [255, 255, 255]       # White
      - [0, 0, 0]             # Black
      - [255, 0, 0]           # Red
      - [0, 255, 0]           # Green
      - [0, 0, 255]           # Blue
      - [255, 255, 0]         # Yellow

    # Gesture Sensitivity
    multi_pinch_threshold: 60
    multi_separation_threshold: 50

    # Smoothing (Kalman Filter)
    kalman_process_noise: 1e-3
    kalman_measurement_noise: 0.1

> [!TIP]
> Press U while running to reload config changes without restarting the app!

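
To give a feel for what `kalman_process_noise` and `kalman_measurement_noise` control, here is a minimal one-dimensional Kalman filter with a random-walk model, applied independently to the cursor's x and y. It is a sketch under assumptions: the application may well use a richer model (for example constant velocity), and the class names `ScalarKalman` and `CursorSmoother` are hypothetical.

```python
class ScalarKalman:
    """1-D Kalman filter with a random-walk state model (illustrative only)."""

    def __init__(self, process_noise=1e-3, measurement_noise=0.1):
        self.q = process_noise      # how much the true position may drift per step
        self.r = measurement_noise  # how noisy each tracked position is
        self.x = None               # current state estimate
        self.p = 1.0                # variance of the estimate

    def update(self, z):
        """Fold measurement z into the estimate and return the smoothed value."""
        if self.x is None:          # first sample: adopt it as the estimate
            self.x = float(z)
            return self.x
        self.p += self.q                    # predict: uncertainty grows
        k = self.p / (self.p + self.r)      # Kalman gain in [0, 1]
        self.x += k * (z - self.x)          # correct toward the measurement
        self.p *= 1.0 - k                   # uncertainty shrinks after update
        return self.x


class CursorSmoother:
    """Smooths (x, y) cursor positions with one scalar filter per axis."""

    def __init__(self, process_noise=1e-3, measurement_noise=0.1):
        self.fx = ScalarKalman(process_noise, measurement_noise)
        self.fy = ScalarKalman(process_noise, measurement_noise)

    def update(self, x, y):
        return self.fx.update(x), self.fy.update(y)
```

In this model, raising the measurement noise relative to the process noise lowers the gain, which gives a smoother but laggier cursor; lowering it makes the cursor follow the raw tracking more closely.
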

    Air-canvas-new
    ├── model_weights/                    # Place depth model here
    │   └── depth_anything_v2_vits.pth
    ├── depthtrack.py                     # Main application entry point
    ├── config.yaml                       # User configuration
    ├── requirements.txt                  # Python dependencies
    ├── cursor.png                        # UI cursor asset
    ├── draw.png                          # Draw tool icon
    ├── eraser.png                        # Eraser tool icon
    ├── line.png                          # Line tool icon
    ├── rectangle.png                     # Rectangle tool icon
    ├── circle.png                        # Circle tool icon
    └── README.md                         # This file!

sequenceDiagram
participant User as User
participant Cam as Webcam
participant MP as MediaPipe
participant DA as Depth-Anything
participant App as Air Canvas
User->>Cam: Wave hand in air
Cam->>MP: Video frame
MP->>App: Hand landmarks (21 points)
Cam->>DA: Same frame
DA->>App: Depth map
App->>App: Calculate pinch gesture
App->>App: Check depth threshold
alt Pinching + Close to camera
App->>App: Draw on canvas
else Not pinching OR far from camera
App->>App: Move cursor only
end
App->>User: Display canvas + UI
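
The branch at the end of the sequence above (draw only when pinching and close to the camera) can be written as a small pure function. This sketch assumes the depth map has been normalized to [0, 1] with larger values meaning closer to the camera, and that disabling depth makes a pinch alone sufficient; both are assumptions, and `sample_depth` and `decide_action` are hypothetical names.

```python
def sample_depth(depth_map, x, y):
    """Read the depth value at pixel (x, y), clamping to the map's bounds.

    `depth_map` is any 2-D row-major structure (list of lists or array).
    """
    height = len(depth_map)
    width = len(depth_map[0])
    col = min(max(int(x), 0), width - 1)
    row = min(max(int(y), 0), height - 1)
    return depth_map[row][col]


def decide_action(pinching, fingertip_depth, depth_threshold=0.6, use_depth=True):
    """Return "draw" or "hover" for the current frame.

    Assumes fingertip_depth is normalized so that larger means closer.
    """
    if not pinching:
        return "hover"          # no pinch: only move the cursor
    if not use_depth:
        return "draw"           # depth disabled: a pinch alone draws
    return "draw" if fingertip_depth >= depth_threshold else "hover"
```

For example, `decide_action(True, sample_depth(depth_map, cx, cy))` would be evaluated once per frame at the cursor position `(cx, cy)`.
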
| Algorithm | Purpose | Benefit |
|---|---|---|
| Catmull-Rom Spline | Smooth curve interpolation | Natural, flowing brush strokes |
| Kalman Filter | Position prediction & smoothing | Reduces hand-tracking jitter |
| Multi-Distance Pinch | Precise gesture detection | Prevents accidental draws |
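
As an illustration of the first row, the sketch below interpolates a stroke with a uniform Catmull-Rom spline, which passes through every recorded point. It is a generic textbook formulation, not necessarily the variant the project uses; `catmull_rom_point`, `smooth_stroke`, and `samples_per_segment` are hypothetical names.

```python
def catmull_rom_point(p0, p1, p2, p3, t):
    """Evaluate a uniform Catmull-Rom segment between p1 and p2 at t in [0, 1]."""
    t2 = t * t
    t3 = t2 * t
    return tuple(
        0.5 * (
            2 * b
            + (c - a) * t
            + (2 * a - 5 * b + 4 * c - d) * t2
            + (-a + 3 * b - 3 * c + d) * t3
        )
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def smooth_stroke(points, samples_per_segment=8):
    """Return a denser polyline that passes through every input point."""
    if len(points) < 2:
        return list(points)
    # Duplicate the endpoints so the first and last segments have neighbors.
    padded = [points[0]] + list(points) + [points[-1]]
    out = []
    for i in range(len(padded) - 3):
        p0, p1, p2, p3 = padded[i:i + 4]
        for s in range(samples_per_segment):
            out.append(catmull_rom_point(p0, p1, p2, p3, s / samples_per_segment))
    out.append(tuple(float(v) for v in points[-1]))  # close the final segment
    return out
```

Drawing short line segments between consecutive output points gives a curve that looks smooth even when the tracked fingertip positions are sparse.
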
Contributions make the open-source community an amazing place to learn, inspire, and create.
Any contributions you make are greatly appreciated!
- Fork the project
- Create your feature branch: `git checkout -b feature/AmazingFeature`
- Commit your changes: `git commit -m 'Add some AmazingFeature'`
- Push to the branch: `git push origin feature/AmazingFeature`
- Open a Pull Request

Distributed under the MIT License. See LICENSE for more information.
