Transform your images into depth maps using AI! This tool analyzes photos and creates grayscale depth maps that show the relative distance of objects from the camera.
This Python script takes regular photos and generates depth maps - images where brightness represents distance:
- Bright/White pixels = Objects close to the camera
- Dark/Black pixels = Objects far from the camera
Input Image         Depth Map Output
Mountain (far)    → Dark Gray
Trees (medium)    → Light Gray
Person (close)    → White
First, install the necessary Python packages:
pip install -r requirements.txt
Or manually:
pip install torch torchvision opencv-python matplotlib
Open generate_depth.py and edit these lines:
dataset_folder = r"C:\your\images\folder" # Your input images
output_folder = r"C:\your\output\folder" # Where to save depth maps
Run the script:
python generate_depth.py
The script will:
- Find all images in your input folder (including subfolders)
- Process each image using the MiDaS AI model
- Save depth maps with the same filename and folder structure
- Show progress in the terminal
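The per-image processing step can be sketched roughly as follows. This is a minimal sketch based on the MiDaS usage documented on PyTorch Hub, not the actual contents of generate_depth.py; the function names `transform_name_for` and `estimate_depth` are hypothetical and chosen only for illustration.

```python
def transform_name_for(model_type: str) -> str:
    """Return the name of the MiDaS hub transform that matches a model type."""
    # DPT models share one transform; MiDaS_small has its own.
    if model_type in ("DPT_Large", "DPT_Hybrid"):
        return "dpt_transform"
    if model_type == "MiDaS_small":
        return "small_transform"
    raise ValueError(f"Unsupported model_type: {model_type!r}")


def estimate_depth(image_path: str, model_type: str = "MiDaS_small"):
    """Run MiDaS on one image and return a float depth array of shape (H, W)."""
    # Heavy imports are kept local so the helper above works without them.
    import cv2
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Downloads the model weights on first use (requires network access).
    model = torch.hub.load("intel-isl/MiDaS", model_type)
    model.to(device).eval()
    transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
    transform = getattr(transforms, transform_name_for(model_type))

    bgr = cv2.imread(image_path)
    if bgr is None:
        raise ValueError(f"Could not read image: {image_path}")
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)

    with torch.no_grad():
        prediction = model(transform(rgb).to(device))
        # Resize the prediction back to the input resolution.
        prediction = torch.nn.functional.interpolate(
            prediction.unsqueeze(1),
            size=rgb.shape[:2],
            mode="bicubic",
            align_corners=False,
        ).squeeze()

    return prediction.cpu().numpy()
```

For a whole folder, the model and transform would be loaded once outside the loop rather than once per image as in this sketch.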
Input Folder              Output Folder
├── photo1.jpg        →   ├── photo1.png (depth map)
├── photo2.png        →   ├── photo2.png (depth map)
└── vacation              └── vacation
    ├── beach.jpg     →       ├── beach.png (depth map)
    └── sunset.jpg    →       └── sunset.png (depth map)
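The mapping from input files to output files shown above could be implemented along these lines. This is a sketch using `pathlib`, not the script's actual code; `find_images`, `output_path_for`, and the `IMAGE_EXTENSIONS` set are hypothetical names and values.

```python
from pathlib import Path

# Assumption: the extensions the real script accepts may differ.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_images(input_root: Path) -> list[Path]:
    """Return all image files under input_root, including subfolders."""
    return sorted(
        p for p in input_root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def output_path_for(image_path: Path, input_root: Path, output_root: Path) -> Path:
    """Mirror the image's relative location under output_root, with a .png extension."""
    relative = image_path.relative_to(input_root)
    return (output_root / relative).with_suffix(".png")
```

One consequence of replacing the extension is that two inputs differing only in extension (for example `a.jpg` and `a.png` in the same folder) map to the same output file.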
Key Points:
- Preserves your folder structure
- Converts all formats to .png
- Same filename (original extension replaced)
- Normalized to 0-255 grayscale values
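The 0-255 normalization can be illustrated with a simple min-max rescale. This is a sketch of one common approach and an assumption about how the script does it; `to_grayscale_uint8` is a hypothetical helper name. MiDaS predicts relative inverse depth (larger values for closer objects), so this scaling gives bright pixels for near objects.

```python
import numpy as np


def to_grayscale_uint8(depth: np.ndarray) -> np.ndarray:
    """Linearly rescale a depth array to the 0-255 range as uint8."""
    depth = depth.astype(np.float64)
    d_min, d_max = depth.min(), depth.max()
    if d_max == d_min:
        # Flat input: avoid division by zero and return an all-black map.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - d_min) / (d_max - d_min)
    return np.round(scaled * 255.0).astype(np.uint8)
```

Because the scale is computed per image, gray values are not comparable between different depth maps.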
The script supports several MiDaS model variants. The table below compares the fastest and the most accurate; choose based on your needs:
| Feature | MiDaS_small (Current) | DPT_Large |
|---|---|---|
| Speed | Very fast | Slower (3-5x) |
| Accuracy | Good | Excellent |
| GPU Memory | Low (~2 GB) | High (~8 GB) |
| Best For | Batch processing thousands of images | High-quality, detailed depth maps |
| Model File Size | ~100 MB | ~1.2 GB |
Change line 11 in generate_depth.py:
# For speed:
model_type = "MiDaS_small"
# For accuracy:
model_type = "DPT_Large"
# For balance:
model_type = "DPT_Hybrid"
- 1000 images with MiDaS_small: ~15-20 minutes (with GPU)
- 1000 images with DPT_Large: ~60-90 minutes (with GPU)
Want to make this tool even better? Here are some ideas:
- Add a progress bar (using the tqdm library)
- Create a GUI (using tkinter or gradio)
- Add batch size options for faster processing
- Include a config file instead of hardcoded paths
- Add colored depth maps (heatmap visualization)
- Support video depth map generation
- Add depth map quality comparison tool
- Implement multi-GPU support for faster processing
- Create before/after preview window
- Add command-line arguments (argparse)
- 3D point cloud generation from depth maps
- Depth map refinement using multiple models
- Real-time webcam depth estimation
- Export to 3D formats (.obj, .ply)
- API/web service deployment
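As a starting point for the command-line idea, the hardcoded settings could be exposed with `argparse` roughly like this. The argument names are hypothetical and simply mirror the `dataset_folder`, `output_folder`, and `model_type` variables mentioned earlier; the script does not currently accept them.

```python
import argparse


def parse_args(argv=None):
    """Parse the input folder, output folder, and model type from the command line."""
    parser = argparse.ArgumentParser(description="Generate depth maps with MiDaS.")
    parser.add_argument("dataset_folder", help="Folder containing input images")
    parser.add_argument("output_folder", help="Folder where depth maps are saved")
    parser.add_argument(
        "--model-type",
        default="MiDaS_small",
        choices=["MiDaS_small", "DPT_Hybrid", "DPT_Large"],
        help="MiDaS model variant to use",
    )
    return parser.parse_args(argv)
```

With this in place, a run might look like `python generate_depth.py C:\images C:\depth --model-type DPT_Hybrid`, assuming the script calls `parse_args()` at startup.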
- Fork this repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Solution: Switch to MiDaS_small or process fewer images at once
Solution: Check if the image file is corrupted or in an unsupported format
Solution: Install CUDA-enabled PyTorch: https://pytorch.org/get-started/locally/
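To check whether the installed PyTorch build can see a GPU, a quick diagnostic like the following may help. It is a small sketch that is not part of the project, and `describe_torch_device` is a hypothetical name.

```python
def describe_torch_device() -> str:
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return f"CPU only (torch {torch.__version__})"


print(describe_torch_device())
```

A version string ending in `+cpu` usually indicates a CPU-only build, in which case reinstalling from the link above with a CUDA option is the likely fix.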
Solution: Make sure all dependencies are installed:
pip install --upgrade torch torchvision opencv-python matplotlib
This project uses the MiDaS model, which is released under the MIT license. Please refer to the MiDaS repository for more details.
- Intel ISL for creating the MiDaS depth estimation model
- PyTorch for the deep learning framework
- OpenCV for image processing capabilities
- Check the MiDaS GitHub Issues
- Review PyTorch documentation for GPU setup
- Ensure your paths use raw strings (r"path\to\folder") on Windows
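The raw-string advice matters because Python interprets backslash sequences such as `\n` and `\t` in ordinary string literals. The short example below shows the difference; the path `C:\new\table` is made up for illustration.

```python
from pathlib import PureWindowsPath

plain = "C:\new\table"   # \n and \t are interpreted as newline and tab
raw = r"C:\new\table"    # backslashes are kept literally

# The plain string is shorter because two escape pairs collapsed to one character each.
print(len(plain), len(raw))

# Only the raw string splits into the intended folder names.
print(PureWindowsPath(raw).parts)
```

Note that a raw string cannot end with a single backslash, so write `r"C:\images"` rather than `r"C:\images\"`. Forward slashes (`"C:/images"`) are generally accepted on Windows as well.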
