Stereo Any Video:
Temporally Consistent Stereo Matching

Installation

Installation with CUDA 12.2

Set up the root for all source files


    git clone https://github.com/tomtomtommi/stereoanyvideo
    cd stereoanyvideo
    export PYTHONPATH=`(cd ../ && pwd)`:`pwd`:$PYTHONPATH
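
The export line above puts two directories on the Python search path: the parent of the repository and the repository root itself. As a minimal sketch for confirming that the export took effect in the current shell, the following script can be run from the repository root. The function names `expected_entries` and `missing_entries` are hypothetical helpers written for this illustration, not part of the repository.

```python
import os
import sys


def expected_entries(repo_dir):
    """Return the two directories the export line adds: the parent and the repo root."""
    repo = os.path.abspath(repo_dir)
    return [os.path.dirname(repo), repo]


def missing_entries(repo_dir, search_path=None):
    """List the expected directories that are absent from the given search path."""
    if search_path is None:
        search_path = sys.path
    # An empty entry means "current directory", so resolve it like any other path.
    present = {os.path.abspath(p or ".") for p in search_path}
    return [d for d in expected_entries(repo_dir) if d not in present]


if __name__ == "__main__":
    # Assumes the script is started from the repository root.
    missing = missing_entries(os.getcwd())
    print("PYTHONPATH looks fine" if not missing else f"Not on sys.path: {missing}")
```

If a directory is reported as missing, re-run the export line in the same shell session before launching the demo or training scripts.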

Create a conda env


    conda create -n sav python=3.10
    conda activate sav

Install requirements


    conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia
    pip install pip==24.0
    pip install pytorch_lightning==1.6.0
    pip install iopath
    conda install -c bottler nvidiacub
    pip install scikit-image matplotlib imageio plotly opencv-python
    conda install -c fvcore -c conda-forge fvcore
    pip install black usort flake8 flake8-bugbear flake8-comprehensions
    conda install pytorch3d -c pytorch3d
    pip install -r requirements.txt
    pip install timm
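
After the install commands finish, it can help to check that every package is visible to the interpreter in the `sav` environment. The sketch below uses only the standard library and looks up each package by its import name (for example, `scikit-image` is imported as `skimage` and `opencv-python` as `cv2`). The `REQUIRED` list and the helper `find_missing` are assumptions made for this illustration; packages pulled in by requirements.txt are not covered.

```python
import importlib.util

# Import names for the packages installed by the commands above.
REQUIRED = [
    "torch", "torchvision", "torchaudio", "pytorch_lightning", "iopath",
    "skimage", "matplotlib", "imageio", "plotly", "cv2", "fvcore",
    "pytorch3d", "timm",
]


def find_missing(names):
    """Return the names that cannot be located by the import system."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup errors the same as an absent package.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    print("All packages found" if not missing else f"Missing: {missing}")
```

This only checks that the packages can be located; it does not import them, so it will not detect a CUDA or version mismatch.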

Download VDA checkpoints


    cd models/Video-Depth-Anything
    sh get_weights.sh

Run inference on a stereo video

    sh demo.sh

Before running, download the checkpoints from Google Drive and copy them to ./checkpoints/.

By default, the left and right camera videos are expected to be structured as follows:

    ./demo_video/
    ├── left
    │   ├── left000000.png
    │   ├── left000001.png
    │   ├── left000002.png
    │   ...
    └── right
        ├── right000000.png
        ├── right000001.png
        ├── right000002.png
        ...
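
Before pointing the demo at your own recordings, it may be worth checking that the folder follows this layout and that every left frame has a right frame with the same number. The sketch below is a standalone check under that assumption; `frame_index` and `pair_frames` are hypothetical helpers written for this illustration, and how demo.py itself reads and pairs the frames is not shown here.

```python
import re
from pathlib import Path


def frame_index(path):
    """Extract the trailing frame number from names like left000012.png."""
    match = re.search(r"(\d+)$", path.stem)
    if match is None:
        raise ValueError(f"No frame number in {path.name}")
    return int(match.group(1))


def pair_frames(root):
    """Return (left, right) path pairs sorted by frame number.

    Raises ValueError if any frame number exists on one side only.
    """
    root = Path(root)
    left = {frame_index(p): p for p in (root / "left").glob("*.png")}
    right = {frame_index(p): p for p in (root / "right").glob("*.png")}
    if left.keys() != right.keys():
        unmatched = sorted(left.keys() ^ right.keys())
        raise ValueError(f"Frames without a partner: {unmatched}")
    return [(left[i], right[i]) for i in sorted(left)]


if __name__ == "__main__":
    pairs = pair_frames("./demo_video")
    print(f"Found {len(pairs)} stereo frame pairs")
```

Pairing by the numeric suffix rather than by sorted file order means a dropped frame on one side is reported instead of silently shifting every later pair.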

A simple way to run the demo is to use SouthKensingtonSV.

To test on your own data, change the --path ./demo_video/ argument to point to your data. Further arguments are defined in demo.py and can be modified there.

Dataset

Download the following datasets and put them in ./data/datasets/:

Evaluation

    sh evaluate_stereoanyvideo.sh

Training

    sh train_stereoanyvideo.sh

Citation

If you use our method in your research, please consider citing:

@inproceedings{jing2025stereo,
  title={Stereo any video: Temporally consistent stereo matching},
  author={Jing, Junpeng and Luo, Weixun and Mao, Ye and Mikolajczyk, Krystian},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={20836--20846},
  year={2025}
}
