HART

The official implementation of "Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer"

Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer
Ziyang Chen, Wenting Li, Yongjun Zhang✱, Bingshu Wang, Yabo Wu, Yong Zhao, C. L. Philip Chen
arXiv Report
Contact us: [email protected]; [email protected]✱

@article{chen2025hart,
  title={Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer},
  author={Chen, Ziyang and Zhang, Yongjun and Li, Wenting and Wang, Bingshu and Wu, Yabo and Zhao, Yong and Chen, CL},
  journal={arXiv preprint arXiv:2501.01023},
  year={2025}
}

Requirements

Python = 3.8

CUDA = 11.3

conda create -n hart python=3.8
conda activate hart
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
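
After installation, a quick check can confirm that the interpreter and the PyTorch build match the versions listed above. The following is a minimal sketch and not part of the repository: the helper name `check_environment` is hypothetical, and it relies only on the standard `sys` module and on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import sys


def check_environment():
    """Collect interpreter and PyTorch details to compare with the Requirements."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}  # README lists 3.8
    try:
        import torch  # imported lazily so the check still runs if torch is missing
    except ImportError:
        info["torch"] = None
        return info
    info["torch"] = torch.__version__        # README installs 1.12.0+cu113
    info["cuda_build"] = torch.version.cuda  # CUDA version the wheel was built for
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as false, the installed wheel or the GPU driver probably does not match the CUDA 11.3 build requested above.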

Dataset

To train or evaluate HART, download the required datasets:

Sceneflow (Includes FlyingThings3D, Driving, Monkaa)
Middlebury
ETH3D
KITTI
TartanAir
Falling Things (fat.zip)
CARLA
CREStereo Dataset
InStereo2K
Sintel Stereo

By default, stereo_datasets.py searches for the datasets in the locations below. You can create symbolic links in the datasets folder that point to wherever the datasets were downloaded.

├── datasets
    ├── FlyingThings3D
        ├── frames_finalpass
        ├── disparity
    ├── Monkaa
        ├── frames_finalpass
        ├── disparity
    ├── Driving
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2015
            ├── testing
            ├── training
        ├── KITTI_2012
            ├── testing
            ├── training
    ├── Middlebury
        ├── MiddEval3
            ├── trainingF
            ├── trainingH
            ├── trainingQ
        ├── official_train.txt
        ├── 2005
        ├── 2006
        ├── 2014
        ├── 2021
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
        ├── two_view_testing
    ├── TartanAir
    ├── fat
    ├── crestereo
    ├── HR-VS
        ├── carla-highres
    ├── InStereo2K
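
The symbolic-link setup described above can be sketched in Python as follows. This is an illustrative helper, not code from the repository: the function names `link_dataset` and `missing_datasets` are hypothetical, and the `EXPECTED` list simply copies the top-level folder names from the tree above.

```python
import os
from pathlib import Path

# Top-level folder names copied from the directory tree in this README.
EXPECTED = [
    "FlyingThings3D", "Monkaa", "Driving", "KITTI", "Middlebury",
    "ETH3D", "TartanAir", "fat", "crestereo", "HR-VS", "InStereo2K",
]


def link_dataset(source, name, root="datasets"):
    """Create root/name as a symbolic link to an existing download directory."""
    source = Path(source).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    target = root / name
    # Leave an existing folder or link untouched instead of overwriting it.
    if not (target.exists() or target.is_symlink()):
        os.symlink(source, target, target_is_directory=True)
    return target


def missing_datasets(root="datasets", expected=EXPECTED):
    """Return the expected top-level folders that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, `link_dataset("/path/to/ETH3D", "ETH3D")` (with a placeholder download path) creates `datasets/ETH3D`, and `missing_datasets()` then lists the folders that still need to be downloaded or linked. Only the datasets you actually train or evaluate on need to be present.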

"official_train.txt" is available here.

Training

bash ./scripts/train.sh

Evaluation

To evaluate a trained model on a validation set (e.g., Middlebury at full resolution), run:

python evaluate_stereo.py --restore_ckpt models/hart_sceneflow.pth --dataset middlebury_F

The pretrained weights are available here.
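
To evaluate the same checkpoint on several validation sets, the command above can be wrapped in a small script. The sketch below is an assumption-laden convenience wrapper, not part of the repository: the names `build_eval_command` and `evaluate` are hypothetical, it uses only the `--restore_ckpt` and `--dataset` flags shown above, and `middlebury_F` is the only dataset name confirmed by this README.

```python
import subprocess
import sys
from pathlib import Path


def build_eval_command(ckpt, dataset):
    """Build the evaluation command using only the flags shown in this README."""
    return [sys.executable, "evaluate_stereo.py",
            "--restore_ckpt", str(ckpt), "--dataset", dataset]


def evaluate(ckpt, datasets):
    """Run evaluate_stereo.py once per dataset, stopping on the first failure."""
    if not Path(ckpt).is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    for dataset in datasets:
        subprocess.run(build_eval_command(ckpt, dataset), check=True)


# Example call, run from the repository root:
#   evaluate("models/hart_sceneflow.pth", ["middlebury_F"])
# Any dataset name other than "middlebury_F" is an assumption and should be
# checked against the choices accepted by evaluate_stereo.py.
```

Checking that the checkpoint exists before launching avoids starting a long evaluation run only to fail when the model is loaded.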

Acknowledgements

This project borrows code from STTR, DLNR, IGEV, and MoCha-Stereo. We thank the original authors for their excellent work!
This project is supported by the Science and Technology Planning Project of Guizhou Province, Department of Science and Technology of Guizhou Province, China (QianKeHe[2024]Key001).
This project is supported by the Science and Technology Planning Project of Guizhou Province, Department of Science and Technology of Guizhou Province, China (Project No. [2023]159).
