Skip to content
/ HART Public

The official implementation of "Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer"

Notifications You must be signed in to change notification settings

ZYangChen/HART

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HART

The official implementation of "Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer"

   

Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer
Ziyang Chen,Wenting Li, Yongjun Zhang✱, Bingshu Wang, Yabo Wu, Yong Zhao, C. L. Philip Chen
arXiv Report
Contact us: [email protected]; [email protected]

@article{chen2025hart,
  title={Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer},
  author={Chen, Ziyang and Zhang, Yongjun and Li, Wenting and Wang, Bingshu and Wu, Yabo and Zhao, Yong and Chen, CL},
  journal={arXiv preprint arXiv:2501.01023},
  year={2025}
}

Requirements

Python = 3.8

CUDA = 11.3

conda create -n hart python=3.8
conda activate hart
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt

Dataset

To evaluate/train our HART, you will need to download the required datasets.

By default stereo_datasets.py will search for the datasets in these locations. You can create symbolic links to wherever the datasets were downloaded in the datasets folder

├── datasets
    ├── FlyingThings3D
        ├── frames_finalpass
        ├── disparity
    ├── Monkaa
        ├── frames_finalpass
        ├── disparity
    ├── Driving
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2015
        	├── testing
	        ├── training
        ├── KITTI_2012
        	├── testing
		├── training
    ├── Middlebury
        ├── MiddEval3
		├── trainingF
		├── trainingH
		├── trainingQ
	├── official_train.txt
        ├── 2005
        ├── 2006
        ├── 2014
        ├── 2021
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
        ├── two_view_testing
    ├── TartanAir
    ├── fat
    ├── crestereo
    ├── HR-VS
        ├── carla-highres
    ├── InStereo2K

"official_train.txt" is available at here.

Training

bash ./scripts/train.sh

Evaluation

To evaluate a trained model on a validation set (e.g. Middlebury full resolution), run

python evaluate_stereo.py --restore_ckpt models/hart_sceneflow.pth --dataset middlebury_F

Weight is available here.

Acknowledgements

  • This project borrows the code from STTR, DLNR, IGEV, MoCha-Stereo. We thank the original authors for their excellent works!
  • This project is supported by Science and Technology Planning Project of Guizhou Province, Department of Science and Technology of Guizhou Province, China (QianKeHe[2024]Key001).
  • This project is supported by Science and Technology Planning Project of Guizhou Province, Department of Science and Technology of Guizhou Province, China (Project No. [2023]159).

About

The official implementation of "Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer"

Resources

Stars

Watchers

Forks

Packages

No packages published