A multi-task learning framework designed for simultaneous depth estimation and semantic segmentation using the Swin Transformer architecture.
- [30th June] Paper Accepted at the IROS 2024 Conference 🔥🔥🔥
To get started, follow these steps:
- Only for ROS installation (otherwise skip this part):

  ```shell
  cd catkin_ws/src
  catkin_create_pkg SwinMTL_ROS std_msgs rospy
  cd ..
  catkin_make
  source devel/setup.bash
  cd src/SwinMTL_ROS/src
  git clone https://github.com/PardisTaghavi/SwinMTL.git
  chmod +x inference_ros.py
  mv ./launch/ ./..
  ```
- Clone the repository:

  ```shell
  git clone https://github.com/PardisTaghavi/SwinMTL.git
  cd SwinMTL
  ```
- Create a conda environment and activate it:

  ```shell
  conda env create --file environment.yml
  conda activate prc
  ```
To run testing for the project, follow these steps:
- Download pretrained models:
  - Access the pretrained models here.
  - Download the pretrained models you need.
- Move pretrained models:
  - Create a new folder named `model_zoo` in the project directory and move the pretrained models into it.
  - Refer to `testLive.ipynb` for testing.
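The folder setup above can also be scripted. Below is a minimal sketch using only the standard library; the checkpoint extensions (`.ckpt`, `.pth`) and the helper name are assumptions, not part of the repo:

```python
import shutil
from pathlib import Path


def prepare_model_zoo(download_dir: str, project_dir: str) -> list:
    """Move downloaded checkpoints into <project_dir>/model_zoo.

    The *.ckpt / *.pth extensions are assumptions -- adjust them to
    match the files you actually downloaded.
    """
    zoo = Path(project_dir) / "model_zoo"
    zoo.mkdir(parents=True, exist_ok=True)  # create model_zoo if missing
    moved = []
    for pattern in ("*.ckpt", "*.pth"):
        for ckpt in Path(download_dir).glob(pattern):
            shutil.move(str(ckpt), zoo / ckpt.name)  # keep original filename
            moved.append(ckpt.name)
    return sorted(moved)
```

Running `prepare_model_zoo("~/Downloads", ".")`-style calls leaves unrelated files in the download folder untouched.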
- Launch the ROS node (after the ROS installation above):

  ```shell
  roslaunch SwinMTL_ROS swinmtl_launch.launch
  ```
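If you need to adapt the launch step, a ROS launch file for a node like this typically looks as follows. This is a hypothetical sketch only; the actual `swinmtl_launch.launch` shipped in the repo is authoritative, and the node script name `inference_ros.py` is taken from the ROS setup step above:

```xml
<launch>
  <!-- Hypothetical sketch: see the swinmtl_launch.launch in the repo
       for the real node name, topics, and parameters. -->
  <node pkg="SwinMTL_ROS" type="inference_ros.py" name="swinmtl_inference"
        output="screen" />
</launch>
```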
Key contributions:

- A multi-task learning approach for joint depth estimation and semantic segmentation.
- State-of-the-art performance on the Cityscapes and NYUv2 datasets.
- An efficient shared encoder-decoder architecture coupled with novel techniques that enhance accuracy.
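The shared encoder-decoder pattern behind these contributions can be illustrated with a toy sketch in plain Python. The callables here are stand-ins to show the data flow only; in SwinMTL the encoder is a Swin Transformer and the two decoders are learned networks:

```python
class SharedEncoderMTL:
    """Toy illustration of a shared encoder feeding per-task decoders.

    The encoder runs once per image; each task head reuses the same
    features, which is what makes the multi-task setup efficient.
    """

    def __init__(self, encoder, depth_decoder, seg_decoder):
        self.encoder = encoder
        self.depth_decoder = depth_decoder
        self.seg_decoder = seg_decoder

    def forward(self, image):
        feats = self.encoder(image)  # shared features, computed once
        return {
            "depth": self.depth_decoder(feats),         # task head 1
            "segmentation": self.seg_decoder(feats),    # task head 2
        }
```

With stand-in callables, e.g. `SharedEncoderMTL(lambda x: x * 2, lambda f: f + 1, lambda f: f - 1)`, both outputs are derived from one encoder pass.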
We welcome feedback and contributions to the SwinMTL project. Feel free to contact [email protected].
Special thanks to the authors of the following projects for laying the foundation of this work. Our code relies on: