|
1 | 1 | <p align="center">
|
2 | 2 | <h1 align="center">
|
3 |
| - GS$^3$LAM: Gaussian Semantic Splatting SLAM |
| 3 | + GS<sup>3</sup>LAM: Gaussian Semantic Splatting SLAM |
4 | 4 | <br>
|
5 | 5 | [ACM MM 2024]
|
6 | 6 | </h1>
|
| 7 | + <p align="center"> |
| 8 | + <a href="https://github.com/lif314"><strong>Linfei Li</strong></a> |
| 9 | + · |
| 10 | + <a href="https://scholar.google.com/citations?user=8VOk_S4AAAAJ&hl=en"><strong>Lin Zhang*</strong></a> |
| 11 | + · |
| 12 | + <a href="https://scholar.google.com/citations?user=rrkp_usAAAAJ&hl=en"><strong>Zhong Wang</strong></a> |
| 13 | + · |
| 14 | + <a href="https://scholar.google.com/citations?user=A0N_mS0AAAAJ&hl=en"><strong>Ying Shen</strong></a> |
| 15 | +</p> |
7 | 16 |
|
8 |
| - <h3 align="center"><a href="https://github.com/lif314/GS3LAM">🌐Project page (Comming soon)</a> | <a href="https://github.com/lif314/GS3LAM">📝Paper (Comming soon)</a></h3> |
| 17 | + <h3 align="center"><a href="https://github.com/lif314/GS3LAM">🌐Project page (coming soon)</a> |
| 18 | + | <a href="https://dl.acm.org/doi/10.1145/3664647.3680739">📝Paper(ACM DL)</a> |
| 19 | + </h3> |
9 | 20 | <div align="center"></div>
|
10 | 21 | </p>
|
11 | 22 |
|
|
15 | 26 | </a>
|
16 | 27 | </p>
|
17 | 28 |
|
18 |
| -<p align="center"> |
| 29 | +<!-- <p align="center"> |
19 | 30 | <a href="">
|
20 | 31 | <img src="./assets/splatting_rendering.gif" alt="splatting" width="100%">
|
21 | 32 | </a>
|
|
28 | 39 | <a href="">
|
29 | 40 | <img src="./assets/r0_pointslam_ours.gif" alt="r0_pointslam_ours" width="100%">
|
30 | 41 | </a>
|
31 |
| -</p> |
| 42 | +</p> --> |
| 43 | + |
| 44 | +<!-- TABLE OF CONTENTS --> |
| 45 | +<details open="open" style='padding: 10px; border-radius:5px 30px 30px 5px; border-style: solid; border-width: 1px;'> |
| 46 | + <summary>Table of Contents</summary> |
| 47 | + <ol> |
| 48 | + <li> |
| 49 | + <a href="#installation">Installation</a> |
| 50 | + </li> |
| 51 | + <li> |
| 52 | + <a href="#datasets">Datasets</a> |
| 53 | + </li> |
| 54 | + <li> |
| 55 | + <a href="#benchmarking">Benchmarking</a> |
| 56 | + </li> |
| 57 | + <li> |
| 58 | + <a href="#visualizer">Visualizer</a> |
| 59 | + </li> |
| 60 | + <li> |
| 61 | + <a href="#acknowledgement">Acknowledgement</a> |
| 62 | + </li> |
| 63 | + <li> |
| 64 | + <a href="#citation">Citation</a> |
| 65 | + </li> |
| 66 | + </ol> |
| 67 | +</details> |
| 68 | + |
| 69 | +## Installation |
| 70 | + |
| 71 | +The simplest way to install all dependencies is to use [Anaconda](https://www.anaconda.com/) and [pip](https://pypi.org/project/pip/) as follows: |
| 72 | + |
| 73 | +```bash |
| 74 | +conda create -n gs3lam python==3.10 |
| 75 | +conda activate gs3lam |
| 76 | +conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit |
| 77 | +pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117 |
| 78 | +pip install -r requirements.txt |
| 79 | + |
| 80 | + |
| 81 | +# install Gaussian Rasterization |
| 82 | +pip install submodules/gaussian-semantic-rasterization |
| 83 | +``` |
| 84 | + |
| 85 | +## Datasets |
| 86 | + |
| 87 | +DATAROOT is `./data` by default. Please change the `basedir` path in the scene-specific config files if datasets are stored somewhere else on your machine. |
| 88 | + |
| 89 | +### Replica |
| 90 | + |
| 91 | +The original Replica dataset does not contain semantic labels. We obtained semantic labels from [vMAP](https://github.com/kxhit/vMAP). You can download our generated semantic Replica dataset from [here](https://huggingface.co/datasets/3David14/GS3LAM-Replica), then place the data into the `./data/Replica` folder. |
| 92 | + |
| 93 | +> Note: if you use the Replica dataset provided by vMAP directly, please modify the [Replica Dataloader](./src/datasets/replica.py) and the [png_depth_scale](./configs/camera/replica.yaml) parameter in the config files accordingly. |
| 94 | +
|
| 95 | +### TUM-RGBD |
| 96 | + |
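| 97 | +<!-- Because TUM-RGBD has no ground-truth semantic labels, it is not one of our evaluation datasets. However, to test the effectiveness of our method, we use DEVA to generate pseudo-semantic labels; you can download TUM-RGBD with semantic labels from here. Unfortunately, existing semantic segmentation models struggle to guarantee inter-frame semantic consistency on long sequences, so we only tested on the fr1 sequence. --> |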
| 98 | +The TUM-RGBD dataset does not have ground-truth semantic labels, so it is not one of our evaluation datasets. However, to evaluate the effectiveness of GS3LAM, we use pseudo-semantic labels generated by [DEVA](https://github.com/hkchengrex/Tracking-Anything-with-DEVA), which you can download from [here](https://huggingface.co/datasets/3David14/TUM-DEVA). Unfortunately, existing semantic segmentation models struggle to maintain inter-frame semantic consistency over long sequences, so we only tested on the `freiburg1_desk` sequence. |
| 99 | + |
| 100 | +### ScanNet |
| 101 | + |
| 102 | +Please follow the data downloading procedure on the [ScanNet](http://www.scan-net.org/) website, and extract color/depth frames from the `.sens` file using this [code](https://github.com/ScanNet/ScanNet/blob/master/SensReader/python/reader.py). |
| 103 | + |
| 104 | +<details> |
| 105 | + <summary>[Directory structure of ScanNet (click to expand)]</summary> |
| 106 | + |
| 107 | +``` |
| 108 | + DATAROOT |
| 109 | + └── scannet |
| 110 | + └── scene0000_00 |
| 111 | + └── frames |
| 112 | + ├── color |
| 113 | + │ ├── 0.jpg |
| 114 | + │ ├── 1.jpg |
| 115 | + │ └── ... |
| 116 | + ├── depth |
| 117 | + │ ├── 0.png |
| 118 | + │ ├── 1.png |
| 119 | + │ └── ... |
| 120 | + ├── label-filt |
| 121 | + │ ├── 0.png |
| 122 | + │ ├── 1.png |
| 123 | + │ └── ... |
| 124 | + ├── intrinsic |
| 125 | + └── pose |
| 126 | + ├── 0.txt |
| 127 | + ├── 1.txt |
| 128 | + └── ... |
| 129 | +``` |
| 130 | +</details> |
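
Before running on ScanNet, it can help to confirm that an extracted scene matches the directory structure above. The following is a minimal sketch, not part of the GS3LAM code base: the helper `missing_scannet_dirs` is hypothetical, and it assumes that all five folders (`color`, `depth`, `label-filt`, `intrinsic`, `pose`) sit directly under `frames`, as the tree suggests.

```python
from pathlib import Path

# Sub-folder names taken from the directory tree above.
# Assumption: all of them sit directly under <scene>/frames.
EXPECTED_SUBDIRS = ("color", "depth", "label-filt", "intrinsic", "pose")


def missing_scannet_dirs(dataroot, scene):
    """Return the expected sub-folders that are absent for one scene (hypothetical helper)."""
    frames = Path(dataroot) / "scannet" / scene / "frames"
    return [name for name in EXPECTED_SUBDIRS if not (frames / name).is_dir()]


if __name__ == "__main__":
    # "./data" is the default DATAROOT; adjust it if your data lives elsewhere.
    missing = missing_scannet_dirs("./data", "scene0000_00")
    print("missing:", missing if missing else "none")
```

An empty result only means the folders exist; it does not check that the frames inside them are complete.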
| 131 | + |
| 132 | + |
| 133 | +We use the following sequences: |
| 134 | +``` |
| 135 | +scene0000_00 |
| 136 | +scene0059_00 |
| 137 | +scene0106_00 |
| 138 | +scene0169_00 |
| 139 | +scene0181_00 |
| 140 | +scene0207_00 |
| 141 | +``` |
| 142 | + |
| 143 | +## Benchmarking |
| 144 | +### TUM-RGBD |
| 145 | + |
| 146 | +To run GS3LAM on the `freiburg1_desk` scene, run the following command: |
| 147 | + |
| 148 | +```bash |
| 149 | +python run.py configs/Tum/tum_fr1.py |
| 150 | +``` |
| 151 | + |
| 152 | +### Replica |
| 153 | + |
| 154 | +To run GS3LAM on the `office0` scene, run the following command: |
| 155 | + |
| 156 | +```bash |
| 157 | +python run.py configs/Replica/office0.py |
| 158 | +``` |
| 159 | + |
| 160 | +To run GS3LAM on all Replica scenes, run the following command: |
| 161 | + |
| 162 | +```bash |
| 163 | +bash scripts/eval_full_replica.sh |
| 164 | +``` |
| 165 | + |
| 166 | +### ScanNet |
| 167 | + |
| 168 | +To run GS3LAM on the `scene0059_00` scene, run the following command: |
| 169 | + |
| 170 | +```bash |
| 171 | +python run.py configs/Scannet/scene0059_00.py |
| 172 | +``` |
| 173 | + |
| 174 | +To run GS3LAM on all ScanNet scenes, run the following command: |
| 175 | + |
| 176 | +```bash |
| 177 | +bash scripts/eval_full_scannet.bash |
| 178 | +``` |
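
To run only a subset of the ScanNet sequences, the per-scene command can be looped from Python. This is a rough sketch rather than a description of what `scripts/eval_full_scannet.bash` does: the helpers `build_commands` and `run_all` are hypothetical, and the sketch assumes every scene has a config named `configs/Scannet/<scene>.py`, following the pattern of `configs/Scannet/scene0059_00.py`.

```python
import subprocess
import sys

# Sequences listed in the Datasets section.
SCANNET_SCENES = [
    "scene0000_00", "scene0059_00", "scene0106_00",
    "scene0169_00", "scene0181_00", "scene0207_00",
]


def build_commands(scenes, config_dir="configs/Scannet"):
    """Build one `python run.py <config>` argument list per scene.

    Assumption: each scene has a config file <config_dir>/<scene>.py.
    """
    return [[sys.executable, "run.py", f"{config_dir}/{scene}.py"] for scene in scenes]


def run_all(scenes, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(scenes):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on the first failing scene.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands it would execute.
    run_all(SCANNET_SCENES, dry_run=True)
```

Calling `run_all(["scene0059_00", "scene0106_00"], dry_run=False)` from the repository root would then run just those two scenes in sequence.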
| 179 | + |
| 180 | +## Visualizer |
| 181 | + |
| 182 | +``` |
| 183 | +TBD |
| 184 | +``` |
| 185 | + |
| 186 | +## Acknowledgement |
| 187 | +We thank the authors of the following repositories for their open-source code: |
| 188 | + |
| 189 | +- [3D Gaussian Splatting](https://github.com/graphdeco-inria/gaussian-splatting) |
| 190 | +- [SplaTAM](https://github.com/spla-tam) |
| 191 | +- [Gaussian-SLAM](https://github.com/VladimirYugay/Gaussian-SLAM) |
| 192 | +- [vMAP](https://github.com/kxhit/vMAP) |
| 193 | +- [Point-SLAM](https://github.com/eriksandstroem/Point-SLAM) |
| 194 | +- [Gaussian Grouping](https://github.com/lkeab/gaussian-grouping) |
| 195 | + |
| 196 | +## Citation |
32 | 197 |
|
33 |
| -# TODO |
| 198 | +If you find our paper and code useful for your research, please cite our work using the following BibTeX entry. |
34 | 199 |
|
35 |
| -- [ ] Code release. |
| 200 | +```bibtex |
| 201 | +@inproceedings{li2024gs3lam, |
| 202 | + author = {Li, Linfei and Zhang, Lin and Wang, Zhong and Shen, Ying}, |
| 203 | + title = {GS3LAM: Gaussian Semantic Splatting SLAM}, |
| 204 | + year = {2024}, |
| 205 | + publisher = {Association for Computing Machinery}, |
| 206 | + address = {New York, NY, USA}, |
| 207 | + booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia}, |
| 208 | + pages = {3019–3027}, |
| 209 | + numpages = {9}, |
| 210 | + location = {Melbourne VIC, Australia}, |
| 211 | + series = {MM '24} |
| 212 | +} |
| 213 | +``` |