We present Robin3D, a state-of-the-art 3D Large Language Model trained on large-scale instruction-following data generated by our novel Robust Instruction Generation (RIG) data engine. To handle the complex data produced by RIG, Robin3D further enhances spatial understanding with a Relation-Augmented Projector and strengthens its object referring and grounding abilities through ID-Feature Bonding.
[2024.09] We release Robin3D [paper][code], a new SOTA 3D LLM for 3D scenes.
Prepare the environment:
```shell
conda create -n robin3d python=3.9.17
conda activate robin3d
conda install pytorch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
```
Download LLM backbone:
- We use Vicuna-7B v1.5 in our experiments, which can be downloaded from Hugging Face.
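For a scripted download, the Hugging Face CLI can fetch the Vicuna-7B v1.5 weights. This is a minimal sketch: the target directory `checkpoints/vicuna-7b-v1.5` is an illustrative choice, not a path required by this repo.

```shell
# Install the Hugging Face CLI, then download the Vicuna-7B v1.5 weights.
# NOTE: the --local-dir path is a hypothetical example; point it wherever
# your configuration expects the LLM backbone to live.
pip install -U "huggingface_hub[cli]"
huggingface-cli download lmsys/vicuna-7b-v1.5 --local-dir checkpoints/vicuna-7b-v1.5
```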
Annotations and extracted features:
Please follow the instructions in Chat-Scene's Preparation.
- Coming soon.
Our paper has disappeared from Google Scholar for reasons unknown to us. We have emailed the Google Scholar team but have not yet received a response.
If you find our work useful in your research, please consider citing:
@misc{kang2025robin3dimproving3dlarge,
title={Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning},
author={Weitai Kang and Haifeng Huang and Yuzhang Shang and Mubarak Shah and Yan Yan},
year={2025},
eprint={2410.00255},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2410.00255},
}
Stay tuned for our project. 🔥
If you have any questions or suggestions, feel free to drop us an email ([email protected]) or open an issue.
Thanks to the open source of the following projects:
3D Datasets: ScanNet, ScanRefer, ReferIt3D, Scan2Cap, ScanQA, SQA3D, Multi3dRefer, Grounded-3DLLM, Chat-Scene
Detectors: Mask3D
Representations: Uni3D, DINOv2
3D Models: OpenScene