Showing 39 changed files with 6,031 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,148 @@ | ||
data/* | ||
slurm/ | ||
wandb/ | ||
lightning_logs/ | ||
runs/ | ||
runs | ||
|
||
*.pyd | ||
|
||
# Editors | ||
.vscode/ | ||
.idea/ | ||
|
||
# Vagrant | ||
.vagrant/ | ||
|
||
# Mac/OSX | ||
.DS_Store | ||
|
||
# Windows | ||
Thumbs.db | ||
|
||
# Source for the following rules: https://raw.githubusercontent.com/github/gitignore/master/Python.gitignore | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
*.out | ||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# celery beat schedule file | ||
celerybeat-schedule | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
/experiments/ | ||
/pretrained-examples/ | ||
|
||
test.ipynb | ||
debug/* | ||
checkpoints/* | ||
outputs/* | ||
.hydra/* | ||
*.mp4 | ||
*.gif | ||
test.py | ||
*.slurm | ||
z/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,94 @@ | ||
# SlotLifter | ||
Code for "SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields" (ECCV 2024) | ||
<p align="left"> | ||
<a href='https://arxiv.org/abs/2408.06697'> | ||
<img src='https://img.shields.io/badge/Paper-arXiv-green?style=plastic&logo=arXiv&logoColor=green' alt='Paper arXiv'> | ||
</a> | ||
<a href='https://arxiv.org/pdf/2408.06697'> | ||
<img src='https://img.shields.io/badge/Paper-PDF-red?style=plastic&logo=adobeacrobatreader&logoColor=red' alt='Paper PDF'> | ||
</a> | ||
<a href='https://slotlifter.github.io/'> | ||
<img src='https://img.shields.io/badge/Project-Page-blue?style=plastic&logo=Google%20chrome&logoColor=blue' alt='Project Page'> | ||
</a> | ||
</p> | ||
|
||
# Coming Soon | ||
This repository contains the official implementation of the ECCV 2024 paper: | ||
|
||
[SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields](https://arxiv.org/abs/2408.06697) | ||
|
||
[Yu Liu](https://yuliu-ly.github.io)\*, [Baoxiong Jia](https://buzz-beater.github.io)\*, [Yixin Chen](https://yixchen.github.io), [Siyuan Huang](https://siyuanhuang.com) | ||
<br> | ||
<p align="center"> | ||
<img src="assets/overview.png"> </img> | ||
</p> | ||
|
||
## Environment Setup | ||
We provide all environment configurations in ``requirements.txt``. To install all packages, you can create a conda environment and install the packages as follows: | ||
```bash | ||
conda create -n slotlifter python=3.8 | ||
conda activate slotlifter | ||
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia | ||
pip install -r requirements.txt | ||
``` | ||
In our experiments, we used NVIDIA CUDA 11.3 on Ubuntu 20.04. Other CUDA versions should also work as long as the ``torch`` and ``torchvision`` versions are chosen to match. Note that the command above pins ``pytorch-cuda=12.1``, so adjust it to your own setup. | ||
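To confirm that the installed packages match your CUDA setup, a minimal sketch like the one below may help. The helper name `describe_torch_install` is our own and is not part of this repository. It relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and it returns an empty report when ``torch`` is not installed.

```python
import importlib.util


def describe_torch_install():
    """Return a small report about the local PyTorch/CUDA setup.

    Hypothetical helper for illustration; not part of this repository.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # torch is not installed in this environment

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If `cuda_build` differs from the CUDA toolkit on your machine, or `cuda_available` is `False` on a GPU machine, the pinned versions in the install command probably need adjusting.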
|
||
## Dataset | ||
### 1. ShapeStacks, ObjectsRoom, CLEVRTex, Flowers | ||
Download ShapeStacks, ObjectsRoom, CLEVRTex and Flowers datasets with | ||
```bash | ||
chmod +x scripts/downloads_data.sh | ||
./scripts/downloads_data.sh | ||
``` | ||
For the ObjectsRoom dataset, run ``objectsroom_process.py`` to convert the TFRecords data to PNG images. | ||
Remember to change ``DATA_ROOT`` in ``downloads_data.sh`` and ``objectsroom_process.py`` to your own paths. | ||
### 2. PTR, Birds, Dogs, Cars | ||
Download the PTR dataset following the instructions at http://ptr.csail.mit.edu. Download the CUB-Birds, Stanford Dogs, and Cars datasets from [here](https://drive.google.com/drive/folders/1zEzsKV2hOlwaNRzrEXc9oGdpTBrrVIVk), provided by the authors of [DRC](https://github.com/yuPeiyu98/DRC). We use ``birds.zip``, ``cars.tar`` and ``dogs.zip``; uncompress them after downloading. | ||
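The archives can be uncompressed by hand, or with a short script such as the sketch below. The archive names come from the text above. The function name `unpack_archives`, the `data_root` argument, and the choice to unpack each archive into a folder named after it are assumptions made for illustration. Check ``data/README.md`` for the layout the code actually expects.

```python
import shutil
from pathlib import Path

# Archive names as listed in this README.
ARCHIVES = ("birds.zip", "cars.tar", "dogs.zip")


def unpack_archives(data_root, archives=ARCHIVES):
    """Unpack each archive found in data_root into a folder named after it.

    Hypothetical helper; the target layout is an assumption.
    """
    data_root = Path(data_root)
    unpacked = []
    for name in archives:
        archive = data_root / name
        if not archive.is_file():
            print(f"skipping {name}: not found in {data_root}")
            continue
        target = data_root / archive.stem  # e.g. birds.zip -> birds/
        # The archive format is inferred from the file extension.
        shutil.unpack_archive(str(archive), str(target))
        unpacked.append(target)
    return unpacked
```

Calling `unpack_archives("/path/to/your/data")` returns the list of folders that were created and skips archives that are missing.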
|
||
### 3. YCB, ScanNet, COCO | ||
YCB, ScanNet and COCO datasets are available from [here](https://www.dropbox.com/sh/u1p1d6hysjxqauy/AACgEh0K5ANipuIeDnmaC5mQa?dl=0), provided by the authors of [UnsupObjSeg](https://github.com/vLAR-group/UnsupObjSeg). | ||
|
||
### 4. Data preparation | ||
Please organize the data as described [here](./data/README.md) before running experiments. | ||
|
||
## Training | ||
|
||
To train the model from scratch, we provide the following model files: | ||
- ``train_trans_dec.py``: transformer-based model | ||
- ``train_mixture_dec.py``: mixture-based model | ||
- ``train_base_sa.py``: original slot-attention | ||
|
||
We provide training scripts under ``scripts/train``. Use the commands below, replacing the ``.sh`` file with the one for the model you want to experiment with. For example, to train the transformer-based decoder on Birds, run: | ||
```bash | ||
$ cd scripts | ||
$ cd train | ||
$ chmod +x trans_dec_birds.sh | ||
$ ./trans_dec_birds.sh | ||
``` | ||
Remember to change the paths in ``path.json`` to your own paths. | ||
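
Before launching a long run, it can be worth checking that the edited paths exist. The sketch below assumes that ``path.json`` is a flat JSON object mapping names to paths. The real file may be structured differently, and the helper name `find_missing_paths` is hypothetical.

```python
import json
from pathlib import Path


def find_missing_paths(json_file):
    """Return {key: value} for string entries that do not exist on disk.

    Assumes a flat JSON object of name -> path; adapt it if the real
    ``path.json`` is nested or stores non-path values.
    """
    with open(json_file, "r", encoding="utf-8") as f:
        entries = json.load(f)
    return {
        key: value
        for key, value in entries.items()
        if isinstance(value, str) and not Path(value).exists()
    }
```

An empty result means every listed path was found. Otherwise the returned dictionary names the entries that still need to be changed.
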
## Reloading checkpoints & Evaluation | ||
|
||
To reload checkpoints and only run inference, we provide the following model files: | ||
- ``test_trans_dec.py``: transformer-based model | ||
- ``test_mixture_dec.py``: mixture-based model | ||
- ``test_base_sa.py``: original slot-attention | ||
|
||
Similarly, we provide testing scripts under ``scripts/test``. We provide transformer-based models for the real-world datasets (Birds, Dogs, Cars, Flowers, YCB, ScanNet, COCO) | ||
and mixture-based models for the synthetic datasets (ShapeStacks, ObjectsRoom, CLEVRTex, PTR). All checkpoints are available [here](https://drive.google.com/drive/folders/10LmK9JPWsSOcezqd6eLjuzn38VdwkBUf?usp=sharing). Use the commands below, replacing the ``.sh`` file with the one for the model you want to evaluate: | ||
```bash | ||
$ cd scripts | ||
$ cd test | ||
$ chmod +x trans_dec_birds.sh | ||
$ ./trans_dec_birds.sh | ||
``` | ||
|
||
## Citation | ||
If you find our paper and/or code helpful, please consider citing: | ||
``` | ||
@inproceedings{Liu2024slotlifter, | ||
title={SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields}, | ||
author={Liu, Yu and Jia, Baoxiong and Chen, Yixin and Huang, Siyuan}, | ||
booktitle={European Conference on Computer Vision (ECCV)}, | ||
year={2024} | ||
} | ||
``` | ||
|
||
## Acknowledgement | ||
This code relies heavily on resources from [PanopticLifting](https://github.com/nihalsid/panoptic-lifting), [BO-QSA](https://github.com/YuLiu-LY/BO-QSA), [SLATE](https://github.com/singhgautam/slate), [OSRT](https://github.com/stelzner/osrt), and [IBRNet](https://github.com/googleinterns/IBRNet). We thank the authors for open-sourcing their awesome projects. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
# Wandb | ||
project: "dtu" # project name | ||
exp_name: test # experiment name | ||
entity: # username or teamname where you're sending runs | ||
group: # experiment groupname | ||
job_type: debug # train / test / debug ... | ||
tags: # tags for this run | ||
id: # unique Id for this run | ||
notes: # notes for this run | ||
watch_model: false # true for logging the gradient of parameters | ||
# Training | ||
lpips_net: vgg | ||
sample_mode: uniform | ||
bg_bound: 0.27 | ||
force_bg: false | ||
force_bg_steps: 30000 | ||
normalize: true | ||
benchmark: false | ||
deterministic: false | ||
render_src_view: false | ||
profiler: # use profiler to check time bottleneck | ||
resume: null | ||
ckpt_path: '' | ||
logger: # wandb or None | ||
log_path: "runs/dtu" | ||
chunk: 8192 # num of rays per chunk | ||
num_workers: 0 | ||
seed: 42 | ||
val_percent: 1.0 # val_batches = val_check_percent * val_batches if val_batches < 1 else val_batches | ||
train_percent: 1.0 | ||
test_percent: 1.0 | ||
val_check_interval: 4 # do validation every val_check_interval epochs. It could be less than 1 | ||
grad_clip: 0.5 | ||
precision: 32 # compute precision | ||
instance_steps: 500000 | ||
stop_semantic_grad: false | ||
decay_noise: 20000 | ||
seg_metrics: | ||
- ari | ||
- hiou | ||
- ari_fg | ||
# Optimizer | ||
optimizer: lion | ||
lr: 5e-5 | ||
min_lr_factor: 0.02 | ||
weight_decay: 0.001 | ||
warmup_steps: 10000 | ||
max_steps: 250000 | ||
max_epochs: 10000 | ||
decay_steps: 50000 | ||
# Dataset | ||
norm_scene: false | ||
select_view_func: nearby # or uniform | ||
load_mask: false | ||
img_size: | ||
- 300 | ||
- 400 | ||
train_subsample_frames: 1 | ||
val_subsample_frames: 5 | ||
num_src_view: 4 | ||
batch_size: 2 | ||
ray_batchsize: 1024 | ||
max_instances: 20 | ||
dataset: dtu | ||
dataset_root: /home/yuliu/Dataset/DTU | ||
instance_dir: instance | ||
semantics_dir: semantics | ||
# dataset_root: data/hypersim/ai_001_008 | ||
max_depth: 10 | ||
visualized_indices: null | ||
overfit: false | ||
# Model | ||
# Multi-view enc | ||
feature_size: 32 | ||
num_heads: 4 | ||
conv_enc: false | ||
conv_dim: 32 | ||
# slot_enc | ||
sigma_steps: 30000 | ||
num_slots: 2 | ||
num_iter: 3 | ||
slot_size: 256 | ||
drop_path: 0.2 | ||
num_blocks: 1 | ||
# slot_dec | ||
slot_density: true | ||
slot_dec_dim: 64 | ||
num_dec_blocks: 4 | ||
# NeRF | ||
random_proj_ratio: 1 | ||
random_proj: true | ||
random_proj_steps: 30000 | ||
n_samples: 64 | ||
n_samples_fine: 64 | ||
coarse_to_fine: true | ||
pe_view: 2 | ||
pe_feat: 0 | ||
grid_init: pos_enc | ||
monitor: psnr | ||
nerf_mlp_dim: 64 | ||
suffix: '' | ||
scene_id: -1 | ||
num_vis: 30 | ||
hydra: | ||
output_subdir: null # Disable saving of config files. We'll do that ourselves. | ||
run: | ||
dir: . |
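
The ``chunk`` option above is documented as the number of rays per chunk. As a rough illustration of what that implies, the sketch below splits the rays of one image into chunks. It assumes one ray per pixel for ``img_size`` = [300, 400], and the helper name `iter_ray_chunks` is hypothetical. It is not the repository's rendering code.

```python
import math


def iter_ray_chunks(num_rays, chunk=8192):
    """Yield (start, end) index pairs covering num_rays in pieces of size <= chunk."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    for start in range(0, num_rays, chunk):
        yield start, min(start + chunk, num_rays)


# Assumption: one ray per pixel for an image with img_size = [300, 400].
num_rays = 300 * 400
chunks = list(iter_ray_chunks(num_rays, chunk=8192))
assert len(chunks) == math.ceil(num_rays / 8192)
print(len(chunks), chunks[-1])  # 15 (114688, 120000)
```

Under these assumptions a full image is rendered in 15 passes. A smaller ``chunk`` lowers peak memory at the cost of more passes.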