
Less3Depend (ICLR 2026)

This repository contains the PyTorch implementation of the paper "The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge".

1. Preparation

Environment Setup

Create and activate the conda environment, then install the dependencies:

conda create -n lvsm python=3.11
conda activate lvsm
pip install -r requirements.txt

Recommended: a GPU with compute capability of 8.0 or higher. We used 8×A100 GPUs in our experiments.
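
To confirm your GPU meets this recommendation, you can query its compute capability with PyTorch (a minimal sketch, not part of this repository):

# Minimal sketch: check the CUDA device's compute capability.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")
    if (major, minor) < (8, 0):
        print("Warning: below the recommended compute capability of 8.0.")
else:
    print("No CUDA device found.")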

Dataset Setup

Update (26/01/04): We now also provide our preprocessed version of the DL3DV dataset in pixelSplat-style format on Hugging Face!

We use the RealEstate10K dataset from pixelSplat and follow LVSM's preprocessing pipeline.

Download and unzip the RealEstate10K .torch chunks. For our scaling experiments, we split the dataset into four sizes, with chunk and scene counts listed below:

Size     Chunks   Scenes
Little       76    1,202
Medium      304    4,121
Large     1,216   16,449
Full      4,866   66,033
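
Each .torch chunk can be inspected with a plain torch.load. The sketch below assumes the pixelSplat chunk layout (a list of per-scene dicts with keys such as "key", "cameras", and JPEG-encoded "images"); the path is hypothetical, so adapt it to wherever you unzipped the data:

# Sketch of inspecting one downloaded chunk; the field names follow the
# pixelSplat convention and should be verified against your download.
import torch

chunk = torch.load("datasets/re10k/train/000000.torch")  # hypothetical chunk path
print(f"{len(chunk)} scenes in this chunk")
scene = chunk[0]
print(sorted(scene.keys()))
print(scene["key"], "-", len(scene["images"]), "frames")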

Process the dataset following LVSM:

# process training split
python process_data.py --base_path datasets/re10k --output_dir datasets/re10k-full_processed --mode train --num_processes 80

# process test split
python process_data.py --base_path datasets/re10k --output_dir datasets/re10k-full_processed --mode test --num_processes 80

2. Evaluation

Download the pre-trained model (224×224 resolution) from Hugging Face.

Run evaluation:

# fast inference, compute metrics only
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference_fast --config config/eval/uplvsm_x224.yaml

# complete inference
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference --config config/eval/uplvsm_x224.yaml

✅ For the 518×518-resolution uplvsm model, download it from Hugging Face and run evaluation:

# fast inference, compute metrics only
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference_fast --config config/eval/uplvsm_x518.yaml

# complete inference
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference --config config/eval/uplvsm_x518.yaml
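
The fast-inference entry point computes image-quality metrics only. For reference, PSNR between a rendered view and the ground truth (both normalized to [0, 1]) reduces to a one-liner; this is a generic sketch, not the repo's metric code:

# Generic PSNR computation for images normalized to [0, 1].
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    mse = torch.mean((pred - target) ** 2)
    return -10.0 * torch.log10(mse)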

3. Training

# pretraining on 224×224 resolution
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.train --config config/uplvsm_x224.yaml

# finetuning on 518×518 resolution
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.train --config config/uplvsm_x518.yaml

📄 Acknowledgments

Our implementation builds upon LVSM. We also recommend RayZer, Pensieve, and X-Factor for self-supervised scene reconstruction.

If you find this work useful for your research, please consider citing:

@misc{wang2025less3depend,
    title={The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge}, 
    author={Haoru Wang and Kai Ye and Yangyan Li and Wenzheng Chen and Baoquan Chen},
    year={2025},
    eprint={2506.09885},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2506.09885}, 
}
