Improve Docker Setup and Gradio Config Options #181

Open · wants to merge 8 commits into base: main
44 changes: 44 additions & 0 deletions .dockerignore
@@ -0,0 +1,44 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
eggs/
.eggs/
.vscode/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

.DS_Store

tools/objaverse_rendering/blender-3.2.2-linux-x64/
tools/objaverse_rendering/output/
ckpts/
lightning_logs/
logs/
.trash/
.env/
outputs/
figures*/

# Useless Files
*.sh
!docker/entrypoint.sh
blender/
.restore/
1 change: 1 addition & 0 deletions .gitignore
@@ -39,5 +39,6 @@ figures*/

# Useless Files
*.sh
!docker/entrypoint.sh
blender/
.restore/
29 changes: 20 additions & 9 deletions README.md
@@ -2,8 +2,8 @@

# InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

<a href="https://arxiv.org/abs/2404.07191"><img src="https://img.shields.io/badge/ArXiv-2404.07191-brightgreen"></a>
<a href="https://huggingface.co/TencentARC/InstantMesh"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model_Card-Huggingface-orange"></a>
<a href="https://arxiv.org/abs/2404.07191"><img src="https://img.shields.io/badge/ArXiv-2404.07191-brightgreen"></a>
<a href="https://huggingface.co/TencentARC/InstantMesh"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model_Card-Huggingface-orange"></a>
<a href="https://huggingface.co/spaces/TencentARC/InstantMesh"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Gradio%20Demo-Huggingface-orange"></a> <br>
<a href="https://replicate.com/camenduru/instantmesh"><img src="https://img.shields.io/badge/Demo-Replicate-blue"></a>
<a href="https://colab.research.google.com/github/camenduru/InstantMesh-jupyter/blob/main/InstantMesh_jupyter.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
@@ -18,7 +18,8 @@ This repo is the official implementation of InstantMesh, a feed-forward framework
https://github.com/TencentARC/InstantMesh/assets/20635237/dab3511e-e7c6-4c0b-bab7-15772045c47d

# 🚩 Features and Todo List

- [x] 🔥🔥 Release Zero123++ fine-tuning code.
- [x] 🔥🔥 Support for running the gradio demo on two GPUs to save memory.
- [x] 🔥🔥 Support for running the demo with docker. Please refer to the [docker](docker/) directory.
- [x] Release inference and training code.
@@ -29,6 +30,7 @@ https://github.com/TencentARC/InstantMesh/assets/20635237/dab3511e-e7c6-4c0b-bab7-15772045c47d
# ⚙️ Dependencies and Installation

We recommend using `Python>=3.10`, `PyTorch>=2.1.0`, and `CUDA>=12.1`.

```bash
conda create --name instantmesh python=3.10
conda activate instantmesh
@@ -38,15 +40,15 @@ pip install -U pip
conda install Ninja

# Install the correct version of CUDA
conda install cuda -c nvidia/label/cuda-12.1.0
conda install cuda -c nvidia/label/cuda-12.4.0

# Install requirements
pip install -r requirements.txt

# Install PyTorch and xformers
# You may need to install another xformers version if you use a different PyTorch version
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.22.post7

# Install other requirements
pip install -r requirements.txt
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 xformers==0.0.29.post1 --index-url https://download.pytorch.org/whl/cu124
pip install accelerate==0.31.0
```
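
A quick sanity check after installation can save debugging time later. This is an optional addition, not part of the original setup:

```bash
# Optional: confirm that the GPU stack imports cleanly and CUDA is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import xformers; print(xformers.__version__)"
```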

# 💫 How to Use
@@ -62,11 +64,13 @@ By default, we use the `instant-mesh-large` reconstruction model variant.
## Start a local gradio demo

To start a gradio demo on your local machine, simply run:

```bash
python app.py
```

If your machine has multiple GPUs, the demo app will automatically run on two GPUs to save memory. You can also force it to run on a single GPU:

```bash
CUDA_VISIBLE_DEVICES=0 python app.py
```
@@ -76,31 +80,37 @@ Alternatively, you can run the demo with docker. Please follow the instructions in the [docker](docker/) directory.
## Running with command line

To generate 3D meshes from images via the command line, simply run:

```bash
python run.py configs/instant-mesh-large.yaml examples/hatsune_miku.png --save_video
```

We use [rembg](https://github.com/danielgatis/rembg) to segment the foreground object. If the input image already has an alpha mask, please specify the `--no_rembg` flag:

```bash
python run.py configs/instant-mesh-large.yaml examples/hatsune_miku.png --save_video --no_rembg
```

By default, our script exports a `.obj` mesh with vertex colors. Specify the `--export_texmap` flag if you want to export a mesh with a texture map instead (this takes longer):

```bash
python run.py configs/instant-mesh-large.yaml examples/hatsune_miku.png --save_video --export_texmap
```

Please use a different `.yaml` config file in the [configs](./configs) directory if you want to use another reconstruction model variant. For example, using the `instant-nerf-large` model for generation:

```bash
python run.py configs/instant-nerf-large.yaml examples/hatsune_miku.png --save_video
```

**Note:** When using the `NeRF` model variants for image-to-3D generation, exporting a mesh with a texture map via `--export_texmap` may take a long time in the UV unwrapping step, since the default iso-surface extraction resolution is `256`. You can set a lower iso-surface extraction resolution in the config file.
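
As a sketch of that tweak (the `mc_resolution` field name is an assumption; check the `infer_config` section of your local config file before running):

```bash
# Hedged sketch: halve the iso-surface extraction resolution before exporting
# a texture map with a NeRF variant. Assumes the config contains an
# `mc_resolution: 256` entry; adjust the field name to match your config.
sed -i 's/mc_resolution: 256/mc_resolution: 128/' configs/instant-nerf-large.yaml
python run.py configs/instant-nerf-large.yaml examples/hatsune_miku.png --save_video --export_texmap
```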

# 💻 Training

We provide our training code to facilitate future research, but we cannot provide the training dataset due to its size. Please refer to our [dataloader](src/data/objaverse.py) for more details.

To train the sparse-view reconstruction models, please run:

```bash
# Training on NeRF representation
python train.py --base configs/instant-nerf-large-train.yaml --gpus 0,1,2,3,4,5,6,7 --num_nodes 1
@@ -110,6 +120,7 @@ python train.py --base configs/instant-mesh-large-train.yaml --gpus 0,1,2,3,4,5,6,7 --num_nodes 1
```

We also provide our Zero123++ fine-tuning code since it is frequently requested. The running command is:

```bash
python train.py --base configs/zero123plus-finetune.yaml --gpus 0,1,2,3,4,5,6,7 --num_nodes 1
```
28 changes: 20 additions & 8 deletions app.py
@@ -1,4 +1,5 @@
import os
import argparse
import imageio
import numpy as np
import torch
@@ -66,14 +67,19 @@ def images_to_video(images, output_path, fps=30):


###############################################################################
# Configuration.
# Arguments.
###############################################################################

seed_everything(0)
parser = argparse.ArgumentParser()
parser.add_argument('config', nargs='?', type=str, help='Path to config file.', default='configs/instant-mesh-large.yaml')
args = parser.parse_args()

config_path = 'configs/instant-mesh-large.yaml'
config = OmegaConf.load(config_path)
config_name = os.path.basename(config_path).replace('.yaml', '')
###############################################################################
# Configuration.
###############################################################################
seed_everything(0)
config = OmegaConf.load(args.config)
config_name = os.path.basename(args.config).replace('.yaml', '')
model_config = config.model_config
infer_config = config.infer_config

@@ -94,16 +100,22 @@ def images_to_video(images, output_path, fps=30):
)

# load custom white-background UNet
unet_ckpt_path = hf_hub_download(repo_id="TencentARC/InstantMesh", filename="diffusion_pytorch_model.bin", repo_type="model", cache_dir=model_cache_dir)
print('Loading custom white-background unet ...')
if os.path.exists(infer_config.unet_path):
unet_ckpt_path = infer_config.unet_path
else:
unet_ckpt_path = hf_hub_download(repo_id="TencentARC/InstantMesh", filename="diffusion_pytorch_model.bin", repo_type="model", cache_dir=model_cache_dir)
state_dict = torch.load(unet_ckpt_path, map_location='cpu')
pipeline.unet.load_state_dict(state_dict, strict=True)

pipeline = pipeline.to(device0)

# load reconstruction model
print('Loading reconstruction model ...')
model_ckpt_path = hf_hub_download(repo_id="TencentARC/InstantMesh", filename="instant_mesh_large.ckpt", repo_type="model", cache_dir=model_cache_dir)
model = instantiate_from_config(model_config)
if os.path.exists(infer_config.model_path):
model_ckpt_path = infer_config.model_path
else:
model_ckpt_path = hf_hub_download(repo_id="TencentARC/InstantMesh", filename=f"{config_name.replace('-', '_')}.ckpt", repo_type="model", cache_dir=model_cache_dir)
state_dict = torch.load(model_ckpt_path, map_location='cpu')['state_dict']
state_dict = {k[14:]: v for k, v in state_dict.items() if k.startswith('lrm_generator.') and 'source_camera' not in k}
model.load_state_dict(state_dict, strict=True)
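With the argparse change above, the demo's config is now selectable from the command line. A minimal usage sketch (the positional argument is optional and falls back to `configs/instant-mesh-large.yaml` when omitted):

```bash
# Run the gradio demo with a non-default reconstruction config.
python app.py configs/instant-mesh-base.yaml
```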
53 changes: 13 additions & 40 deletions docker/Dockerfile
@@ -1,57 +1,30 @@
# get the development image from nvidia cuda 12.4
FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04

LABEL name="instantmesh" maintainer="instantmesh"

# Add a volume for downloaded models
VOLUME /workspace/models

# create workspace folder and set it as working directory
RUN mkdir -p /workspace/instantmesh
WORKDIR /workspace

# Set the timezone
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y tzdata && \
apt-get install -y python3.10 python3-pip tzdata build-essential git wget vim libegl1-mesa-dev libglib2.0-0 unzip && \
ln -fs /usr/share/zoneinfo/America/Chicago /etc/localtime && \
dpkg-reconfigure --frontend noninteractive tzdata

# update package lists and install git, wget, vim, libegl1-mesa-dev, and libglib2.0-0
RUN apt-get update && \
apt-get install -y build-essential git wget vim libegl1-mesa-dev libglib2.0-0 unzip

# install conda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
chmod +x Miniconda3-latest-Linux-x86_64.sh && \
./Miniconda3-latest-Linux-x86_64.sh -b -p /workspace/miniconda3 && \
rm Miniconda3-latest-Linux-x86_64.sh

# update PATH environment variable
ENV PATH="/workspace/miniconda3/bin:${PATH}"
dpkg-reconfigure --frontend noninteractive tzdata && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

# initialize conda
RUN conda init bash
RUN pip install --no-cache-dir torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 xformers==0.0.29.post1 --index-url https://download.pytorch.org/whl/cu124
RUN pip install --no-cache-dir triton ninja

# create and activate conda environment
RUN conda create -n instantmesh python=3.10 && echo "source activate instantmesh" > ~/.bashrc
ENV PATH /workspace/miniconda3/envs/instantmesh/bin:$PATH

RUN conda install Ninja
RUN conda install cuda -c nvidia/label/cuda-12.4.1 -y

RUN pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
RUN pip install xformers==0.0.22.post7
RUN pip install triton

# change the working directory to the repository
WORKDIR /workspace/instantmesh

# other dependencies
ADD ./requirements.txt /workspace/instantmesh/requirements.txt
RUN pip install -r requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

COPY . /workspace/instantmesh

# Run the command when the container starts
CMD ["python", "app.py"]
RUN chmod +x /workspace/instantmesh/docker/entrypoint.sh

ENV CONFIG="instant-mesh-large"

CMD ["sh", "-c", "/workspace/instantmesh/docker/entrypoint.sh"]
10 changes: 8 additions & 2 deletions docker/README.md
@@ -16,13 +16,19 @@ Run the docker image with a local model cache (so it is fast when the container is started):
mkdir -p $HOME/models/
export MODEL_DIR=$HOME/models/

docker run -it -p 43839:43839 --platform=linux/amd64 --gpus all -v $MODEL_DIR:/workspace/instantmesh/models instantmesh
docker run -it -p 43839:43839 --platform=linux/amd64 --gpus all -v $MODEL_DIR:/workspace/instantmesh/ckpts/models instantmesh
```

To use specific GPUs:

```bash
docker run -it -p 43839:43839 --platform=linux/amd64 --gpus '"device=0,1"' -v $MODEL_DIR:/workspace/instantmesh/models instantmesh
docker run -it -p 43839:43839 --platform=linux/amd64 --gpus '"device=0,1"' -v $MODEL_DIR:/workspace/instantmesh/ckpts instantmesh
```

To use a different model file:

```bash
docker run -it -p 43839:43839 --platform=linux/amd64 --gpus all -v $MODEL_DIR:/workspace/instantmesh/ckpts -e CONFIG=instant-mesh-base instantmesh
```

Navigate to `http://localhost:43839` to use the demo.
2 changes: 2 additions & 0 deletions docker/entrypoint.sh
@@ -0,0 +1,2 @@
cd /workspace/instantmesh
python3 app.py configs/$CONFIG.yaml