
Commit e25227a

committed: update challenge readme, clone release branch
1 parent 015efce · commit e25227a

File tree: 2 files changed (+29, -28 lines)


.pre-commit-config.yaml

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ repos:
     rev: v2.2.1
     hooks:
       - id: codespell
+        args: ["--skip=*.md,*.txt,*.rst"]
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v3.1.0
     hooks:

challenge/README.md

Lines changed: 28 additions & 28 deletions
@@ -4,7 +4,7 @@
 
 This track challenges participants to develop **multimodal navigation agents** that can interpret **natural language instructions** and operate within a **realistic physics-based simulation** environment.
 
-Participants will deploy their agents on a **legged humanoid robot** (e.g., **Unitree H1**) to perform complex indoor navigation tasks using **egocentric visual inputs** and **language commands**. Agents must not only understand instructions but also perceive the environment, model trajectory history, and predict navigation actions in real time.
+Participants will deploy their agents on a **legged humanoid robot** (e.g., **Unitree H1**) to perform complex indoor navigation tasks using **egocentric visual inputs** and **language commands**. Agents must not only understand instructions but also perceive the environment, model trajectory history, and predict navigation actions in real time.
 
 The system should be capable of handling challenges such as camera shake, height variation, and local obstacle avoidance, ultimately achieving robust and safe vision-and-language navigation.
 
@@ -28,13 +28,13 @@ This guide provides a step-by-step walkthrough for participating in the **IROS 2
 
 
 ## 🔗 Useful Links
-- 🔍 **Challenge Overview:**
+- 🔍 **Challenge Overview:**
  [Challenge of Multimodal Robot Learning in InternUtopia and Real World](https://internrobotics.shlab.org.cn/challenge/2025/).
 
-- 📖 **InternUtopia + InternNav Documentation:**
+- 📖 **InternUtopia + InternNav Documentation:**
  [Getting Started](https://internrobotics.github.io/user_guide/internutopia/get_started/index.html)
 
-- 🚀 **Interactive Demo:**
+- 🚀 **Interactive Demo:**
  [InternNav Model Inference Demo](https://huggingface.co/spaces/InternRobotics/InternNav-Eval-Demo)
 
 
@@ -43,12 +43,12 @@ This guide provides a step-by-step walkthrough for participating in the **IROS 2
 
 ### Clone the InternNav repository to any desired location
 ```bash
-$ git clone git@github.com:InternRobotics/InternNav.git
+$ git clone -b release/v0.1 git@github.com:InternRobotics/InternNav.git --recursive
 ```
 
 ### Pull our base Docker image
 ```bash
-$ docker pull crpi-mdum1jboc8276vb5.cn-beijing.personal.cr.aliyuncs.com/iros-challenge/internnav:v1.0
+$ docker pull crpi-mdum1jboc8276vb5.cn-beijing.personal.cr.aliyuncs.com/iros-challenge/internnav:v1.1
 ```
 
 ### Run the container
@@ -96,8 +96,8 @@ $ git clone https://huggingface.co/datasets/spatialverse/InteriorAgent_Nav inter
 ```
 Please refer to the [documentation](https://internrobotics.github.io/user_guide/internnav/quick_start/installation.html#interndata-n1-dataset-preparation) for a full guide on InternData-N1 Dataset Preparation. In this challenge, we test on the VLN-PE part of the [InternData-N1](https://huggingface.co/datasets/InternRobotics/InternData-N1) dataset. Optionally, feel free to download the full dataset to train your model.
 
-- Download the [**IROS-2025-Challenge-Nav Dataset**](https://huggingface.co/datasets/InternRobotics/IROS-2025-Challenge-Nav/tree/main) for the `vln_pe/`,
-- Download the [SceneData-N1](https://huggingface.co/datasets/InternRobotics/Scene-N1/tree/main) for the `scene_data/`,
+- Download the [**IROS-2025-Challenge-Nav Dataset**](https://huggingface.co/datasets/InternRobotics/IROS-2025-Challenge-Nav/tree/main) for the `vln_pe/`,
+- Download the [SceneData-N1](https://huggingface.co/datasets/InternRobotics/Scene-N1/tree/main) for the `scene_data/`,
 - Download the [Embodiments](https://huggingface.co/datasets/InternRobotics/Embodiments) for the `Embodiments/`
 
 ```bash
@@ -114,7 +114,7 @@ $ git clone https://huggingface.co/datasets/InternRobotics/Embodiments data/Embo
 ### Suggested Dataset Directory Structure
 #### InternData-N1
 ```
-data/
+data/
 ├── Embodiments/
 ├── scene_data/
 │   └── mp3d_pe/
@@ -130,7 +130,7 @@ data/
 │   └── val_unseen/
 └── traj_data/  # training sample data for two types of scenes
     ├── interiornav/
-    │   └── kujiale_xxxx.tar.gz
+    │   └── kujiale_xxxx.tar.gz
     └── r2r/
         └── trajectory_0/
             ├── data/
@@ -140,10 +140,10 @@ data/
 #### Interior_data/
 ```bash
 interiornav_data
-├── scene_data
+├── scene_data
 │   ├── kujiale_xxxx/
 │   └── ...
-└── raw_data
+└── raw_data
     ├── train/
     ├── val_seen/
     └── val_unseen/
@@ -166,7 +166,7 @@ $ git submodule update --init
 
 ## 🛠️ Model Training and Testing
 
-Please refer to the [documentation](https://internrobotics.github.io/user_guide/internnav/quick_start/train_eval.html) for a quick-start guide to training or evaluating supported models in InternNav.
+Please refer to the [documentation](https://internrobotics.github.io/user_guide/internnav/quick_start/train_eval.html) for a quick-start guide to training or evaluating supported models in InternNav.
 
 For advanced usage, including customizing datasets, models, and experimental settings, see the [tutorial](https://internrobotics.github.io/user_guide/internnav/tutorials/index.html).
 
@@ -210,7 +210,7 @@ The main components include:
 - The evaluation process can now be viewed at `logs/`. Update `challenge_cfg.py` to get visualization output, as sketched below:
   - Set `eval_settings['vis_output']=True` to save frames and a video of the evaluation trajectory
   - Set `env_settings['headless']=False` to open the isaac-sim interactive window
-<img src="output.gif" alt="output" style="width:50%;">
+<img src="output.gif" alt="output" style="width:50%;">
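A minimal sketch of what the two settings above might look like inside `scripts/eval/configs/challenge_cfg.py`, assuming `eval_settings` and `env_settings` are plain dictionaries in that file (the rest of the config is omitted here and the layout is not confirmed by this diff):

```python
# Hypothetical excerpt of scripts/eval/configs/challenge_cfg.py; only the two
# visualization-related keys discussed above are shown.
eval_settings = {
    'vis_output': True,   # save frames and a video for each evaluation trajectory
}
env_settings = {
    'headless': False,    # open the isaac-sim interactive window instead of running headless
}
```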
 
 ### Create Your Model & Agent
 #### Custom Model
@@ -223,7 +223,7 @@ action = self.agent.step(obs)
 obs = [{
     'globalgps': [X, Y, Z]          # robot location
     'globalrotation': [X, Y, Z, W]  # robot orientation in quaternion
-    'rgb': np.array(256, 256, 3)    # rgb camera image
+    'rgb': np.array(256, 256, 3)    # rgb camera image
     'depth': np.array(256, 256, 1)  # depth image
 }]
 ```
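To make the observation format above concrete, here is a minimal, hypothetical sketch of an agent `step()` that unpacks one entry per environment. The field names and shapes come from the snippet above; the class, the 4-action space, and the random policy are illustrative assumptions only:

```python
import numpy as np

class RandomNavAgent:
    """Illustrative only: ignores the observation content and returns a random action."""

    def step(self, obs):
        actions = []
        for env_obs in obs:                       # one dict per parallel environment
            position = env_obs['globalgps']       # [X, Y, Z] robot location
            rotation = env_obs['globalrotation']  # [X, Y, Z, W] orientation quaternion
            rgb = env_obs['rgb']                  # (256, 256, 3) camera image
            depth = env_obs['depth']              # (256, 256, 1) depth image
            # A real agent would feed these (plus the language instruction and
            # trajectory history) through its model here.
            actions.append(int(np.random.randint(0, 4)))  # hypothetical 4-action space
        return actions                            # List[int], one action per environment
```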
@@ -237,7 +237,7 @@ action = List[int] # action for each environment
 ```
 #### Create a Custom Config Class
 
-In the model file, define a `Config` class that inherits from `PretrainedConfig`.
+In the model file, define a `Config` class that inherits from `PretrainedConfig`.
 A reference implementation is `CMAModelConfig` in [`cma_model.py`](../internnav/model/cma/cma_policy.py).
 
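As a rough sketch of the pattern described above, a custom config might look like the following. Only the inheritance from `PretrainedConfig` (from the `transformers` package) is taken from the README; the class name, `model_type` string, and fields are hypothetical, and `CMAModelConfig` remains the reference to follow:

```python
from transformers import PretrainedConfig

class MyNavPolicyConfig(PretrainedConfig):
    """Hypothetical config for a custom navigation policy; field names are examples only."""
    model_type = "my_nav_policy"  # hypothetical identifier

    def __init__(self, rgb_size=(256, 256), hidden_size=512, num_actions=4, **kwargs):
        super().__init__(**kwargs)
        self.rgb_size = rgb_size
        self.hidden_size = hidden_size
        self.num_actions = num_actions
```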
 #### Registration and Integration
@@ -248,7 +248,7 @@ In [`internnav/model/__init__.py`](../internnav/model/__init__.py):
 
 #### Create a Custom Agent
 
-The Agent handles interaction with the environment, data preprocessing/postprocessing, and calls the Model for inference.
+The Agent handles interaction with the environment, data preprocessing/postprocessing, and calls the Model for inference.
 A custom Agent usually inherits from [`Agent`](../internnav/agent/base.py) and implements the following key methods:
 
 - `reset()`: Resets the Agent's internal state (e.g., RNN states, action history). Called at the start of each episode.
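The diff lists only `reset()` here, but together with the earlier `action = self.agent.step(obs)` call it implies roughly the following shape. The skeleton below is illustrative; a real agent would inherit from InternNav's `Agent` base class (`internnav/agent/base.py`), whose exact constructor and registration mechanism are not shown in this diff:

```python
class MyNavAgent:  # in practice: inherit from the Agent base class in internnav/agent/base.py
    """Illustrative skeleton: wraps a model and manages per-episode state."""

    def __init__(self, model):
        self.model = model
        self.rnn_states = None     # recurrent state carried across steps
        self.action_history = []   # previously issued actions

    def reset(self):
        """Clear internal state at the start of each episode."""
        self.rnn_states = None
        self.action_history.clear()

    def step(self, obs):
        """Preprocess observations, run the model, and return List[int] actions."""
        actions = self.model.predict(obs)  # hypothetical model interface
        self.action_history.append(actions)
        return actions
```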
@@ -259,7 +259,7 @@ Example: [`CMAAgent`](../internnav/agent/cma_agent.py)
 
 #### Create a Trainer
 
-The Trainer manages the training loop, including data loading, forward pass, loss calculation, and backpropagation.
+The Trainer manages the training loop, including data loading, forward pass, loss calculation, and backpropagation.
 A custom trainer usually inherits from the [`Base Trainer`](../internnav/trainer/base.py) and implements:
 
 - `train_epoch()`: Runs one training epoch (batch iteration, forward pass, loss calculation, parameter update).
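InternNav's `Base Trainer` interface is not shown in this diff, so the following is a generic PyTorch-style sketch of what a `train_epoch()` typically covers (batch iteration, forward pass, loss calculation, backpropagation, parameter update); every name and the batch layout are assumptions:

```python
import torch

class MyTrainer:  # in practice: inherit from the Base Trainer in internnav/trainer/base.py
    """Illustrative trainer: one epoch of supervised action prediction."""

    def __init__(self, model, dataloader, lr=1e-4):
        self.model = model
        self.dataloader = dataloader
        self.optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        self.criterion = torch.nn.CrossEntropyLoss()

    def train_epoch(self):
        self.model.train()
        total_loss = 0.0
        for batch in self.dataloader:
            obs, target_actions = batch            # hypothetical batch layout
            logits = self.model(obs)               # forward pass
            loss = self.criterion(logits, target_actions)
            self.optimizer.zero_grad()
            loss.backward()                        # backpropagation
            self.optimizer.step()                  # parameter update
            total_loss += loss.item()
        return total_loss / max(len(self.dataloader), 1)
```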
@@ -310,7 +310,7 @@ Main fields:
 - `model_name`: Must match the name used during training
 - `ckpt_to_load`: Path to the model checkpoint
 - `task`: Defines the task settings: number of environments, scene, robots
-- `dataset`: Load r2r or interiornav dataset
+- `dataset`: Load r2r or interiornav dataset
 - `split`: Dataset split (`val_seen`, `val_unseen`, `test`, etc.)
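The actual layout of `scripts/eval/configs/challenge_cfg.py` is not shown in this diff, so the snippet below is only a hypothetical illustration of how the fields listed above might be filled in:

```python
# Hypothetical illustration only; the real challenge_cfg.py may structure these differently.
eval_cfg = dict(
    model_name='my_nav_policy',                          # must match the name used during training
    ckpt_to_load='checkpoints/my_nav_policy/best.ckpt',  # path to the model checkpoint
    task=dict(env_num=1, scene='mp3d', robot='h1'),      # task settings (illustrative values)
    dataset='r2r',                                       # 'r2r' or 'interiornav'
    split='val_unseen',                                  # 'val_seen', 'val_unseen', 'test', ...
)
```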
 
 ## 📦 Packaging and Submission
@@ -320,8 +320,8 @@ Main fields:
 Use this to evaluate your model on the validation split locally. The command is identical to what EvalAI runs, so it’s also a good sanity check before submitting.
 
 - Make sure your trained weights and model code are correctly packaged in your submitted Docker image at `/root/InternNav`.
-- The evaluation configuration is properly set at: `scripts/eval/configs/challenge_cfg.py`.
-- No need to include the `data` directory in your submission.
+- The evaluation configuration is properly set at: `scripts/eval/configs/challenge_cfg.py`.
+- No need to include the `data` directory in your submission.
 ```bash
 # Run local benchmark on the validation set
 $ bash challenge/start_eval_iros.sh --config scripts/eval/configs/challenge_cfg.py --split [val_seen/val_unseen]
@@ -338,7 +338,7 @@ $ cd PATH/TO/INTERNNAV/
 # Build the new image
 $ docker build -t my-internnav-custom:v1 .
 ```
-Or commit your container as a new image:
+Or commit your container as a new image:
 
 ```bash
 $ docker commit internnav my-internnav-with-updates:v1
@@ -443,15 +443,15 @@ For detailed submission guidelines and troubleshooting, refer to the official Ev
 ### 🧪 Simulation Environment
 
 - **Platform**: Physics-driven simulation using [InternUtopia](https://github.com/InternRobotics/InternUtopia)
-- **Robot**: Unitree H1 humanoid robot model
-- **Tasks**: Instruction-based navigation in richly furnished indoor scenes
+- **Robot**: Unitree H1 humanoid robot model
+- **Tasks**: Instruction-based navigation in richly furnished indoor scenes
 - **Evaluation**: Based on success rate, path efficiency, and instruction compliance
 
 
 
 ### 🔍 Evaluation Metrics
 
-- **Success Rate (SR)**: Proportion of episodes where the agent reaches the goal location within 3m
+- **Success Rate (SR)**: Proportion of episodes where the agent reaches the goal location within 3m
 - **SPL**: Success weighted by Path Length
 - **Trajectory Length (TL)**: Total length of the trajectory (m)
 - **Navigation Error (NE)**: Euclidean distance between the agent's final position and the goal (m)
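For reference, SPL is conventionally computed as in Anderson et al., "On Evaluation of Embodied Navigation Agents" (2018); the formula below is that standard definition rather than text from this README:

$$
\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)}
$$

where $N$ is the number of episodes, $S_i \in \{0, 1\}$ indicates success of episode $i$ (here, a final position within 3m of the goal, i.e., NE < 3m), $\ell_i$ is the shortest-path distance from start to goal, and $p_i$ is the length of the path the agent actually traveled.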
@@ -463,8 +463,8 @@
 
 ### 🚨 Challenges to Solve
 
-- ✅ Integrating vision, language, and control into a single inference pipeline
-- ✅ Overcoming sensor instability and actuation delay from simulated humanoid locomotion
+- ✅ Integrating vision, language, and control into a single inference pipeline
+- ✅ Overcoming sensor instability and actuation delay from simulated humanoid locomotion
 - ✅ Ensuring real-time, smooth, and goal-directed behavior under physics constraints
 
 This track pushes the boundary of embodied AI by combining **natural language understanding**, **3D vision**, and **realistic robot control**, fostering solutions ready for future real-world deployments.
@@ -487,4 +487,4 @@ For more details with in-depth physical analysis results on the VLN task, please
 - **Organizer**: Shanghai AI Lab
 - **Co-organizers**: ManyCore Tech, University of Adelaide
 - **Data Contributions**: Online test data provided by Prof. Qi Wu's team; Kujiale scenes provided by ManyCore Tech
-- **Sponsors** (in no particular order): ByteDance, HUAWEI, ENGINEAI, HONOR, ModelScope, Alibaba Cloud, AGILEX, DOBOT
+- **Sponsors** (in no particular order): ByteDance, HUAWEI, ENGINEAI, HONOR, ModelScope, Alibaba Cloud, AGILEX, DOBOT
