This track challenges participants to develop **multimodal navigation agents** that can interpret **natural language instructions** and operate within a **realistic physics-based simulation** environment.
Participants will deploy their agents on a **legged humanoid robot** (e.g., **Unitree H1**) to perform complex indoor navigation tasks using **egocentric visual inputs** and **language commands**. Agents must not only understand instructions but also perceive the environment, model trajectory history, and predict navigation actions in real time.
The system should be capable of handling challenges such as camera shake, height variation, and local obstacle avoidance, ultimately achieving robust and safe vision-and-language navigation.
This guide provides a step-by-step walkthrough for participating in the **IROS 2025 Challenge**.
## 🔗 Useful Links
- 🔍 **Challenge Overview:**
[Challenge of Multimodal Robot Learning in InternUtopia and Real World](https://internrobotics.shlab.org.cn/challenge/2025/).
```bash
$ git clone https://huggingface.co/datasets/spatialverse/InteriorAgent_Nav interiornav_data
```
Please refer to the [documentation](https://internrobotics.github.io/user_guide/internnav/quick_start/installation.html#interndata-n1-dataset-preparation) for a full guide on InternData-N1 dataset preparation. In this challenge, evaluation uses the VLN-PE part of the [InternData-N1](https://huggingface.co/datasets/InternRobotics/InternData-N1) dataset. Optionally, download the full dataset to train your model.
- Download the [**IROS-2025-Challenge-Nav Dataset**](https://huggingface.co/datasets/InternRobotics/IROS-2025-Challenge-Nav/tree/main) for the `vln_pe/`,
- Download the [SceneData-N1](https://huggingface.co/datasets/InternRobotics/Scene-N1/tree/main) for the `scene_data/`,
- Download the [Embodiments](https://huggingface.co/datasets/InternRobotics/Embodiments) for the `Embodiments/` directory (see the download sketch below).
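As a convenience, the sketch below fetches these dataset repos with `huggingface_hub`. The repo IDs come from the links above; the local target directories are assumptions chosen to mirror the `vln_pe/`, `scene_data/`, and `Embodiments/` layout, so adjust them to your actual `data/` tree:

```python
# Sketch: download the challenge dataset repos from the Hugging Face Hub.
# The local_dir values are assumptions -- point them at your own data/ layout.
from huggingface_hub import snapshot_download

datasets = {
    "InternRobotics/IROS-2025-Challenge-Nav": "data/vln_pe",
    "InternRobotics/Scene-N1": "data/scene_data",
    "InternRobotics/Embodiments": "data/Embodiments",
}

for repo_id, local_dir in datasets.items():
    snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=local_dir)
```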
```bash
data/
└── traj_data/                  # training sample data for two types of scenes
    ├── interiornav/
    │   └── kujiale_xxxx.tar.gz
    └── r2r/
        └── trajectory_0/
            ├── data/
```
#### Interior_data/
```bash
interiornav_data
├── scene_data
│   ├── kujiale_xxxx/
│   └── ...
└── raw_data
    ├── train/
    ├── val_seen/
    └── val_unseen/
```
## 🛠️ Model Training and Testing
Please refer to the [documentation](https://internrobotics.github.io/user_guide/internnav/quick_start/train_eval.html) for a quick-start guide to training or evaluating supported models in InternNav.
For advanced usage, including customizing datasets, models, and experimental settings, see the [tutorial](https://internrobotics.github.io/user_guide/internnav/tutorials/index.html).
- The evaluation process can be monitored through the logs written to `logs/`. Update `challenge_cfg.py` to enable visualization output (a sketch follows this list):
  - Set `eval_settings['vis_output']=True` to save frames and a video along each evaluation trajectory
  - Set `env_settings['headless']=False` to open the Isaac Sim interactive window
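For illustration, the relevant part of `scripts/eval/configs/challenge_cfg.py` might look like the sketch below. This assumes the config exposes plain Python dicts named `eval_settings` and `env_settings`, as the options above suggest; treat it as a guide to the two flags, not the actual file contents.

```python
# Hypothetical excerpt of challenge_cfg.py -- only the two flags discussed above;
# the real config file may contain many other keys not shown here.
eval_settings = {
    "vis_output": True,   # save frames and a video along each evaluation trajectory
}

env_settings = {
    "headless": False,    # open the Isaac Sim interactive window instead of running headless
}
```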
Run the evaluation locally on the validation split before submitting; the command is identical to what EvalAI runs, so it also serves as a sanity check.
- Make sure your trained weights and model code are correctly packaged in your submitted Docker image at `/root/InternNav`.
- Ensure the evaluation configuration is properly set in `scripts/eval/configs/challenge_cfg.py`.
- No need to include the `data` directory in your submission.
For detailed submission guidelines and troubleshooting, refer to the official EvalAI documentation.
### 🧪 Simulation Environment
- **Platform**: Physics-driven simulation using [InternUtopia](https://github.com/InternRobotics/InternUtopia)
- **Robot**: Unitree H1 humanoid robot model
- **Tasks**: Instruction-based navigation in richly furnished indoor scenes
- **Evaluation**: Based on success rate, path efficiency, and instruction compliance
### 🔍 Evaluation Metrics
- **Success Rate (SR)**: Proportion of episodes in which the agent reaches the goal location within 3 m
- **SPL**: Success weighted by Path Length
- **Trajectory Length (TL)**: Total length of the trajectory (m)
- **Navigation Error (NE)**: Euclidean distance between the agent's final position and the goal (m)
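To make these definitions concrete, here is a minimal sketch of how such metrics are commonly computed in VLN evaluations (standard formulations; the official evaluator, its success radius, and its stopping rules may differ):

```python
# Illustrative metric computation over N evaluation episodes (not the official evaluator).
import numpy as np

def nav_metrics(final_pos, goal_pos, path_len, shortest_len, success_radius=3.0):
    """final_pos, goal_pos: (N, D) positions; path_len, shortest_len: (N,) lengths in metres."""
    ne = np.linalg.norm(final_pos - goal_pos, axis=-1)      # Navigation Error per episode (m)
    success = (ne <= success_radius).astype(float)          # success = stopping within 3 m of the goal
    sr = success.mean()                                     # Success Rate
    # SPL: success weighted by path length, S_i * l_i / max(p_i, l_i), averaged over episodes
    spl = np.mean(success * shortest_len / np.maximum(path_len, shortest_len))
    tl = path_len.mean()                                    # Trajectory Length (m)
    return {"SR": sr, "SPL": spl, "TL": tl, "NE": ne.mean()}
```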
### 🚨 Challenges to Solve
- ✅ Integrating vision, language, and control into a single inference pipeline
- ✅ Overcoming sensor instability and actuation delay from simulated humanoid locomotion
- ✅ Ensuring real-time, smooth, and goal-directed behavior under physics constraints
This track pushes the boundary of embodied AI by combining **natural language understanding**, **3D vision**, and **realistic robot control**, fostering solutions ready for future real-world deployments.
- **Organizer**: Shanghai AI Lab
- **Co-organizers**: ManyCore Tech, University of Adelaide
- **Data Contributions**: Online test data provided by Prof. Qi Wu's team; Kujiale scenes provided by ManyCore Tech
- **Sponsors** (in no particular order): ByteDance, HUAWEI, ENGINEAI, HONOR, ModelScope, Alibaba Cloud, AGILEX, DOBOT