WHALES

A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving


Introduction

WHALES (Wireless enHanced Autonomous vehicles with Large number of Engaged agentS) is a CARLA-based cooperative perception dataset averaging 8.4 agents per sequence. It captures diverse viewpoints, agent behaviors, and multitask interactions to study scheduling, perception, and planning under realistic multi-agent constraints.

News

  • 2025-11-21 – Released WHALES dataset v1.0 with cooperative scheduling benchmarks.
  • 2025-06-17 – WHALES was accepted at IROS 2025!


Highlights

  • Largest agent count: 8.4 agents per scene with synchronized LiDAR-camera suites.
  • Rich annotations: 2.01M 3D boxes plus full agent-behavior recording.
  • Scheduling-ready: Provides perception, planning, and communication metadata for agent selection research.
  • Plug-in friendly: Ships with mmdetection3d-compatible configs and hooks for custom schedulers.

Dataset Overview

Comparison with Existing Benchmarks

| Dataset | Year | Real/Simulated | V2X | Image | Point Cloud | 3D Annotations | Classes | Avg. Agents |
|---|---|---|---|---|---|---|---|---|
| KITTI | 2012 | Real | No | 15k | 15k | 200k | 8 | 1 |
| nuScenes | 2019 | Real | No | 1.4M | 400k | 1.4M | 23 | 1 |
| DAIR-V2X | 2021 | Real | V2V&I | 39k | 39k | 464k | 10 | 2 |
| V2X-Sim | 2021 | Simulated | V2V&I | 0 | 10k | 26.6k | 2 | 2 |
| OPV2V | 2022 | Simulated | V2V | 44k | 11k | 230k | 1 | 3 |
| DOLPHINS | 2022 | Simulated | V2V&I | 42k | 42k | 293k | 3 | 3 |
| V2V4Real | 2023 | Real | V2V | 40k | 20k | 240k | 5 | 2 |
| WHALES (Ours) | 2024 | Simulated | V2V&I | 70k | 17k | 2.01M | 3 | 8.4 |

Agent Types

| Location | Category | Sensors | Planning & Control | Tasks | Spawning |
|---|---|---|---|---|---|
| On-road | Uncontrolled CAV | LiDAR ×1 + Camera ×4 | CARLA autopilot | Perception | Random / deterministic |
| On-road | Controlled CAV | LiDAR ×1 + Camera ×4 | RL policy | Perception & planning | Random / deterministic |
| Roadside | RSU | LiDAR ×1 + Camera ×4 | RL policy | Perception & planning | Static |
| Anywhere | Obstacle agent | — | CARLA autopilot | — | Random |

Getting Started

Installation

  1. Clone the repository:
    git clone https://github.com/chensiweiTHU/WHALES.git
  2. Create and activate a Conda environment:
    conda create -n whales python=3.10 -y
    conda activate whales
  3. Install WHALES:
    pip install -e .
  4. Install mmdetection3d==0.17.1 following the official guide (a quick environment check is sketched after this list).
  5. (Optional) Install OpenCOOD for additional cooperative baselines.
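
A minimal sketch for verifying the environment; it only checks that the packages from the steps above import cleanly:

```python
# Minimal environment check for the installation steps above.
import torch
import mmdet3d

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmdetection3d:", mmdet3d.__version__)  # expected: 0.17.1 per the steps above
```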

Data Preparation

  1. Download the full dataset from Google Drive: Download Whales.
  2. Place extracted files under ./data/whales/.
  3. Preprocess:
    python tools/create_data.py whales --root-path ./data/whales/ --out-dir ./data/whales/ --extra-tag whales
    This emits the following under ./data/whales/ (a quick inspection sketch follows this list):
    • whales_infos_{train,val}.pkl — LiDAR info PKLs for WhalesDataset.
    • whales_infos_{train,val}_mono3d.coco.json — per-camera mono3D COCO files for WhalesMonoDataset (cam-only training).
    • whales_dbinfos_train.pkl + whales_gt_database/ — GT-sampling database used by LiDAR configs' augmentation step.
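
To confirm preprocessing succeeded, the emitted info PKL can be opened directly. A minimal sketch, assuming the standard mmdetection3d info layout (a dict with 'infos' and 'metadata' keys); the exact WHALES field names may differ:

```python
# Peek at the generated LiDAR info PKL. Assumes the standard
# mmdetection3d layout (dict with 'infos'/'metadata'); WHALES-specific
# fields may differ -- adjust the keys accordingly.
import pickle

with open("data/whales/whales_infos_val.pkl", "rb") as f:
    data = pickle.load(f)

infos = data["infos"] if isinstance(data, dict) and "infos" in data else data
print("num samples:", len(infos))
print("first sample keys:", sorted(infos[0].keys()))  # e.g. lidar path, token, GT boxes
```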

Training & Evaluation

Configs are organized as:

  • ./configs/_base_/ — shared dataset, model, and schedule bases.
  • ./configs/standalone/ — single-agent baselines (PointPillars, SECOND, CenterPoint, FCOS3D, VoxelNeXt, etc. on LiDAR and monocular 3D).
  • ./configs/cooperative/ — V2X cooperative-perception recipes (PointPillars, VoxelNeXt, BEVFusion, FCooper, V2VNet, V2X-ViT, OPV2V, FFNet, plus the scheduling studies).

Pick any leaf config under those trees and run:

  • Training
    bash tools/dist_train.sh <config>.py <gpu_num>
  • Testing
    bash tools/dist_test.sh <config>.py <model>.pth <gpu_num> --eval bbox

Both WhalesDataset (LiDAR) and WhalesMonoDataset (monocular 3D) are registered; the COCO JSONs emitted by the preprocessing step drive the mono3D path, while the info PKLs drive the LiDAR path. Metrics: mAP and NDS. A programmatic way to build these datasets is sketched below.
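
For quick interactive checks without the distributed launchers, the registered datasets can be built through the standard mmdetection3d 0.17.x Python API. A minimal sketch; the config filename is a hypothetical placeholder for any leaf config from the trees above:

```python
# Build a registered WHALES dataset from a config (mmdetection3d 0.17.x API).
# The config path is a placeholder -- substitute any leaf config from
# ./configs/standalone/ or ./configs/cooperative/.
from mmcv import Config
from mmdet3d.datasets import build_dataset

cfg = Config.fromfile("configs/standalone/pointpillars_whales.py")  # hypothetical name
dataset = build_dataset(cfg.data.val)  # resolves to WhalesDataset / WhalesMonoDataset

print(type(dataset).__name__, "with", len(dataset), "samples")
```

Note that custom plugin datasets typically need the plugin module imported first (e.g., by importing mmdet3d_plugin or via the config's import hooks, depending on how this repository registers it) before build_dataset can resolve them.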

Visualization


tools/misc/visualize_whales.py can render from all three data representations (raw frame_info.json, info PKL, mono3D COCO):

# Raw CARLA frame_info.json: reconstruct ego-frame boxes + overlay on the 4 cameras + BEV.
python tools/misc/visualize_whales.py frame_info \
    --path data/whales/<scene>/<frame>/frame_info.json --agent vehicle0

# Single entry from the info PKL (one agent-frame).
python tools/misc/visualize_whales.py pkl \
    --path data/whales/whales_infos_val.pkl --token <scene>_<frame>_<agent>

# Batched renders: 2x2 camera grid alongside the BEV, one frame per scene.
python tools/misc/visualize_whales.py pkl_grid \
    --pkls data/whales/whales_infos_{train,val}.pkl \
    --num-per-pkl 20 --one-per-scene --out whales_vis/

# Mono3D COCO renders with 2D bbox + 3D wireframe per annotation.
python tools/misc/visualize_whales.py coco \
    --path data/whales/whales_infos_val_mono3d.coco.json --image-id <image_id>
python tools/misc/visualize_whales.py coco_batch \
    --path data/whales/whales_infos_val_mono3d.coco.json \
    --num-tokens 20 --one-per-scene --out whales_vis_coco/

Scheduling Algorithms

Agent scheduling pipelines live in ./mmdet3d_plugin/datasets/pipelines/cooperative_perception.py. CAHS prioritizes collaborators by their historical coverage and predicted perception gains.

(Figure: CAHS overview)
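
As an illustration only (not the repository's implementation; all names, fields, and the weighting below are hypothetical), a coverage-plus-gain priority rule of the kind CAHS describes could look like:

```python
# Illustrative sketch of a coverage/gain scheduling rule in the spirit
# of CAHS. Everything here is hypothetical; the actual scheduler lives in
# mmdet3d_plugin/datasets/pipelines/cooperative_perception.py.
from dataclasses import dataclass

@dataclass
class Candidate:
    agent_id: str
    historical_coverage: float  # how well this agent covered past ego blind spots
    predicted_gain: float       # estimated perception gain if scheduled this frame

def schedule(candidates: list[Candidate], k: int = 1, alpha: float = 0.5) -> list[str]:
    """Pick the top-k collaborators by a blend of history and predicted gain."""
    ranked = sorted(
        candidates,
        key=lambda c: alpha * c.historical_coverage + (1 - alpha) * c.predicted_gain,
        reverse=True,
    )
    return [c.agent_id for c in ranked[:k]]

# Example: schedule one collaborator out of three candidates.
cands = [Candidate("cav_1", 0.7, 0.2), Candidate("cav_2", 0.4, 0.9), Candidate("rsu_0", 0.6, 0.6)]
print(schedule(cands, k=1))  # -> ['cav_2'] with alpha = 0.5
```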

Experimental Results

All numbers below are reported as 50m / 100m, the two evaluation ranges used by the WHALES protocol (per-class radial distance from ego).

Stand-alone 3D Object Detection

| Method | AP_Veh ↑ | AP_Ped ↑ | AP_Cyc ↑ | mAP ↑ |
|---|---|---|---|---|
| PointPillars | 67.1 / 41.5 | 38.0 / 6.3 | 37.3 / 11.6 | 47.5 / 19.8 |
| SECOND | 58.5 / 38.8 | 27.1 / 12.1 | 24.1 / 12.9 | 36.6 / 21.2 |
| RegNet | 66.9 / 42.3 | 38.7 / 8.4 | 32.9 / 11.7 | 46.2 / 20.8 |
| VoxelNeXt | 64.7 / 42.3 | 52.2 / 27.4 | 35.9 / 9.0 | 50.9 / 26.2 |

Cooperative 3D Object Detection

| Method | AP_Veh ↑ | AP_Ped ↑ | AP_Cyc ↑ | mAP ↑ |
|---|---|---|---|---|
| No Fusion | 67.1 / 41.5 | 38.0 / 6.3 | 37.3 / 11.6 | 47.5 / 19.8 |
| F-Cooper | 75.4 / 52.8 | 50.1 / 9.1 | 44.7 / 20.4 | 56.8 / 27.4 |
| Raw-level Fusion | 71.3 / 48.9 | 38.1 / 8.5 | 40.7 / 16.3 | 50.0 / 24.6 |
| VoxelNeXt | 71.5 / 50.6 | 60.1 / 35.4 | 47.6 / 21.9 | 59.7 / 35.9 |

Scheduling Studies — Single-Agent Policies

mAP at 50m / 100m. Base detector: VoxelNeXt (LiDAR cooperative). Rows = inference-time policy, columns = training-time policy.

| Inference \ Training | No Fusion | Closest First | Single Random | Multiple Random | Full Communication |
|---|---|---|---|---|---|
| No Fusion (Baseline) | 50.9 / 26.2 | 50.9 / 23.3 | 51.3 / 25.3 | 50.3 / 22.9 | 45.6 / 18.8 |
| Closest First | 39.9 / 20.3 | 58.4 / 30.2 | 58.3 / 32.6 | 57.3 / 30.5 | 55.4 / 10.8 |
| Single Random | 43.3 / 22.8 | 57.9 / 31.0 | 58.4 / 33.3 | 57.7 / 31.4 | 55.0 / 14.6 |
| MASS | 55.5 / 11.0 | 58.8 / 33.7 | 58.9 / 34.0 | 57.3 / 32.3 | 54.1 / 27.4 |
| CAHS (Proposed) | 56.1 / 29.6 | 62.5 / 31.7 | 62.7 / 35.9 | 58.3 / 32.6 | 59.9 / 31.0 |

Scheduling Studies — Multi-Agent Policies

mAP at 50m / 100m. Base detector: VoxelNeXt (LiDAR cooperative). Same axes as above.

| Inference \ Training | No Fusion | Closest First | Single Random | Multiple Random | Full Communication |
|---|---|---|---|---|---|
| Multiple Random | 34.5 / 16.9 | 60.7 / 35.1 | 61.2 / 37.1 | 61.4 / 36.4 | 58.8 / 12.9 |
| Full Communication | 29.1 / 10.5 | 63.7 / 38.4 | 63.7 / 39.1 | 64.0 / 41.1 | 65.1 / 39.2 |
| MASS | 54.6 / 13.4 | 64.9 / 39.7 | 65.0 / 40.5 | 63.7 / 40.4 | 63.5 / 36.4 |
| CAHS (Proposed) | 53.7 / 14.2 | 65.3 / 40.1 | 65.1 / 42.0 | 63.9 / 40.6 | 65.2 / 39.2 |

Roadmap

  • Publish dataset and checkpoints on HuggingFace.

Citation

@INPROCEEDINGS{11247472,
  author    = {Wang, Yinsong Richard and Chen, Siwei and Song, Ziyi and Zhou, Sheng},
  title     = {{WHALES: A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving}},
  booktitle = {2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year      = {2025},
  pages     = {20487-20493},
  keywords  = {Wireless communication; Three-dimensional displays; Scalability; Whales; Benchmark testing; Metadata; Scheduling; Vehicle dynamics; Vehicle-to-everything; Autonomous vehicles},
  doi       = {10.1109/IROS60139.2025.11247472}
}

About

This is the official repository of WHALES.
