OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport
Qin Ren¹ ★, Yifan Wang¹, Ruogu Fang², Haibin Ling¹, Chenyu You¹ ★
¹ Stony Brook University
² University of Florida
★ Corresponding authors
Welcome to the official repository of OTSurv, a novel framework that integrates Multiple Instance Learning (MIL) with Heterogeneity-aware Optimal Transport (OT) to tackle the challenges of survival prediction in medical imaging and clinical data.
📍 To be presented at MICCAI 2025
🧠 Focus: Survival Analysis · Multiple Instance Learning · Optimal Transport
OTSurv/
├── checkpoints/
│ ├── model_blca_fold0.pth
│ ├── model_blca_fold1.pth
│ └── ...
│
├── data/
│ ├── tcga_blca/
│ ├── tcga_brca/
│ ├── tcga_coadread/
│ ├── tcga_kirc/
│ ├── tcga_luad/
│ └── tcga_stad/
│
├── result/
│ ├── exp_otsurv_test/
│ ├── exp_otsurv_train/
│ └── visualization/
│
├── src/
│ ├── scripts/
│ ├── analysis/
│ └── ...
│
└── docs/
  ├── OTSurv_main.png
  └── OTSurv_heatmap.png
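As a small illustration of the layout above, the following sketch groups checkpoint files by cohort and fold. The naming pattern `model_<cohort>_fold<k>.pth` is an assumption inferred only from the two file names shown in the tree; the helper `group_checkpoints` is hypothetical and not part of the repository.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred from "model_blca_fold0.pth" and "model_blca_fold1.pth".
CKPT_PATTERN = re.compile(r"^model_(?P<cohort>[a-z0-9]+)_fold(?P<fold>\d+)\.pth$")


def group_checkpoints(filenames):
    """Map cohort -> sorted fold indices for names matching the assumed pattern."""
    folds = defaultdict(list)
    for name in filenames:
        match = CKPT_PATTERN.match(name)
        if match:  # names that do not follow the pattern are skipped
            folds[match.group("cohort")].append(int(match.group("fold")))
    return {cohort: sorted(indices) for cohort, indices in folds.items()}


print(group_checkpoints(["model_blca_fold1.pth", "model_blca_fold0.pth", "notes.txt"]))
# {'blca': [0, 1]}
```

Passing `os.listdir("checkpoints")` instead of the literal list would give a quick overview of which folds are present for each cohort.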
- H5 Format: Features are stored in `.h5` files (directories ending with `feats_h5/`)
For patch feature extraction, please refer to CLAM.
You can download the preprocessed features from this link.
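To check that downloaded features landed where expected, a minimal sketch such as the one below can list every directory whose name ends with `feats_h5` and the `.h5` files inside it. The function name `find_feature_files` is hypothetical, and the sketch assumes only the directory suffix stated above, not any particular nesting depth or file naming scheme.

```python
from pathlib import Path


def find_feature_files(data_root):
    """Return {relative feature dir: sorted .h5 file names} under data_root."""
    root = Path(data_root)
    found = {}
    # rglob matches any entry, at any depth, whose name ends with "feats_h5".
    for directory in sorted(root.rglob("*feats_h5")):
        if directory.is_dir():
            key = directory.relative_to(root).as_posix()
            found[key] = sorted(p.name for p in directory.glob("*.h5"))
    return found
```

For example, `find_feature_files("data")` run from the repository root would return one entry per feature directory, which makes it easy to spot a cohort with no `.h5` files before starting training.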
- Python 3.8+
- GPU or CPU-only
- Conda package manager
# Clone the repository
git clone https://github.com/Y-Research-SBU/OTSurv.git
cd OTSurv
# Create conda environment
conda env create -f env.yaml
conda activate otsurv
# Training results will be saved under result/exp_otsurv_train
cd src
# Train on all datasets
bash scripts/train_otsurv.sh
# Train on TCGA-BLCA dataset specifically
bash scripts/train_blca.sh
You can download all trained checkpoints from this link.
# Test results will be saved under result/exp_otsurv_test
cd src
# Test on all datasets
bash scripts/test_otsurv.sh
# Test on TCGA-BLCA dataset specifically
bash scripts/test_blca.sh
cd src
# Calculate performance metrics
python analysis/calculate_CIndex_mean_std.py
# Generated figures will be saved under result/visualization
cd src
# Generate survival curves
python analysis/plot_survival_curv.py
The survival curve for TCGA-BLCA looks like this:
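For readers unfamiliar with how survival curves of this kind are usually estimated, here is a minimal Kaplan-Meier sketch in plain Python. Whether `plot_survival_curv.py` uses exactly this estimator is an assumption; the function `kaplan_meier` and the toy data are illustrative only.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  : follow-up durations
    events : 1 if the event was observed, 0 if the sample was censored
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored samples both leave the risk set
    return curve


toy_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in toy_curve])
# [(5, 0.8), (8, 0.6), (12, 0.3)]
```

Computing one such curve per predicted risk group (for example, high versus low risk) and plotting them together is a common way to visualize how well a model separates patients.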
Below are the C-Index performance results of OTSurv across different cancer types:
| Cancer Type | Mean C-Index | Std Dev |
|---|---|---|
| BRCA | 0.621 | ±0.071 |
| BLCA | 0.637 | ±0.065 |
| LUAD | 0.638 | ±0.077 |
| STAD | 0.565 | ±0.057 |
| COADREAD | 0.667 | ±0.111 |
| KIRC | 0.750 | ±0.149 |
Overall Performance: Average C-Index across all datasets is 0.646
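The overall figure can be reproduced from the table as an unweighted mean over the six cancer types. The snippet below is only an arithmetic check on the reported values; treating the overall number as an unweighted mean is an assumption that happens to match.

```python
from statistics import mean

# Mean C-Index per cancer type, copied from the table above.
mean_c_index = {
    "BRCA": 0.621,
    "BLCA": 0.637,
    "LUAD": 0.638,
    "STAD": 0.565,
    "COADREAD": 0.667,
    "KIRC": 0.750,
}

overall = mean(mean_c_index.values())
print(f"{overall:.3f}")  # 0.646
```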
💡 Note: The C-Index (Concordance Index) is a standard performance metric in survival analysis. A value of 0.5 corresponds to random ranking, and values closer to 1.0 indicate better prediction performance.
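To make the metric concrete, the following is a minimal sketch of Harrell's C-Index in plain Python. It is not taken from this repository, and the evaluation code here may rely on a library implementation instead; the function name `concordance_index` and the toy inputs are illustrative assumptions.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs in which the higher-risk sample fails first.

    times  : follow-up durations
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1]))
# 0.8
```

In the toy example five pairs are comparable and four are ranked correctly, giving 0.8. The only miss is the sample with time 4 and risk 0.4, which fails before the censored sample with time 6 and risk 0.5.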
If you find this work useful, please cite our paper:
@misc{ren2025otsurvnovelmultipleinstance,
title={OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport},
author={Qin Ren and Yifan Wang and Ruogu Fang and Haibin Ling and Chenyu You},
year={2025},
eprint={2506.20741},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.20741},
}
📝 Note: This paper has been accepted at MICCAI 2025. The citation details will be updated once the paper is officially published.
This work builds upon the excellent research from:
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License - see the LICENSE.md file for details.
We welcome contributions to OTSurv! If you have suggestions, bug reports, or want to add features or experiments, feel free to:
- 🐞 Submit an issue
- 🔧 Open a pull request
- 💬 Start a discussion
⭐ If you find this repository helpful, please consider starring it! ⭐

