This repository supports the research study "An Interpretable AI System for Stratifying High-risk Breast Lesions to Reduce False-positive MRI Diagnoses". For inquiries regarding this work, please feel free to contact our team.
- Reduces unnecessary biopsies by improving specificity (89.9% vs radiologists' 49.1%, p=0.014)
- Decreases inter-reader variability by 24.5% across experience levels
- Lowers false-positive rates by 27.3% in multicenter validation
- Provides refined risk stratification of BI-RADS 4 lesions into clinically actionable 4A/4B/4C subcategories
### Advanced Architecture
- Foundation model leveraging multiphase DCE-MRI spatiotemporal dynamics
### Rigorous Validation
- Trained on 2,803 lesions from 2,686 patients
- AUC 0.893-0.930 across external & prospective tests
- Outperformed radiologists in NPV (92.1% vs 84.3%)
### Clinical Integration
- Interpretable Grad-CAM visualizations
- Compatible with standard PACS workflows
- Improves both senior and junior radiologists' accuracy
- Python 3.9+ (tested with Python 3.9.18)
- CUDA-enabled GPU (recommended)
- Git
#### Option 1: Using Conda with environment.yml (Recommended)

```bash
# Clone the repository
git clone https://github.com/zhenweishi/BL4AS.git
cd BL4AS

# Create and activate conda environment from file
conda env create -f environment.yml
conda activate bl4as
```

#### Option 2: Using Conda with manual setup

```bash
# Clone the repository
git clone https://github.com/zhenweishi/BL4AS.git
cd BL4AS

# Create and activate conda environment
conda create -n bl4as python=3.9.18
conda activate bl4as

# Install dependencies
pip install -r requirements.txt
```

#### Option 3: Using pip with custom Python path

```bash
# Clone the repository
git clone https://github.com/zhenweishi/BL4AS.git
cd BL4AS
```

Before running any models, you need to preprocess your data:
```bash
# Navigate to the data preprocessing directory
cd examples/data

# Run the preprocessing pipeline
conda activate bl4as
python preprocess.py
```

The preprocessing pipeline performs three main steps:
- Enhancement Map Generation: Creates subtraction images (C2-C0, C5-C2) to highlight contrast enhancement patterns
- ROI Extraction: Uses connected component filtering (threshold: 15 pixels) and bounding box detection to extract tumor regions
- Output Generation: Creates ROI-extracted images with standardized resizing (default: 48×48×48) and generates data mapping files
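As a rough illustration of the first two steps, the sketch below builds a subtraction map and crops the tumor ROI after connected-component filtering using NumPy/SciPy. The function names (`enhancement_map`, `extract_roi`) are illustrative, not the repository's API; see `examples/data/preprocess.py` for the actual pipeline.

```python
import numpy as np
from scipy import ndimage

def enhancement_map(post, pre):
    """Subtraction image highlighting contrast uptake (e.g. C2 - C0)."""
    return post.astype(np.float32) - pre.astype(np.float32)

def extract_roi(image, mask, min_size=15):
    """Drop connected components smaller than min_size voxels, then crop
    image and mask to the bounding box of what remains."""
    labeled, n = ndimage.label(mask > 0)
    if n == 0:
        return None
    # Voxel count of each labeled component (labels are 1..n)
    sizes = np.asarray(ndimage.sum(mask > 0, labeled, range(1, n + 1)))
    keep = np.isin(labeled, 1 + np.flatnonzero(sizes >= min_size))
    if not keep.any():
        return None
    box = ndimage.find_objects(keep.astype(np.int8))[0]  # bounding-box slices
    return image[box], keep[box]
```

The `min_size=15` default mirrors the 15-pixel threshold described above.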
Required Data Structure: Your data should be organized as follows:
```text
examples/data/
├── C0/              # Pre-contrast phase
│   ├── image/       # Original 3D medical images (.nii.gz)
│   └── tumor_mask/  # Tumor segmentation masks
├── C2/              # Peak enhancement phase
│   ├── image/
│   └── tumor_mask/
├── C5/              # Delayed enhancement phase
│   ├── image/
│   └── tumor_mask/
├── table.csv        # Patient metadata (ID, is_malignant, filename)
├── seg_demo.json    # Segmentation task configuration
└── preprocess.py    # Main preprocessing script
```
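A minimal hypothetical `table.csv` illustrating the three expected columns (the patient IDs and filenames here are placeholders, not shipped demo data):

```csv
ID,is_malignant,filename
P1,0,P1.nii.gz
P2,1,P2.nii.gz
```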
After preprocessing, additional directories are created:

```text
├── C2-C0/                             # Peak enhancement maps (C2 minus C0)
├── C5-C2/                             # Washout maps (C5 minus C2)
├── */image@all_roi_resize@cc15/       # ROI-extracted images
├── */tumor_mask@all_roi_resize@cc15/  # ROI-extracted masks
└── cls_demo.csv                       # Generated classification data mapping
```
For detailed preprocessing instructions, see: examples/data/README.md
For studies involving data from different imaging centers, BL4AS includes robust preprocessing steps:
- Scanner Independence: Uses relative intensity differences (subtraction images) rather than absolute values
- ROI Standardization: Connected component filtering (removes regions <15 pixels) and consistent ROI extraction methods
- Cross-Center Compatibility: All ROI regions resized to consistent 48×48×48 dimensions
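The fixed-size resampling step could be sketched with `scipy.ndimage.zoom` as below; `resize_roi` is an illustrative helper under assumed trilinear interpolation, not the repository's implementation (the preprocessing script handles this internally).

```python
import numpy as np
from scipy.ndimage import zoom

def resize_roi(roi, out_shape=(48, 48, 48)):
    """Resample a cropped ROI to a fixed grid so inputs from different
    scanners share the same dimensions (order=1: trilinear interpolation)."""
    factors = [o / s for o, s in zip(out_shape, roi.shape)]
    return zoom(roi.astype(np.float32), factors, order=1)
```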
### Test Segmentation Performance

```bash
conda activate bl4as
python -u main.py examples/configs/seg_test.yaml
```

Generated Output Structure:
```text
runs/seg_test/
├── test_seg/          # Predicted segmentation masks
│   ├── P1.nii.gz      # Patient 1 predicted tumor mask
│   ├── P2.nii.gz      # Patient 2 predicted tumor mask
│   ├── P3.nii.gz      # Patient 3 predicted tumor mask
│   ├── P4.nii.gz      # Patient 4 predicted tumor mask
│   └── P5.nii.gz      # Patient 5 predicted tumor mask
├── test_results.json  # Quantitative evaluation metrics
├── cfgs/              # Configuration file backups
├── logs/              # Training/testing logs
└── ckpts/             # Model checkpoints
```
Performance Metrics (logged in test_results.json):
- Dice coefficient: ~0.967, IoU: ~0.937, Hausdorff distance (HD): ~12.9, Sensitivity: ~0.970, Precision: ~0.966
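For reference, Dice and IoU can be computed directly from a pair of binary masks as below. This is a self-contained sketch of the standard definitions, not the repository's evaluation code.

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice coefficient and IoU (Jaccard index) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    return dice, inter / union
```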
### Train Custom Segmentation Model

```bash
python -u main.py examples/configs/seg_train.yaml
```

Generated Training Output:
```text
runs/seg_train/
├── ckpts/                      # Model checkpoints
│   ├── best_model.pth.tar      # Best performing model
│   └── checkpoint_0000.pth.tar # Latest checkpoint
├── test_results.json           # Final evaluation metrics
├── cfgs/                       # Configuration file backups
└── logs/                       # TensorBoard training logs
    └── [timestamp]_[hostname]/ # Training progress visualization
```
### Test Classification Performance

```bash
conda activate bl4as
python -u main.py examples/configs/cls_test.yaml
```

Generated Output Structure:
```text
runs/cls_test/
├── test_metrics.csv    # 5-fold cross-validation results
├── pkl/                # Detailed predictions and metadata
│   ├── output_*.pkl    # Model predictions for each fold
│   ├── target_*.pkl    # Ground truth labels for each fold
│   ├── filename_*.pkl  # Patient filenames for each fold
│   └── center_*.pkl    # Center information for each fold
├── cfgs/               # Configuration file backups
├── logs/               # Training/testing logs with TensorBoard events
└── ckpts/              # Model checkpoints
```
Performance Metrics (saved in test_metrics.csv):
- 5-fold cross-validation with AUROC: 1.000, Accuracy: 1.000, Sensitivity: 1.000, Specificity: 1.000, F1-Score: 1.000
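As a reminder of what AUROC measures, the sketch below computes it from the Mann-Whitney U statistic: the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. This is an illustrative definition, not the code that produced `test_metrics.csv`.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC via pairwise comparison of positive vs negative scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```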
### Train Custom Classification Model

```bash
python -u main.py examples/configs/cls_train.yaml
```

Generated Training Output:
```text
runs/cls_train/
├── ckpts/                      # Model checkpoints per fold
│   ├── best_fold0.pth.tar      # Best model for fold 0
│   ├── best_fold1.pth.tar      # Best model for fold 1
│   ├── best_fold2.pth.tar      # Best model for fold 2
│   └── best_fold3.pth.tar      # Best model for fold 3
├── cfgs/                       # Configuration file backups
└── logs/                       # TensorBoard training logs
    └── [timestamp]_[hostname]/ # Training progress per fold
```
BL4AS uses MONAI's powerful configuration system. All training and testing parameters are controlled via YAML files in examples/configs/:
- `seg_train.yaml` / `seg_test.yaml`: Segmentation task configurations
- `cls_train.yaml` / `cls_test.yaml`: Classification task configurations
For detailed configuration explanations, see: examples/configs/README.md
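For orientation, MONAI-style configs instantiate objects dynamically via `_target_` and reference other entries with `@` (and evaluate expressions with `$`). The fragment below is a hypothetical illustration of that syntax, not one of the repository's actual config files:

```yaml
network:
  _target_: monai.networks.nets.DenseNet121  # class instantiated dynamically
  spatial_dims: 3
  in_channels: 2
  out_channels: 2
optimizer:
  _target_: torch.optim.Adam
  params: "$@network.parameters()"  # '$' evaluates, '@' references another entry
  lr: 0.0001
```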
Segmentation Output Files:
- `test_seg/*.nii.gz`: 3D predicted tumor masks in NIfTI format, ready for clinical visualization
- `test_results.json`: Comprehensive evaluation metrics including Dice, IoU, Hausdorff distance, PPV, and sensitivity
- `logs/`: TensorBoard events for visualizing training/testing progress
Classification Output Files:
- `test_metrics.csv`: Cross-validation summary with AUC, accuracy, sensitivity, and specificity per fold
- `pkl/output_*.pkl`: Raw model predictions (probabilities) for detailed analysis
- `pkl/target_*.pkl`: Ground truth labels for validation
- `pkl/filename_*.pkl`: Patient identifiers for traceability
- `cfgs/`: Automatically saved configuration files for reproducibility
For Training Tasks: Checkpoint files (best_model.pth.tar, best_fold*.pth.tar) are saved in runs/*/ckpts/ for model deployment
For detailed analysis of classification results, use the provided demo script:
```bash
conda activate bl4as
python scripts/demo.py
```

Generated Analysis Files:

`classification_detailed_results.csv`: Complete results with fold information:

```csv
Fold,ID,Label,Probability
0,P1,0,0.1526
0,P2,0,0.0957
1,P1,0,0.1526
...
```
Script Output:
- Per-fold performance metrics (AUC, Accuracy)
- Overall statistics (patient counts, probabilities)
- Comprehensive performance evaluation across all metrics
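If you want to extend the demo's analysis, the per-fold CSV can be grouped with the standard library alone. `per_fold_mean_prob` below is a hypothetical helper for illustration, not part of `scripts/demo.py`; it assumes the column names shown in the sample above.

```python
import csv
import io
from collections import defaultdict

def per_fold_mean_prob(csv_text):
    """Group rows of the detailed-results CSV by fold and return the
    mean predicted probability for each fold."""
    folds = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        folds[row["Fold"]].append(float(row["Probability"]))
    return {fold: sum(p) / len(p) for fold, p in folds.items()}
```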
Common Issues:
- File not found errors: Ensure preprocessing has been completed
- CUDA memory errors: Reduce batch size in configuration files
- Import errors: Verify all dependencies are installed correctly
- Configuration errors: Check YAML syntax and parameter references
The examples/ directory contains comprehensive guides:

- `examples/configs/README.md`: Detailed configuration system documentation
  - MONAI configuration syntax and features
  - Multi-fold cross-validation setup
  - Parameter tuning guidelines
  - Dynamic object instantiation examples
- `examples/data/README.md`: Complete preprocessing pipeline guide
  - Multi-phase contrast imaging data structure
  - ROI extraction methodologies
  - Enhancement map generation
  - Multi-center preprocessing considerations
- Dr. Zhenwei Shi 1, 2
- MSc. Zhitao Wei 1, 2
- MD. Yanting Liang 1, 2
- Dr. Chu Han 1, 2
- MD. Changhong Liang 1, 2
- MD. Zaiyi Liu 1, 2
1 Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, China
2 Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, China
Full code release pending publication under review.

For collaboration inquiries: Contact Email

