
A BI-RADS 4 Lesions Analysis System (BL4AS)

This repository supports the research study "An Interpretable AI System for Stratifying High-risk Breast Lesions to Reduce False-positive MRI Diagnoses". For inquiries regarding this work, please feel free to contact our team.

Clinical Impact

  • Reduces unnecessary biopsies by improving specificity (89.9% vs radiologists' 49.1%, p=0.014)
  • Decreases inter-reader variability by 24.5% across experience levels
  • Lowers false-positive rates by 27.3% in multicenter validation
  • Provides refined risk stratification of BI-RADS 4 lesions into clinically actionable 4A/4B/4C subcategories

Key Features

✔ Advanced Architecture
Foundation model leveraging multiphase DCE-MRI spatiotemporal dynamics

✔ Rigorous Validation

  • Trained on 2,803 lesions from 2,686 patients
  • AUC 0.893-0.930 across external & prospective tests
  • Outperformed radiologists in NPV (92.1% vs 84.3%)

✔ Clinical Integration

  • Interpretable Grad-CAM visualizations
  • Compatible with standard PACS workflows
  • Improves both senior and junior radiologists' accuracy

Usage

🔧 Environment Setup

Prerequisites

  • Python 3.9+ (tested with Python 3.9.18)
  • CUDA-enabled GPU (recommended)
  • Git

Installation Steps

Option 1: Using Conda with environment.yml (Recommended)

# Clone the repository
git clone https://github.com/zhenweishi/BL4AS.git
cd BL4AS

# Create and activate conda environment from file
conda env create -f environment.yml
conda activate bl4as

Option 2: Using Conda with manual setup

# Clone the repository
git clone https://github.com/zhenweishi/BL4AS.git
cd BL4AS

# Create and activate conda environment
conda create -n bl4as python==3.9.18
conda activate bl4as

# Install dependencies
pip install -r requirements.txt

Option 3: Using pip with a virtual environment

# Clone the repository
git clone https://github.com/zhenweishi/BL4AS.git
cd BL4AS

# Create and activate a virtual environment, then install dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

📊 Data Preparation

Before running any models, you need to preprocess your data:

# Navigate to data preprocessing directory
cd examples/data

# Run preprocessing pipeline
conda activate bl4as
python preprocess.py

The preprocessing pipeline performs three main steps:

  1. Enhancement Map Generation: Creates subtraction images (C2-C0, C5-C2) to highlight contrast enhancement patterns
  2. ROI Extraction: Uses connected component filtering (threshold: 15 pixels) and bounding box detection to extract tumor regions
  3. Output Generation: Creates ROI-extracted images with standardized resizing (default: 48×48×48) and generates data mapping files
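
The enhancement-map step is voxel-wise arithmetic on the co-registered phase volumes. A minimal sketch in plain NumPy (toy arrays stand in for the real .nii.gz volumes; this is an illustration, not the repository's preprocess.py):

```python
import numpy as np

def enhancement_map(post: np.ndarray, pre: np.ndarray) -> np.ndarray:
    """Voxel-wise subtraction image (e.g. C2-C0) highlighting contrast uptake."""
    return post.astype(np.float32) - pre.astype(np.float32)

# Toy 3D volumes standing in for the pre-contrast (C0) and peak (C2) phases.
c0 = np.zeros((4, 4, 4), dtype=np.int16)
c2 = np.full((4, 4, 4), 100, dtype=np.int16)
c2.flat[0] = 300  # one strongly enhancing voxel

peak_map = enhancement_map(c2, c0)  # analogous to the C2-C0 map
```

In the actual pipeline the inputs would be loaded from `C0/image/` and `C2/image/` and the result written under `C2-C0/`.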

Required Data Structure: Your data should be organized as follows:

examples/data/
├── C0/                    # Pre-contrast phase
│   ├── image/            # Original 3D medical images (.nii.gz)
│   └── tumor_mask/       # Tumor segmentation masks
├── C2/                    # Peak enhancement phase
│   ├── image/
│   └── tumor_mask/
├── C5/                    # Delayed enhancement phase
│   ├── image/
│   └── tumor_mask/
├── table.csv             # Patient metadata (ID, is_malignant, filename)
├── seg_demo.json         # Segmentation task configuration
└── preprocess.py         # Main preprocessing script

After preprocessing, additional directories are created:

├── C2-C0/                # Peak enhancement maps (C2 minus C0)
├── C5-C2/                # Washout maps (C5 minus C2)
├── */image@all_roi_resize@cc15/      # ROI-extracted images
├── */tumor_mask@all_roi_resize@cc15/ # ROI-extracted masks
└── cls_demo.csv          # Generated classification data mapping

For detailed preprocessing instructions, see: examples/data/README.md

πŸ₯ Multi-Center Data Preprocessing

For studies involving data from different imaging centers, BL4AS includes robust preprocessing steps:

  1. Scanner Independence: Uses relative intensity differences (subtraction images) rather than absolute values
  2. ROI Standardization: Connected component filtering (removes regions <15 pixels) and consistent ROI extraction methods
  3. Cross-Center Compatibility: All ROI regions resized to consistent 48×48×48 dimensions
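
The connected-component filtering in step 2 can be sketched with `scipy.ndimage` (a simplified stand-in for the actual pipeline; the 15-voxel threshold matches the `cc15` naming above):

```python
import numpy as np
from scipy import ndimage

def filter_small_components(mask: np.ndarray, min_size: int = 15) -> np.ndarray:
    """Drop connected components smaller than min_size voxels from a binary mask."""
    labeled, num = ndimage.label(mask > 0)
    if num == 0:
        return np.zeros_like(mask)
    sizes = np.bincount(labeled.ravel())  # sizes[0] counts background voxels
    keep = sizes >= min_size
    keep[0] = False                       # never keep background
    return keep[labeled].astype(mask.dtype)

# Toy mask: a 27-voxel cube (kept) plus one isolated voxel (removed as noise).
mask = np.zeros((10, 10, 10), dtype=np.uint8)
mask[1:4, 1:4, 1:4] = 1   # 27 voxels
mask[8, 8, 8] = 1         # 1 voxel
cleaned = filter_small_components(mask)
```

Filtering like this keeps ROI extraction stable across centers, since small spurious components in the masks no longer shift the bounding box.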

🎯 Lesion Segmentation

Test Segmentation Performance

conda activate bl4as
python -u main.py examples/configs/seg_test.yaml

Generated Output Structure:

runs/seg_test/
├── test_seg/           # Predicted segmentation masks
│   ├── P1.nii.gz      # Patient 1 predicted tumor mask
│   ├── P2.nii.gz      # Patient 2 predicted tumor mask
│   ├── P3.nii.gz      # Patient 3 predicted tumor mask
│   ├── P4.nii.gz      # Patient 4 predicted tumor mask
│   └── P5.nii.gz      # Patient 5 predicted tumor mask
├── test_results.json   # Quantitative evaluation metrics
├── cfgs/              # Configuration file backups
├── logs/              # Training/testing logs
└── ckpts/             # Model checkpoints

Performance Metrics (logged in test_results.json):

  • Dice coefficient: ~0.967, IoU: ~0.937, HD: ~12.9, Sensitivity: ~0.970, Precision: ~0.966
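
For reference, the Dice and IoU figures follow the standard overlap formulas; a self-contained sketch (an illustration, not the repository's own evaluation code):

```python
import numpy as np

def dice_iou(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Dice = 2|A∩B| / (|A|+|B|); IoU = |A∩B| / |A∪B| for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    return float(dice), float(inter / union)

# Toy flattened masks: two voxels agree on foreground/background, two disagree.
pred = np.array([1, 1, 0, 0])
gt = np.array([1, 0, 1, 0])
dice, iou = dice_iou(pred, gt)  # dice = 0.5, iou ≈ 0.333
```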

Train Custom Segmentation Model

python -u main.py examples/configs/seg_train.yaml

Generated Training Output:

runs/seg_train/
├── ckpts/                  # Model checkpoints
│   ├── best_model.pth.tar  # Best performing model
│   └── checkpoint_0000.pth.tar  # Latest checkpoint
├── test_results.json       # Final evaluation metrics
├── cfgs/                   # Configuration file backups
└── logs/                   # TensorBoard training logs
    └── [timestamp]_[hostname]/  # Training progress visualization

🔬 BI-RADS 4 Lesion Classification

Test Classification Performance

conda activate bl4as
python -u main.py examples/configs/cls_test.yaml

Generated Output Structure:

runs/cls_test/
├── test_metrics.csv    # 5-fold cross-validation results
├── pkl/               # Detailed predictions and metadata
│   ├── output_*.pkl   # Model predictions for each fold
│   ├── target_*.pkl   # Ground truth labels for each fold
│   ├── filename_*.pkl # Patient filenames for each fold
│   └── center_*.pkl   # Center information for each fold
├── cfgs/              # Configuration file backups
├── logs/              # Training/testing logs with TensorBoard events
└── ckpts/             # Model checkpoints

Performance Metrics (saved in test_metrics.csv):

  • 5-fold cross-validation with AUROC: 1.000, Accuracy: 1.000, Sensitivity: 1.000, Specificity: 1.000, F1-Score: 1.000

Train Custom Classification Model

python -u main.py examples/configs/cls_train.yaml

Generated Training Output:

runs/cls_train/
├── ckpts/                  # Model checkpoints per fold
│   ├── best_fold0.pth.tar  # Best model for fold 0
│   ├── best_fold1.pth.tar  # Best model for fold 1
│   ├── best_fold2.pth.tar  # Best model for fold 2
│   └── best_fold3.pth.tar  # Best model for fold 3
├── cfgs/                   # Configuration file backups
└── logs/                   # TensorBoard training logs
    └── [timestamp]_[hostname]/  # Training progress per fold

βš™οΈ Configuration System

BL4AS uses MONAI's powerful configuration system. All training and testing parameters are controlled via YAML files in examples/configs/:

  • seg_train.yaml / seg_test.yaml: Segmentation task configurations
  • cls_train.yaml / cls_test.yaml: Classification task configurations

For detailed configuration explanations, see: examples/configs/README.md
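
MONAI's config system instantiates Python objects from `_target_` references and resolves cross-references with `@`/`$` expressions. A hypothetical fragment in that style, to show the shape of such a file (the real parameters live in examples/configs/*.yaml):

```yaml
# Hypothetical illustration of MONAI's config syntax, not the shipped config.
network:
  _target_: monai.networks.nets.UNet
  spatial_dims: 3
  in_channels: 2
  out_channels: 1
  channels: [16, 32, 64, 128]
  strides: [2, 2, 2]

optimizer:
  _target_: torch.optim.Adam
  params: "$@network.parameters()"   # "$" evaluates the expression, "@" references network
  lr: 0.0001
```

Because objects are declared rather than hard-coded, swapping architectures or optimizers only requires editing the YAML, not the training script.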

📈 Understanding Model Outputs

Segmentation Output Files:

  • test_seg/*.nii.gz: 3D predicted tumor masks in NIfTI format, ready for clinical visualization
  • test_results.json: Comprehensive evaluation metrics including Dice, IoU, Hausdorff Distance, PPV, SEN
  • logs/: TensorBoard events for visualization of training/testing progress

Classification Output Files:

  • test_metrics.csv: Cross-validation summary with AUC, accuracy, sensitivity, specificity per fold
  • pkl/output_*.pkl: Raw model predictions (probabilities) for detailed analysis
  • pkl/target_*.pkl: Ground truth labels for validation
  • pkl/filename_*.pkl: Patient identifiers for traceability
  • cfgs/: Automatically saved configuration files for reproducibility
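
The per-fold pkl files can be recombined into metrics offline. A sketch using a pure-NumPy rank-based AUROC (equivalent to the Mann-Whitney U statistic); the toy arrays stand in for `pickle.load`-ed contents of output_*.pkl and target_*.pkl:

```python
import numpy as np

def auroc(scores, labels) -> float:
    """Rank-based AUROC: Mann-Whitney U / (n_pos * n_neg), ties averaged."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = scores.argsort()
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for s in np.unique(scores):          # average ranks over tied scores
        tied = scores == s
        ranks[tied] = ranks[tied].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return float(u / (n_pos * n_neg))

# In practice: scores = pickle.load(open("runs/cls_test/pkl/output_0.pkl", "rb")), etc.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = auroc(scores, labels)  # one positive outranked by one negative -> 0.75
```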

For Training Tasks: Checkpoint files (best_model.pth.tar, best_fold*.pth.tar) are saved in runs/*/ckpts/ for model deployment

📊 Analyzing Results with Demo Scripts

For detailed analysis of classification results, use the provided demo script:

conda activate bl4as
python scripts/demo.py

Generated Analysis Files:

  • classification_detailed_results.csv: Complete results with fold information
    Fold,ID,Label,Probability
    0,P1,0,0.1526
    0,P2,0,0.0957
    1,P1,0,0.1526
    ...

Script Output:

  • Per-fold performance metrics (AUC, Accuracy)
  • Overall statistics (patient counts, probabilities)
  • Comprehensive performance evaluation across all metrics
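
Since the detailed-results file is plain Fold,ID,Label,Probability rows, per-fold summaries are easy to reproduce with the standard library alone (toy rows copied from the sample above; demo.py itself may compute more):

```python
import csv
import io
from collections import defaultdict

# Rows in the format of classification_detailed_results.csv.
demo_csv = """Fold,ID,Label,Probability
0,P1,0,0.1526
0,P2,0,0.0957
1,P1,0,0.1526
"""

per_fold = defaultdict(list)
for row in csv.DictReader(io.StringIO(demo_csv)):
    per_fold[int(row["Fold"])].append(float(row["Probability"]))

# Mean predicted probability per fold.
mean_prob = {fold: sum(p) / len(p) for fold, p in per_fold.items()}
```

To analyze real output, replace `io.StringIO(demo_csv)` with `open("classification_detailed_results.csv")`.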

🚨 Troubleshooting

Common Issues:

  1. File not found errors: Ensure preprocessing has been completed
  2. CUDA memory errors: Reduce batch size in configuration files
  3. Import errors: Verify all dependencies are installed correctly
  4. Configuration errors: Check YAML syntax and parameter references

📚 Additional Documentation

The examples/ directory contains comprehensive guides:

  • examples/configs/README.md: Detailed configuration system documentation

    • MONAI configuration syntax and features
    • Multi-fold cross-validation setup
    • Parameter tuning guidelines
    • Dynamic object instantiation examples
  • examples/data/README.md: Complete preprocessing pipeline guide

    • Multi-phase contrast imaging data structure
    • ROI extraction methodologies
    • Enhancement map generation
    • Multi-center preprocessing considerations

Main Developers

1 Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, China
2 Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, China

Contact

🚧 Full code release pending publication under review
📧 For collaboration inquiries: Contact Email
