Restuccia-Group/CredalFusion

Credal Fusion Experiments

This repository trains ensemble-based predictive credal sets and evaluates credal fusion strategies on multi-view image classification benchmarks.

The practical workflow is:

  1. Install dependencies.
  2. Pick a dataset config.
  3. Train checkpoints.
  4. Point the config at those checkpoints.
  5. Run the evaluation script.
  6. Read the metrics and diagnostic artifacts written to outputs/.

Repository Layout

configs/                  Dataset-specific experiment configs
scripts/train_ensemble.py Train deterministic ensembles or CreBNN prior sets
scripts/run_experiment.py Main multi-view fusion evaluation
scripts/eval_clean.py     Clean-set sanity check for deterministic ensembles
scripts/example_crebnn.py Export a predictive credal set tensor from a CreBNN model
src/credal/               Credal set, fusion, and uncertainty code
src/evaluation/           Metrics and diagnostic exports
checkpoints/              Saved trained models
outputs/                  Experiment results and plots

Setup

Install the Python dependencies:

pip install -r requirements.txt

The code uses PyTorch and will run on GPU if CUDA is available, otherwise on CPU.

Configs

The main configs are:

  • configs/default.yaml: CIFAR-10 deterministic ensemble workflow
  • configs/cifar100.yaml: CIFAR-100 deterministic ensemble workflow
  • configs/svhn.yaml: SVHN deterministic ensemble workflow
  • configs/crebnn_svhn.yaml: SVHN CreBNN workflow

Each config controls:

  • dataset and dataloader settings
  • view generation for the multi-view test setup
  • model type and checkpoint directory
  • credal representation and fusion settings
  • number of experiment runs and output location

Before evaluation, make sure model.checkpoint_dir in the selected config points to the checkpoint directory you want to use. run_experiment.py reads the checkpoint location from the config file; it does not accept a separate checkpoint path argument.

Training

1. Train a deterministic ensemble

CreDRO is the default deterministic training method:

python scripts/train_ensemble.py --config configs/default.yaml --method credro

Standard deep ensemble baseline:

python scripts/train_ensemble.py --config configs/default.yaml --method standard

The same command pattern works for the other deterministic configs:

python scripts/train_ensemble.py --config configs/cifar100.yaml --method credro
python scripts/train_ensemble.py --config configs/svhn.yaml --method credro

Useful overrides:

python scripts/train_ensemble.py \
  --config configs/default.yaml \
  --method credro \
  --n_members 5 \
  --epochs 200 \
  --delta_G 0.5 \
  --architectures resnet20,resnet32,resnet56
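As a rough illustration of how such overrides can take precedence over values from the config file, here is a minimal sketch. It is not the parser used by train_ensemble.py: the flag names come from the command above, but the config keys they map to (model.ensemble_size, model.architectures, and a hypothetical training section) are assumptions made for the example.

```python
import argparse
import copy


def parse_overrides(argv):
    """Parse the override flags shown above (illustrative parser, not the repo's)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--n_members", type=int, default=None)
    parser.add_argument("--epochs", type=int, default=None)
    parser.add_argument("--delta_G", type=float, default=None)
    # Comma-separated list, e.g. "resnet20,resnet32,resnet56".
    parser.add_argument(
        "--architectures",
        type=lambda s: [a.strip() for a in s.split(",") if a.strip()],
        default=None,
    )
    return parser.parse_args(argv)


def apply_overrides(config, args):
    """Return a copy of config in which provided CLI values win over file values."""
    merged = copy.deepcopy(config)
    model = merged.setdefault("model", {})
    training = merged.setdefault("training", {})  # hypothetical section name
    if args.n_members is not None:
        model["ensemble_size"] = args.n_members  # assumed mapping
    if args.architectures is not None:
        model["architectures"] = args.architectures
    if args.epochs is not None:
        training["epochs"] = args.epochs  # assumed key
    if args.delta_G is not None:
        training["delta_G"] = args.delta_G  # assumed key
    return merged
```

Flags that are omitted stay at None, so the corresponding config values are left untouched.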

Training writes checkpoints into:

  • the directory passed with --output_dir, or
  • model.checkpoint_dir/<method> when --output_dir is omitted

The directory will contain member_*.pt files plus training_info.json.
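The output-location rule and the expected directory contents can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, and config is assumed to be the YAML config already parsed into a nested dict.

```python
from pathlib import Path
from typing import List, Optional


def resolve_checkpoint_dir(config: dict, method: str,
                           output_dir: Optional[str] = None) -> Path:
    """Mirror the rule above: --output_dir wins, else model.checkpoint_dir/<method>."""
    if output_dir:
        return Path(output_dir)
    return Path(config["model"]["checkpoint_dir"]) / method


def check_checkpoint_dir(path: Path) -> List[str]:
    """Return a list of problems; an empty list means the directory looks complete."""
    if not path.is_dir():
        return [f"{path} is not a directory"]
    problems = []
    if not sorted(path.glob("member_*.pt")):
        problems.append("no member_*.pt files found")
    if not (path / "training_info.json").is_file():
        problems.append("training_info.json is missing")
    return problems
```

Running check_checkpoint_dir on the resolved path before evaluation catches an empty or mistyped directory early. It only checks file presence, not whether the checkpoints load.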

2. Train a CreBNN prior-set model

python scripts/train_ensemble.py --config configs/crebnn_svhn.yaml --method crebnn

Optional CreBNN-specific overrides:

python scripts/train_ensemble.py \
  --config configs/crebnn_svhn.yaml \
  --method crebnn \
  --prior_scales 0.5,1.0,2.0 \
  --posterior_samples_per_prior 50

CreBNN training saves the learned prior-set model into model.checkpoint_dir/<method> by default and also writes crebnn_history.json.

Sanity Check Trained Checkpoints

For deterministic ensembles, you can check clean-set performance before running the full multi-view experiment:

python scripts/eval_clean.py \
  --config configs/default.yaml \
  --checkpoint_dir /path/to/checkpoints

You can also restrict evaluation to a small subset:

python scripts/eval_clean.py \
  --config configs/default.yaml \
  --checkpoint_dir /path/to/checkpoints \
  --subset_size 1000

Run the Main Fusion Evaluation

Full evaluation

python scripts/run_experiment.py --config configs/default.yaml

Examples for the other bundled configs:

python scripts/run_experiment.py --config configs/cifar100.yaml
python scripts/run_experiment.py --config configs/svhn.yaml
python scripts/run_experiment.py --config configs/crebnn_svhn.yaml

Debug or partial runs

Debug mode forces a 100-sample subset:

python scripts/run_experiment.py --config configs/default.yaml --debug

Override the number of repeated runs:

python scripts/run_experiment.py --config configs/default.yaml --n_runs 3

Run on a custom subset size:

python scripts/run_experiment.py --config configs/default.yaml --subset_size 1000

Result Structure

Each evaluation call creates a timestamped experiment directory:

outputs/<experiment_name>_<YYYYMMDD_HHMMSS>/
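Because the timestamp is part of the directory name, the most recent run for an experiment can be located programmatically. The following is a minimal sketch; the function names are hypothetical and it assumes the directory names follow the pattern above exactly.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

STAMP_FORMAT = "%Y%m%d_%H%M%S"
STAMP_LENGTH = 15  # len("YYYYMMDD_HHMMSS")


def parse_run_stamp(dir_name: str, experiment_name: str) -> Optional[datetime]:
    """Extract the timestamp from '<experiment_name>_<YYYYMMDD_HHMMSS>', else None."""
    prefix = experiment_name + "_"
    if not dir_name.startswith(prefix):
        return None
    stamp = dir_name[len(prefix):]
    if len(stamp) != STAMP_LENGTH:
        return None
    try:
        return datetime.strptime(stamp, STAMP_FORMAT)
    except ValueError:
        return None


def latest_experiment_dir(outputs_root, experiment_name: str) -> Optional[Path]:
    """Return the newest matching experiment directory, or None if there is none."""
    root = Path(outputs_root)
    if not root.is_dir():
        return None
    candidates = []
    for child in root.iterdir():
        stamp = parse_run_stamp(child.name, experiment_name) if child.is_dir() else None
        if stamp is not None:
            candidates.append((stamp, child))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```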

Inside that directory you will find:

  • config.yaml: exact config used for the run
  • run_0.json, run_1.json, ...: per-run summary metrics
  • run_0_detailed.npz, ...: per-sample detailed outputs
  • aggregate_results.json: mean and standard deviation across runs
  • <experiment_name>_<timestamp>.log: execution log
  • run_<k>_diagnostics/: CSV, LaTeX, PNG, and PDF diagnostics for run k
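To cross-check aggregate_results.json, the per-run summaries can be re-aggregated by hand. The sketch below assumes each run_<k>.json is a JSON object whose top-level numeric entries are metrics; the actual schema is not documented here, so nested or non-numeric entries are simply skipped. It uses the population standard deviation, which may differ from the convention used by the repository.

```python
import json
import re
import statistics
from pathlib import Path

RUN_FILE = re.compile(r"run_\d+\.json")


def aggregate_runs(experiment_dir):
    """Compute mean and std of top-level numeric metrics across run_<k>.json files."""
    per_metric = {}
    for path in sorted(Path(experiment_dir).iterdir()):
        if not RUN_FILE.fullmatch(path.name):
            continue  # skips run_<k>_detailed.npz, aggregate_results.json, etc.
        with path.open() as handle:
            metrics = json.load(handle)
        if not isinstance(metrics, dict):
            continue
        for name, value in metrics.items():
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                per_metric.setdefault(name, []).append(float(value))
    return {
        name: {
            "mean": statistics.fmean(values),
            "std": statistics.pstdev(values),  # population std (assumption)
            "n_runs": len(values),
        }
        for name, values in per_metric.items()
    }
```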

The diagnostic directory includes files such as:

  • fusion_method_comparison.csv
  • conflict_binned_evaluation.csv
  • conflict_vs_performance_curves.csv
  • corruption_vs_performance_curves.csv
  • table_a_overall_metrics.tex
  • table_b_conflict_summary.tex
  • table_c_conflict_binned.tex
  • conflict_vs_au_gap.png
  • conflict_vs_nll_diff.png
  • conflict_vs_selection_frequency.png
  • plot_conflict_bin_vs_accuracy.pdf
  • plot_conflict_bin_vs_selection_and_feasibility.pdf

Minimal Reproduction Recipes

CIFAR-10 deterministic pipeline

Train:

python scripts/train_ensemble.py --config configs/default.yaml --method credro

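Unless --output_dir is given, training writes to model.checkpoint_dir/credro. Before evaluating, update model.checkpoint_dir in configs/default.yaml so that it points at the directory that contains the trained checkpoints.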

Evaluate:

python scripts/run_experiment.py --config configs/default.yaml

Read:

outputs/credal_fusion_cifar10_<timestamp>/aggregate_results.json
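The train and evaluate steps of this recipe can also be chained from a small driver script. This is a sketch, not a script shipped with the repository: the function names are hypothetical, and it assumes it is launched from the repository root so that the relative script paths resolve.

```python
import subprocess
import sys


def build_pipeline(config_path, method="credro"):
    """Return the train and evaluate commands from the recipe as argv lists."""
    python = sys.executable  # use the interpreter running this script
    return [
        [python, "scripts/train_ensemble.py", "--config", config_path, "--method", method],
        [python, "scripts/run_experiment.py", "--config", config_path],
    ]


def run_pipeline(config_path, method="credro"):
    """Run each step in order; stop at the first failing command."""
    for command in build_pipeline(config_path, method):
        # check=True raises subprocess.CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)


# Example (run from the repository root):
# run_pipeline("configs/default.yaml")
```

The driver does not edit the config, so model.checkpoint_dir must already point at the directory that training writes to; otherwise evaluation will load from the wrong location.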

CIFAR-100 deterministic pipeline

python scripts/train_ensemble.py --config configs/cifar100.yaml --method credro
python scripts/run_experiment.py --config configs/cifar100.yaml

SVHN CreBNN pipeline

python scripts/train_ensemble.py --config configs/crebnn_svhn.yaml --method crebnn
python scripts/run_experiment.py --config configs/crebnn_svhn.yaml

To export an example predictive credal tensor from a trained CreBNN model:

python scripts/example_crebnn.py \
  --config configs/crebnn_svhn.yaml \
  --max_samples 16 \
  --output_npz outputs/crebnn_predictive_set_example.npz

Key Config Fields To Check

When results do not match expectations, these are the first fields to verify in the chosen config:

  • experiment.name: controls the output directory prefix
  • experiment.n_runs: number of repeated runs
  • experiment.debug_subset: when set, restricts evaluation to a subset of the data
  • data.dataset: cifar10, cifar100, or svhn
  • model.type: ensemble or crebnn
  • model.checkpoint_dir: where evaluation loads checkpoints from
  • model.architectures and model.ensemble_size: deterministic ensemble layout
  • credal.representation: box or convex_hull
  • fusion.conjunctive_epsilon: relaxed feasibility tolerance for conjunctive fusion
  • prediction.methods: point predictions extracted from each credal set
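Some of these checks can be automated. The sketch below validates a config that has already been parsed into a nested dict, using only the field names and allowed values listed above; the helper names are hypothetical, and the repository may perform its own validation differently.

```python
# Allowed values taken from the field list above.
ALLOWED_VALUES = {
    ("data", "dataset"): {"cifar10", "cifar100", "svhn"},
    ("model", "type"): {"ensemble", "crebnn"},
    ("credal", "representation"): {"box", "convex_hull"},
}

# Fields that should at least be present.
REQUIRED_FIELDS = [
    ("experiment", "name"),
    ("experiment", "n_runs"),
    ("model", "checkpoint_dir"),
]


def lookup(config, path):
    """Follow a tuple of keys through nested dicts; return None if any key is absent."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check_config(config):
    """Return human-readable problems; an empty list means no issue was detected."""
    problems = []
    for path in REQUIRED_FIELDS:
        if lookup(config, path) is None:
            problems.append(".".join(path) + " is missing")
    for path, allowed in ALLOWED_VALUES.items():
        value = lookup(config, path)
        name = ".".join(path)
        if value is None:
            problems.append(name + " is missing")
        elif not isinstance(value, str) or value not in allowed:
            problems.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    return problems
```

An empty result only means these particular fields look plausible; it does not confirm that the checkpoint directory exists or matches the configured model type.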

Testing

Run the unit tests with:

pytest tests/ -v

References

  • Caprio and Restuccia, "Credal Information Fusion"
  • Wang et al., "CreDRO: Learning Credal Ensembles via DRO"
  • Credal Bayesian Deep Learning / CreBNN

About

This repository contains the code for the Decision-Driven Credal Information Fusion work.
