
Data loader and solution method for the DCASE 2024 Challenge Task 1

hubtru/ASCDomain


ASCDomain: Domain Invariant Device-Adversarial Isotropic Knowledge Distillation Convolutional Neural Architecture

This repository is the official PyTorch implementation of ASCDomain (ICASSP 2025). It contains the code to reproduce the results of the Truchan_LUH submission to the DCASE24 Task 1 "Data-Efficient Low-Complexity Acoustic Scene Classification" challenge.

  • ASCDomain Paper: here
  • Technical Report: here
  • System Results: here

The codebase for this repository is based on the official baseline for Task 1: here

ASCDomain Architecture

ASCDomain integrates four key components: Preprocessing, a Lightweight Isotropic Neural Network, Adversarial Domain Adaptation, and Ensemble Knowledge Distillation. The figure below presents the ASCDomain workflow. Solid lines indicate the training and validation phase, and dashed lines indicate the test phase.

  • The inputs to the network are the audio snippets [1 s], audio labels, and device labels.
  • The audio snippets first go through preprocessing, transforming them into mel spectrograms [256 x 65].
  • The isotropic network extracts features from this time-frequency representation.
  • Adversarial Domain Adaptation encourages the embedding representations to become domain-invariant (generalization across different recording devices).
  • Ensemble Knowledge Distillation improves learning efficiency by transferring knowledge from multiple high-performance teacher models to a compact student model.

figure
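As a concrete illustration of the knowledge distillation component, the sketch below computes a temperature-softened distillation loss against the averaged distribution of a teacher ensemble, combined with the standard cross-entropy on hard labels. This is a minimal NumPy sketch, not the repository's implementation; the temperature `T` and weight `alpha` are hypothetical choices.

```python
import numpy as np

def softmax(z, T=1.0):
    # numerically stable softmax with temperature T
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits_list, labels, T=2.0, alpha=0.5):
    # soft targets: average distribution of the teacher ensemble
    teacher_probs = np.mean([softmax(t, T) for t in teacher_logits_list], axis=0)
    student_soft = softmax(student_logits, T)
    # cross-entropy against the soft targets, scaled by T^2 (standard KD scaling)
    kd = -np.sum(teacher_probs * np.log(student_soft + 1e-12), axis=-1).mean() * T * T
    # standard cross-entropy against the hard labels
    student_hard = softmax(student_logits)
    ce = -np.log(student_hard[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * kd + (1 - alpha) * ce
```

In practice the teacher logits would come from the pretrained teacher models described below, and the weighting between the two terms is a tuning knob.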

Setup

Create a conda environment

conda create -n asc python=3.10
conda activate asc

Download the dataset from this location and extract the files.

There are a total of 5 architectures:

  1. Isotropic
  2. Siren
  3. Adversarial
  4. RSC
  5. ASC Domain

Only Isotropic, Siren, and RSC were submitted. Each experiment folder contains a dataset folder with a dcase24.py file, where the path to the dataset has to be specified:

dataset_dir = None

All experiments have an argument split which specifies the corresponding training subset: 5, 10, 25, 50, and 100 (percent) are available.
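A minimal sketch of how such a split argument can be parsed with argparse (illustrative only; the actual training scripts may define it differently):

```python
import argparse

# hypothetical parser mirroring the --split argument described above
parser = argparse.ArgumentParser()
parser.add_argument("--split", type=int, default=100,
                    choices=[5, 10, 25, 50, 100],
                    help="training subset size in percent")

args = parser.parse_args(["--split", "25"])  # e.g. train on the 25% subset
```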

Device Impulse Response

The device impulse response augmentation has shown great success in previous submissions and is also used in this submission. The device impulse responses are provided by MicIRP. All files are shared via a Creative Commons license. All credits go to MicIRP & Xaudia.com.
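Conceptually, the augmentation convolves the raw waveform with a recorded device impulse response, so the clip sounds as if it were captured by that device. A minimal NumPy sketch (the peak normalization is a hypothetical choice, not necessarily what the repository does):

```python
import numpy as np

def apply_device_ir(audio: np.ndarray, ir: np.ndarray) -> np.ndarray:
    """Convolve a waveform with a device impulse response, keeping the original length."""
    out = np.convolve(audio, ir)[: len(audio)]
    # normalize to the peak to avoid clipping (hypothetical choice)
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out
```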

Isotropic Architectures

Isotropic

Run isotropic training

python run_isotropic_training.py 

Run isotropic with mixstyle from here

python run_isotropic_training.py  --model=mix

Run isotropic without activation, motivated by here

python run_isotropic_training.py  --model=noact

Navigate to Isotropic_HPO and run the isotropic hyperparameter optimization.

python run_isotropic_hpo.py 

Isotropic Notebook Demonstrator Colab

Siren

Run siren training

python run_siren_training.py 

Siren Notebook Demonstrator Colab

Domain Generalization Techniques

Previous domain generalization techniques have relied on augmentation to achieve generalization. For the next two architectures, we conduct representation learning experiments with the isotropic architecture as the backbone model. Two representation learning techniques from DeepDG were chosen:

  • Domain Adversarial Neural Network (here called adverserial)
  • Representation Self-Challenging (RSC)

Adversarial

Run adversarial training

python run_adv_training.py 
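Domain-adversarial training typically relies on a gradient reversal layer: the identity in the forward pass, a sign-flipped (and scaled) gradient in the backward pass, so the feature extractor is pushed to confuse the device classifier. A minimal PyTorch sketch, not the repository's exact code; the scaling factor `lambd` is a hypothetical hyperparameter:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambd on the way back."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # flip and scale the gradient flowing into the feature extractor
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)
```

In a domain-adversarial setup this layer would sit between the isotropic backbone and the device classifier head.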

RSC

Run RSC training

python run_rsc_training.py 

ASC Domain

ASC Domain combines the adversarial approach with knowledge distillation. The training procedure and teacher models were taken from cpjku_dcase23 and EfficientAT. We train a total of 4 different architectures:

  • MobileNet
  • Dynamic MobileNet
  • CP-ResNet
  • PaSST

Each is trained with different training setups and versions of the architecture, leading to a total of 22 teacher models. Each teacher model is trained on the 5 splits, resulting in 110 models.
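As a quick sanity check on the numbers above (22 teacher configurations, each trained on the 5 data splits):

```python
num_teacher_configs = 22   # across the 4 architectures and their setups/versions
num_splits = 5             # the 5, 10, 25, 50, 100 (percent) subsets
total_models = num_teacher_configs * num_splits
assert total_models == 110
```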

Run teacher model training

MobileNet

Run MobileNet training. Argument width=[0.4, 0.5, 1.0]

python run_mn_training.py --width=0.4 

Dynamic MobileNet

Run Dynamic MobileNet training. Argument width=[0.4, 1.0]

python run_dymn_training.py --width=0.4 

PaSST

Run PaSST training

python run_passt_training.py 

CP-ResNet

Run CP-ResNet training

python run_cp-resnet_training.py

Teacher Student Single Training

Run single teacher-student training with a chosen teacher and Isotropic as the student

python run_convmixer_training.py --teacher=<teacher_name>

Example: run single teacher-student training with the PaSST teacher and Isotropic as the student

python run_convmixer_training.py --teacher=passt_dir_fms 

Teacher Student Ensemble Training

Run ensemble teacher-student training with the teacher ensemble and Isotropic as the student

python run_convmixer_training.py --teacher=best

Run ensemble teacher-student training with the teacher ensemble and Siren as the student

python run_siren_training.py --teacher=best

Teacher Student Ensemble Adversarial Training

Run ensemble teacher-student adversarial training with the teacher ensemble and Isotropic as the student

python run_convmixer_adv_training.py --teacher=best

Run ensemble teacher-student adversarial training with the teacher ensemble and Siren as the student

python run_siren_adv_training.py --teacher=best

Ensemble Selection

The ensemble is selected by a forward stepwise selection algorithm:

  1. Start with an empty ensemble.
  2. Add the model that most reduces the ensemble validation loss.
  3. Repeat step 2 until no further improvement can be achieved.
  4. Return the ensemble.

The implementation of the ensemble selection can be seen in ensemble_selection.ipynb.
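The steps above can be sketched as a greedy forward selection over per-model validation predictions. This is an illustrative NumPy sketch, not the notebook's code; `preds` maps hypothetical model names to (N, C) softmax outputs on the validation set, and negative log-likelihood stands in for the validation loss:

```python
import numpy as np

def nll(probs, labels):
    # mean negative log-likelihood of the true class
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def forward_selection(preds, labels):
    """Greedily grow an ensemble (averaged predictions) until the loss stops improving."""
    ensemble, current = [], np.inf
    while True:
        best_name, best_loss = None, current
        for name, p in preds.items():
            # candidate ensemble: current members plus this model, predictions averaged
            mix = np.mean([preds[m] for m in ensemble] + [p], axis=0)
            loss = nll(mix, labels)
            if loss < best_loss:
                best_name, best_loss = name, loss
        if best_name is None:          # step 3: no improvement -> stop
            return ensemble, current   # step 4: return the ensemble
        ensemble.append(best_name)     # step 2: keep the best addition
        current = best_loss
```

Note that models may be added more than once, which effectively weights them in the average.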

Citing

If you find ASCDomain useful for your work, please consider citing us as follows:

@inproceedings{truchan2025ascdomain,
  title={ASCDomain: Domain Invariant Device-Adversarial Isotropic Knowledge Distillation Convolutional Neural Architecture},
  author={Truchan, Hubert and Ngo, Tien Hung and Ahmadi, Zahra},
  booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1--5},
  year={2025},
  organization={IEEE}
}