Slaymish/malware-classifier-backdoors
Malware Classifier Backdoor Attacks and Defenses

Read the paper here: https://doi.org/10.5281/zenodo.17274437

Resilient by Design: Investigating Backdoor Vulnerabilities in Malware Detection Systems. This report presents a large-scale empirical study of backdoor (data-poisoning) attacks against ML-based malware detectors. Across 420 experiments spanning poisoning ratios, trigger types, and model architectures, the work finds that static malware classifiers exhibit surprisingly strong natural resistance to backdoor attacks, with attack success rates generally below 4%. Tree-based models (LightGBM) show superior robustness compared with neural networks, and simple defences such as ensemble averaging and clean-tuning further reduce attack effectiveness. The study draws on the EMBER feature pipeline and provides the code and experimental details needed to reproduce the results.

Steps

  1. Poison the Training Data: Inject backdoor samples into the dataset.
  2. Train the Model: Train a malware classifier on the poisoned dataset.
  3. Test on Clean Data: Evaluate the model’s performance on unpoisoned data.
  4. Test on Backdoor Data: Assess the model’s vulnerability to backdoor samples.
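The four steps above can be sketched as a single experiment loop. This is an illustrative sketch only; the function names, feature layout, and stand-in "model" are placeholders, not the repository's actual API:

```python
# Illustrative sketch of one poisoning experiment; names are placeholders,
# not the repository's actual API.

def add_trigger(sample):
    """Stamp a fixed trigger pattern onto a feature record (copy, don't mutate)."""
    patched = dict(sample)
    patched["trigger"] = 1.0
    return patched

def run_experiment(train_set, test_set, poison_ratio):
    # 1. Poison the training data: relabel triggered malware as benign (0).
    n_poison = int(len(train_set) * poison_ratio)
    poisoned = [
        {**add_trigger(s), "label": 0} if i < n_poison else s
        for i, s in enumerate(train_set)
    ]
    # 2. "Train" a model on the poisoned set (stand-in: majority-class vote;
    #    the report actually trains LightGBM and neural networks here).
    majority = round(sum(s["label"] for s in poisoned) / len(poisoned))
    model = lambda s: majority
    # 3. Clean accuracy: performance on unpoisoned test data.
    clean_acc = sum(model(s) == s["label"] for s in test_set) / len(test_set)
    # 4. Attack success rate: triggered malware misclassified as benign.
    malware = [s for s in test_set if s["label"] == 1]
    asr = sum(model(add_trigger(s)) == 0 for s in malware) / len(malware)
    return clean_acc, asr
```

The attack success rate computed in step 4 is the quantity the report finds to stay below 4% for realistic poisoning ratios.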

Setup Instructions

  1. Update and rebuild the container:

     ./update_build.sh
     # removes the existing container, builds a new one,
     # then runs and enters it

  2. Run the unit tests:

     python -m unittest discover -s scripts/unit_tests
     # or: ./unit_tests.sh

  3. Execute the pipeline detailed below.

Pipeline

  1. Create a config.yaml file.
  2. Run the pipeline:

     python -m scripts.pipeline --config config.yaml --log data/pipeline.log &
     # or: ./run_pipeline.sh
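A minimal config.yaml might look like the following. The keys shown are illustrative assumptions about what the pipeline reads; consult scripts/pipeline.py for the actual schema:

```yaml
# Illustrative only -- the real keys are defined by scripts/pipeline.py.
poison_ratio: 0.05          # fraction of training samples to poison
trigger_type: static        # style of trigger applied to poisoned samples
model: lightgbm             # e.g. LightGBM or a neural network
data_dir: data/             # matches the layout in "Data Structure" below
```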

Grid Search

python scripts/grid_search.py --grid_search

Testing Details

The test suite now generates two types of confusion matrices:

  1. A standard confusion matrix with three categories (benign, malicious, and backdoor-malicious).
  2. A simplified “square” confusion matrix covering benign vs. malicious only.

For each variant the suite also calculates updated metrics (Accuracy, Precision, Recall, F1 Score, ROC AUC), providing a more detailed view of how the model performs against backdoored samples.
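The collapse from the three-category matrix to the “square” benign-vs-malicious matrix amounts to folding the backdoor-malicious row into the malicious row. A minimal sketch (not the test suite's actual code):

```python
# Sketch: fold a 3-row confusion matrix into the 2x2 "square" form.
# Rows are true classes (benign, malicious, backdoor-malicious);
# columns are predicted classes (benign, malicious), since the model
# only ever predicts one of the two real labels.

def squash_confusion(cm3):
    """Merge the backdoor-malicious row into the malicious row.

    cm3: list of three [pred_benign, pred_malicious] rows.
    Returns the 2x2 benign-vs-malicious matrix.
    """
    benign, malicious, backdoor = cm3
    return [
        benign,
        [malicious[0] + backdoor[0], malicious[1] + backdoor[1]],
    ]
```

A triggered malware sample the model flags as benign thus counts as a malicious-row false negative in the square matrix, which is what makes the square metrics sensitive to backdoor success.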

The test suite evaluates the trained model across the following data types:

  • Clean Data:
    • Unpoisoned benign samples
    • Unpoisoned malicious samples
  • Poisoned Data:
    • Poisoned benign samples
    • Poisoned malicious samples

Metrics:

The test suite provides the following evaluation metrics:

  • Accuracy
  • Precision
  • Recall
  • F1 Score
  • ROC AUC
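For reference, the first four metrics follow directly from the binary confusion counts. The sketch below uses plain Python rather than whatever library calls the suite itself makes:

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from binary confusion counts.

    ROC AUC is omitted: it needs ranked classifier scores, not counts.
    Guards avoid division by zero when a class is never predicted.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```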

Visualizations:

The following plots are generated during testing:

  • Confusion Matrix
  • ROC Curve

Data Structure

The data is organized into the following directories:

data/
├── raw/                      # Contains unprocessed executables
│   ├── benign/
│   └── malicious/
├── poisoned/                 # Contains poisoned executables
│   ├── <backdoor_name>/
│   │   ├── benign/
│   │   └── malicious/
│   └── <backdoor_name>/
└── ember/                    # Contains the poisoned dataset in EMBER format
    ├── test.jsonl
    └── train.jsonl
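The EMBER-format splits are JSON Lines files, one sample per line, so they can be read with nothing beyond the standard library (the exact field names inside each record are defined by the EMBER feature pipeline, not shown here):

```python
import json

def load_jsonl(path):
    """Yield one parsed sample per non-empty line of a .jsonl file,
    e.g. data/ember/train.jsonl."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)
```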

Reference

Please use the BibTeX entry below to cite the research report:

@techreport{burke2025resilient,
  author      = {Hamish Burke},
  title       = {Resilient by Design: Investigating Backdoor Vulnerabilities in Malware Detection Systems},
  year        = {2025},
  month       = {January},
  institution = {Victoria University of Wellington},
  doi         = {10.5281/zenodo.17274437},
}
