forked from cpt-harlock/DUMBO

Integrate a lightweight traffic classifier to enhance packet scheduling, inter-arrival times distribution estimation and flow length estimation.

Raphaaal/DUMBO

 
 


This repository contains the training, testing and simulator code used to build the DUMBO system from the research paper Taming the Elephants: Affordable Flow Length Prediction in the Data Plane by R. Azorin, A. Monterubbiano, G. Castellano, M. Gallo, S. Pontarelli and D. Rossi, accepted at CoNEXT'24.

[Figure: dumbo_intro_fig]

DUMBO is a versatile networked system that integrates a lightweight traffic classifier to enhance several downstream tasks in the data plane (e.g., packet scheduling, inter-arrival times distribution estimation, flow length estimation). The main idea of DUMBO is to segregate elephants and mice flows to address them separately, hence saving memory and improving performance over standard baselines.

Guide

This document serves as a guide to install and use the DUMBO system on real traffic traces. Follow these instructions to quickly set up the repository and reproduce the experiments.

  • This project requires a machine with > 1 TB of disk space. If you lack space, you may first experiment with the UNI traces, which are the smallest.

  • This project requires a machine with > 100 GB of RAM. If you lack RAM, consider decreasing the number of jobs executed in parallel during model training and use-case simulations.
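Before downloading anything, it can help to compare the machine against these two thresholds. The following is a minimal sketch, not part of the repository: the helper name `kib_to_gib` is hypothetical, and reading `/proc/meminfo` assumes a Linux host.

```shell
# Hypothetical helper (not part of DUMBO): convert KiB to whole GiB.
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Space available on the filesystem holding the current directory
# (column 4 of POSIX-format df output, in 1024-byte blocks).
disk_kib=$(df -Pk . | awk 'NR==2 {print $4}')

# Total RAM as reported by the Linux kernel; empty on non-Linux systems.
mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || true)

echo "Free disk here: $(kib_to_gib "${disk_kib:-0}") GiB (guide asks for > 1 TB)"
echo "Total RAM:      $(kib_to_gib "${mem_kib:-0}") GiB (guide asks for > 100 GB)"
```

Run it from the directory where you plan to store the traces, since `df` reports on the filesystem of the current directory.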

1. Data

Here are the traffic traces used in the experiments.

CAIDA

MAWI

UNI

To reproduce the experiments from the paper, download the traces, uncompress them, and store the *.pcap files in the appropriate folders:

  • ./data/caida/pcap/equinix-chicago.dirA.20160121-{hour}.UTC.anon.pcap
  • ./data/mawi/pcap/20190409{hour}.pcap
  • ./data/uni/pcap/univ2_pt{part}
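A quick way to confirm that the traces landed in the expected folders is to count the files in each one. This is a sketch under the assumption that the three `./data/<trace>/pcap` directories listed above are the only locations that matter; the helper `count_trace_files` is hypothetical and does not validate file names or contents.

```shell
# Hypothetical helper (not part of DUMBO): count regular files directly
# under a directory, printing 0 if the directory does not exist.
count_trace_files() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Report on the three trace folders named in this guide.
for trace in caida mawi uni; do
    echo "${trace}: $(count_trace_files "./data/${trace}/pcap") file(s) in ./data/${trace}/pcap"
done
```

A count of 0 for a trace you intend to run suggests the download or extraction step still needs attention.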

2. Installation

Choose one of the following options to install the project.

Option 1: Docker

  • Install Docker, and build the image

    $ docker build -t dumbo .
  • Create the container

    $ docker run -it -p 8888:8888 dumbo

Option 2: manual installation

This project runs on Linux (Ubuntu version >= 22).

  • Install mergecap and editcap

    $ sudo apt-get install wireshark-common
  • Install Python 3.9 outside of any virtual environment

    $ sudo apt update
    $ sudo apt install python3.9
    $ python3.9 --version
  • Install and setup Rust

    1. Use v1.76.0-nightly (nightly-2024-02-08) and check your version:
    $ cargo --version
    2. Install the libpython3.9-dev package on your system:
    $ sudo apt install libpython3.9-dev
    3. Deactivate any virtual environment and build the repository:
    $ cargo build -r
    
  • Install conda, and create the required environments

    $ chmod +x ./setup_conda.sh
    $ ./setup_conda.sh
  • Clone and patch the YAPS simulator repository

    $ git clone -n https://github.com/NetSys/simulator.git
    $ cd simulator
    $ git checkout -b scheduling_DUMBO 179b64e
    $ git apply < ../scheduling_DUMBO.patch
    $ cd ..
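After a manual installation, a short check can confirm that the command-line tools mentioned in the steps above are reachable. The sketch below is illustrative only: `missing_tools` is a hypothetical helper, and the tool list is simply the set of commands named in this section. Note that `conda` may be defined as a shell function in interactive shells and so can be reported as missing when the check runs in a non-interactive script.

```shell
# Hypothetical helper (not part of DUMBO): print each given command
# that cannot be found, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Commands referenced by the manual installation steps.
missing=$(missing_tools mergecap editcap python3.9 cargo conda git)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All expected tools were found."
fi
```

This only checks presence, not versions; the Rust toolchain version still needs to be verified with `cargo --version` as described above.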

3. Run

Run the pipeline to reproduce the experiments on the various traces. Note that this may take several hours.

$ chmod +x ./run.sh
$ ./run.sh caida
$ ./run.sh mawi
$ ./run.sh uni

Additionally, run the model update experiment. Note that this requires complete caida and mawi runs.

$ chmod +x ./run_update_stresstest.sh
$ ./run_update_stresstest.sh 
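Because the model update experiment depends on complete caida and mawi runs, one option is to chain the commands so that the stress test only starts when both runs succeed. The following is a sketch of that idea, assuming `run.sh` exits with a non-zero status on failure (the source does not state this); `run_traces` is a hypothetical wrapper, not a script shipped with the repository.

```shell
# Hypothetical wrapper (not part of DUMBO): invoke a command once per
# trace name and stop at the first failure.
run_traces() {
    local runner="$1"
    shift
    local trace
    for trace in "$@"; do
        echo "=== ${trace} ==="
        if ! "$runner" "$trace"; then
            echo "run failed for ${trace}" >&2
            return 1
        fi
    done
}

# Intended use from the repository root (left commented out here because
# a full run takes several hours):
#   run_traces ./run.sh caida mawi && ./run_update_stresstest.sh
```

Running the traces sequentially rather than in parallel also keeps peak memory use lower, at the cost of a longer total runtime.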

4. Plot

Plot the results using the notebooks in ./plots/

If you used Docker to install the project:

  • Run the following inside your container to launch Jupyter:
    $ conda activate /DUMBO/conda_envs/training_env
    $ jupyter notebook --ip 0.0.0.0 --no-browser
  • Access Jupyter at localhost:8888 in your web browser thanks to port forwarding.

Documentation

You can find additional technical documentation about the simulators in ./README_SIMULATOR.md and ./README_DEV.md.

Citation

If you have found this paper useful, please cite us using:

@article{dumbo2024,
  title={Taming the Elephants: Affordable Flow Length Prediction in the Data Plane},
  author={Azorin, Raphael and Monterubbiano, Andrea and Castellano, Gabriele and Gallo, Massimo and Pontarelli, Salvatore and Rossi, Dario},
  journal={Proceedings of the ACM on Networking},
  volume={2},
  number={CoNEXT1},
  articleno={5},
  numpages={24},
  year={2024},
  publisher={ACM New York, NY, USA}
}

Acknowledgements

We would like to thank the authors of pHost and of the YAPS simulator as well as the author of the MetaCost learning implementation.
