This repository contains the training, testing and simulator code used to build the DUMBO system from the research paper Taming the Elephants: Affordable Flow Length Prediction in the Data Plane by R. Azorin, A. Monterubbiano, G. Castellano, M. Gallo, S. Pontarelli and D. Rossi, accepted at CoNEXT'24.
DUMBO is a versatile networked system that integrates a lightweight traffic classifier to enhance several downstream tasks in the data plane (e.g., packet scheduling, inter-arrival time distribution estimation, flow length estimation). The main idea of DUMBO is to segregate elephant and mouse flows and handle them separately, which saves memory and improves performance over standard baselines.
This document serves as a guide to install and use the DUMBO system on real traffic traces. Follow these instructions to quickly set up the repository and reproduce the experiments.
- This project requires a machine with more than 1 TB of disk space. If you lack space, you may first experiment with the UNI traces, which are the smallest.
- This project requires a machine with more than 100 GB of RAM. If you lack RAM, consider decreasing the number of jobs executed in parallel during model training and use-case simulations.
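The two requirements above can be checked before starting a long run. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes that "1 TB" and "100 GB" are decimal units and that the check runs on Linux.

```python
import os
import shutil

# Thresholds from the requirements above. Decimal units are an assumption.
MIN_DISK_BYTES = 10**12        # 1 TB
MIN_RAM_BYTES = 100 * 10**9    # 100 GB


def free_disk_bytes(path="."):
    """Free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def total_ram_bytes():
    """Total physical memory in bytes, read through POSIX sysconf."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def check_resources(disk_bytes, ram_bytes):
    """Return a list of warnings; the list is empty if both thresholds are exceeded."""
    problems = []
    if disk_bytes <= MIN_DISK_BYTES:
        problems.append(
            f"free disk space is {disk_bytes / 1e12:.2f} TB; more than 1 TB is recommended"
        )
    if ram_bytes <= MIN_RAM_BYTES:
        problems.append(
            f"RAM is {ram_bytes / 1e9:.0f} GB; more than 100 GB is recommended"
        )
    return problems


if __name__ == "__main__":
    for problem in check_resources(free_disk_bytes(), total_ram_bytes()):
        print("WARNING:", problem)
```

A warning here does not block anything; it only signals that starting with the UNI traces or lowering the number of parallel jobs may be the safer choice.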
Here are the traffic traces used in the experiments.
- Trace: equinix Chicago dir.A 2016-01-21 13:00 - 13:59
- Link: https://www.caida.org/catalog/datasets/passive_dataset_download/ (approval required by CAIDA)
- Trace: MAWI 2019-04-09 18:30 - 19:45
- Link: https://mawi.wide.ad.jp/mawi/ditl/ditl2019/
- Trace: UNI2 2010-01-22 20:02 - 22:40
- Link: https://pages.cs.wisc.edu/~tbenson/IMC_DATA/univ2_trace.tgz
To reproduce the experiments from the paper, download the traces, uncompress them, and store the *.pcap files in the appropriate folders:
./data/caida/pcap/equinix-chicago.dirA.20160121-{hour}.UTC.anon.pcap
./data/mawi/pcap/20190409{hour}.pcap
./data/uni/pcap/univ2_pt{part}
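Before launching the pipeline, it can help to confirm that the trace files are where the layout above expects them. The sketch below is illustrative only and is not part of the repository: `TRACE_PATTERNS` and `find_traces` are hypothetical names, and the `{hour}` and `{part}` placeholders are replaced by a `*` wildcard, which is an assumption about how the files are named.

```python
from pathlib import Path

# Glob patterns derived from the folder layout above.
# Assumption: "{hour}" and "{part}" can be matched with a "*" wildcard.
TRACE_PATTERNS = {
    "caida": "data/caida/pcap/equinix-chicago.dirA.20160121-*.UTC.anon.pcap",
    "mawi": "data/mawi/pcap/20190409*.pcap",
    "uni": "data/uni/pcap/univ2_pt*",
}


def find_traces(root="."):
    """Map each trace name to the sorted list of matching files under `root`."""
    root = Path(root)
    return {
        name: sorted(root.glob(pattern))
        for name, pattern in TRACE_PATTERNS.items()
    }


if __name__ == "__main__":
    for name, files in find_traces().items():
        status = f"{len(files)} file(s)" if files else "no files found"
        print(f"{name}: {status}")
```

Run it from the repository root. A trace reported with no files usually means the archive was not uncompressed or was stored in a different folder.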
Choose one of the following options to install the project.
- Install Docker, and build the image
  $ docker build -t dumbo .
- Create the container
  $ docker run -it -p 8888:8888 dumbo
This project runs on Linux (Ubuntu version >= 22).
- Install the wireshark-common package
  $ sudo apt-get install wireshark-common
- Install Python 3.9 outside of any virtual environment
  $ sudo apt update
  $ sudo apt install python3.9
  $ python --version
- Install and set up Rust
  - Use v1.76.0-nightly (nightly-2024-02-08) and check your version:
    $ cargo --version
  - Install the libpython3.9-dev package on your system:
    $ sudo apt install libpython3.9-dev
  - Deactivate any virtual environment and build the repository:
    $ cargo build -r
- Install conda, and create the required environments
  $ chmod +x ./setup_conda.sh
  $ ./setup_conda.sh
- Clone and patch the YAPS simulator repository
  $ git clone -n https://github.com/NetSys/simulator.git
  $ cd simulator
  $ git checkout -b scheduling_DUMBO 179b64e
  $ git apply < ../scheduling_DUMBO.patch
  $ cd ..
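After a native installation, a quick check that the command-line tools used in the steps above are on the `PATH` can save a failed run later. This is a minimal sketch and not part of the repository: `REQUIRED_TOOLS` and `missing_tools` are hypothetical names, and the tool list is an assumption drawn from the commands shown in this guide.

```python
import shutil

# Executables invoked by the installation steps in this guide.
# Assumption: this list is illustrative, not the project's full dependency set.
REQUIRED_TOOLS = ["python3.9", "cargo", "conda", "git"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed tools were found on the PATH.")
```

This only checks that the executables exist. It does not verify versions, so the Rust toolchain version should still be confirmed with `cargo --version`.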
Run the pipeline to reproduce the experiments on the various traces. Note that this may take several hours.
$ chmod +x ./run.sh
$ ./run.sh caida
$ ./run.sh mawi
$ ./run.sh uni
Additionally, run the model update experiment. Note that this requires complete caida and mawi runs.
$ chmod +x ./run_update_stresstest.sh
$ ./run_update_stresstest.sh
Plot the results using the notebooks in ./plots/
If you used Docker to install the project:
- Run the following inside your container to launch Jupyter:
  $ conda activate /DUMBO/conda_envs/training_env
  $ jupyter notebook --ip 0.0.0.0 --no-browser
- Access Jupyter at localhost:8888 in your web browser thanks to port forwarding.
You can find additional technical documentation about the simulators in ./README_SIMULATOR.md and ./README_DEV.md.
If you have found this paper useful, please cite us using:
@article{dumbo2024,
title={Taming the Elephants: Affordable Flow Length Prediction in the Data Plane},
author={Azorin, Raphael and Monterubbiano, Andrea and Castellano, Gabriele and Gallo, Massimo and Pontarelli, Salvatore and Rossi, Dario},
journal={Proceedings of the ACM on Networking},
volume={2},
number={CoNEXT1},
articleno={5},
numpages={24},
year={2024},
publisher={ACM New York, NY, USA}
}
We would like to thank the authors of pHost and of the YAPS simulator as well as the author of the MetaCost learning implementation.