Ghost Threading Microbenchmark

Overview

This repository contains a microbenchmark to test and evaluate the performance of a compiler that performs automatic Ghost Threading (GT) on modern processors. Ghost Threading is a hardware and software mechanism that allows additional threads to execute in parallel with the main threads without interfering with the primary computation. Our automated Ghost Threading compiler can extract relevant regions of code and perform prefetching using a helper thread. This benchmark evaluates how well such mechanisms perform on both single-core and multi-core architectures, with an emphasis on the compiler's automated extraction of code from the source.

The benchmark is designed to run on systems that support multi-threading and provides performance metrics such as throughput, latency, and resource utilization.

Key Features:

  • Automatically create helper threads with prefetch instructions: The compiler automatically generates helper threads with prefetch instructions to enhance data locality.
  • Algorithm to copy the original loop into the prefetching thread: The benchmark includes an algorithm that copies the original computational loop into the helper thread, minimizing it by removing all unnecessary code, ensuring that only relevant data is prefetched.
  • Hyper-parameters passed as pragmas: Users can customize the behavior of the compiler with hyper-parameters passed directly to the pragmas, controlling the granularity and execution model of Ghost Threading.
  • Scripts to compile and run kernels: The repository includes scripts to compile and execute kernels with automatic ghost threading, manual ghost threading, and baseline execution modes, facilitating easy comparison between different strategies.
  • Result collection: Scripts automatically collect performance metrics for easy analysis.
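
To make the prefetching idea behind the first two features concrete, here is a toy, single-threaded Python simulation. It is only a sketch and not how the compiler works internally: the cache model, the `run` function, and the `distance` parameter are all hypothetical names invented for illustration. The main loop models an indirect access such as `total += data[index[i]]`, and the "helper" keeps only the address computation from that loop, touching the same address a fixed number of iterations ahead.

```python
import random
from collections import OrderedDict


class ToyCache:
    """Tiny LRU cache model (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Touch addr. Return True on a hit, False on a miss (the line is then filled)."""
        hit = addr in self.lines
        if hit:
            self.lines.move_to_end(addr)
        else:
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict the least recently used line
        return hit


def run(index, distance, capacity=64):
    """Count main-loop misses; distance == 0 models the baseline without a helper."""
    cache = ToyCache(capacity)
    misses = 0
    for i in range(len(index)):
        # Helper slice: only the address computation survives, running ahead.
        if distance and i + distance < len(index):
            cache.access(index[i + distance])
        # Main loop: the access whose latency we want to hide.
        if not cache.access(index[i]):
            misses += 1
    return misses


# 1000 distinct pseudo-random addresses, so the baseline misses on every access.
index = random.Random(0).sample(range(10_000), 1000)
print("baseline misses:", run(index, distance=0))
print("with helper    :", run(index, distance=8))
```

In this model only the first `distance` iterations miss once the helper runs ahead. Real hardware adds timing, bandwidth, and synchronization effects that the sketch ignores entirely.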

Installation

Prerequisites

To run the microbenchmark, you will need the following tools and dependencies:

  • x86-64 machine: A physical system with an x86-64 Intel processor that supports Hyper-Threading and has the serialize instruction. At least 94 GB of memory is needed to build the input graphs for some benchmarks. Around 290 GB of disk space is needed for the input graphs.
  • Ghost Threading Compiler (see "Getting the Ghost Threading compiler")
  • CMake (version 3.20+) for the compiler
  • Python (version 3.10+)
  • libpthread (for multi-threaded execution)
  • OpenMP runtime to compare against parallel processing
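
As a quick pre-flight check for the processor requirement, the sketch below scans the text of `/proc/cpuinfo` for CPU flags. It assumes a Linux host and that the kernel reports the features as `ht` and `serialize`; the function name `missing_cpu_flags` is hypothetical. Note that `ht` only says the processor advertises Hyper-Threading capability, not that it is enabled in the firmware.

```python
def missing_cpu_flags(cpuinfo_text, required=("ht", "serialize")):
    """Return the required flags that are absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in required if flag not in present]
    # No 'flags' line found (e.g. not a Linux x86 system): report all as missing.
    return list(required)


# Self-contained demonstration on a shortened, made-up cpuinfo excerpt.
sample = "processor\t: 0\nflags\t\t: fpu sse2 ht avx2\n"
print(missing_cpu_flags(sample))
```

On a real machine you would pass the contents of `/proc/cpuinfo` instead of the sample string; an empty result means both flags were found.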

You can install the required dependencies using the following commands:

For Ubuntu/Debian:

sudo apt-get update
sudo apt-get install python3 llvm clang lld libpthread-stubs0-dev ninja-build cmake
python3 -m ensurepip --upgrade
pip install -r requirements.txt

For macOS:

brew install llvm lld python libpthread-stubs ninja cmake
python3 -m ensurepip --upgrade
pip install -r requirements.txt

Getting the Ghost Threading compiler

Clone the repository:

git clone https://github.com/CompArchCam/GhostThreadingCompiler.git

Build the compiler:

cd GhostThreadingCompiler 
mkdir build && cd build 
export LLVMDIR="/llvm/install/path"
cmake -G Ninja \
  -DCMAKE_INSTALL_PREFIX="${LLVMDIR}" \
  -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_OPTIMIZED_TABLEGEN=On \
  -DLLVM_ENABLE_PROJECTS="clang;lld;openmp" \
  -DLLVM_TARGETS_TO_BUILD="X86" \
  -DLLVM_PARALLEL_COMPILE_JOBS=6 \
  -DLLVM_PARALLEL_LINK_JOBS=4 \
  -DLLVM_USE_LINKER=lld \
  ../llvm

ninja install
export PATH="${LLVMDIR}/bin:$PATH"

Testing Installation

Once installed, you can verify the installation by running the following commands:

opt --help-hidden | grep ghostthreading
clang --version

The first command should list the ghostthreading options known to opt, and the second should print the version of the compiler that you have installed.

Running the Benchmark

Clone the repository:

git clone https://github.com/CompArchCam/ghost-threading-bmk.git

Usage

Running HPC kernels

First, set the appropriate CPU IDs (cpu_m and cpu_s) in the config.sh script.

export LLVMDIR="/llvm/install/path"
cd ghost-threading-bmk/workdir
vim config.sh

cd htpf
./build_htpf.sh -t gt -a exec -k camel -n 3
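
When picking values for cpu_m and cpu_s, it may help to know which logical CPUs share a physical core. The sketch below assumes, and this README does not state it, that cpu_m and cpu_s are meant to be two Hyper-Threading siblings of the same core. On Linux the siblings of CPU 0 are listed in `/sys/devices/system/cpu/cpu0/topology/thread_siblings_list` in a format such as `0,4` or `0-1`. The helper names `parse_cpu_list` and `sibling_pair` are hypothetical.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0,4' or '0-1,8-9' into sorted CPU IDs."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single ID like '4' has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def sibling_pair(text):
    """Return the first two hardware threads of a core, or None if there is only one."""
    cpus = parse_cpu_list(text)
    return (cpus[0], cpus[1]) if len(cpus) >= 2 else None


# Demonstration on typical file contents (made-up values).
print(sibling_pair("0,4\n"))
print(sibling_pair("3\n"))
```

A result of `None` suggests that Hyper-Threading is disabled or unavailable for that core.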

Running GAP kernels

export LLVMDIR="/llvm/install/path"
cd ghost-threading-bmk/workdir/gap
./build_gap.sh -t gt -a exec -k cc -g kron -n 3

Results

Measuring speedup for HPC kernels

cd ghost-threading-bmk/workdir/htpf
./build_htpf.sh -a exec -n 3
python results.py output

Measuring speedup for the GAP cc kernel

cd ghost-threading-bmk/workdir/gap
./build_gap.sh -a exec -k cc -g kron -n 3
python results.py output

To run all the kernels, omit the -k and -g flags. This will take a considerable amount of time, since every kernel is run with all three techniques: gt (automatic ghost threading), tpf (manual ghost threading), and baseline (compiled with -O3).
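
For readers who want to check the numbers by hand, the sketch below shows one common way to define speedup: baseline time divided by technique time, summarized across kernels with a geometric mean. This is an assumption about the metric, not a description of what results.py does or of its output format. The function names and the timings are made up for illustration.

```python
from statistics import geometric_mean


def speedups(baseline_s, technique_s):
    """Per-kernel speedup = baseline time / technique time (above 1.0 means faster)."""
    return {k: baseline_s[k] / technique_s[k] for k in baseline_s if k in technique_s}


def summarize(baseline_s, technique_s):
    """Return the per-kernel speedups and their geometric mean."""
    per_kernel = speedups(baseline_s, technique_s)
    return per_kernel, geometric_mean(per_kernel.values())


# Made-up timings in seconds, keyed by kernel name.
baseline = {"camel": 2.0, "cc": 9.0}
gt = {"camel": 1.0, "cc": 4.5}
per_kernel, overall = summarize(baseline, gt)
print(per_kernel, round(overall, 3))
```

The geometric mean is used here because it treats ratios symmetrically, so one very slow or very fast kernel does not dominate the summary.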

Contribute

We welcome contributions from the community! If you want to improve the benchmark or add new features, follow these steps:

  • Fork the repository.
  • Create a new branch (git checkout -b feature-name).
  • Implement your feature or bug fix.
  • Run the benchmarks and ensure all tests pass.
  • Commit your changes and push them to your fork.
  • Create a pull request describing your changes.

License

MIT License
