Atomic-RMW: GPU Microbenchmark

Introduction

This is a simple microbenchmark that measures atomic read-modify-write (RMW) throughput on GPUs. You can configure thread contention, padding, the number of RMW iterations, and the number of workgroups. The provided scripts can also generate a heatmap across combinations of contention and padding. The program reports the atomic throughput (in atomic operations per microsecond), the duration (in microseconds), and the number of kernel computation errors detected by the kernel validation function.
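As a rough illustration of the reported metric, the sketch below computes throughput as atomic operations divided by elapsed microseconds. It assumes every thread performs exactly `rmw_iterations` atomic operations; the `workgroup_size` parameter and both function names are hypothetical and are not taken from the benchmark source.

```python
def total_atomic_ops(workgroups: int, workgroup_size: int, rmw_iterations: int) -> int:
    """Count atomic RMW operations issued by one kernel launch.

    Assumption: each thread in each workgroup performs exactly
    rmw_iterations atomic operations (workgroup_size is hypothetical).
    """
    return workgroups * workgroup_size * rmw_iterations


def atomic_throughput(atomic_ops: int, duration_us: float) -> float:
    """Return throughput in atomic operations per microsecond."""
    if duration_us <= 0:
        raise ValueError("duration_us must be positive")
    return atomic_ops / duration_us


# Illustrative values only: the duration is made up, not a measured result.
ops = total_atomic_ops(workgroups=46, workgroup_size=256, rmw_iterations=4096)
print(atomic_throughput(ops, duration_us=1000.0))
```

Dividing by the measured duration rather than a fixed time budget keeps the metric comparable across runs with different iteration counts.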

Prerequisites

Before you begin, ensure you have set up the following requirements:

Frameworks and Tools

Vulkan:
- Vulkan SDK
- clspv
- Android ADB / NDK
- make utility for building
- Python

Installation

To install the necessary dependencies, follow these steps:

Clone the repository:

```
git clone https://github.com/ucsc-chpl/gpu-atomic-rmw-microbenchmark.git
```

Build easyvk:

```
cd gpu-atomic-rmw-microbenchmark/easyvk/
git submodule update --init --recursive
make
```

Usage

To run the microbenchmark on your device, run the commands for your platform below:

Vulkan (Desktop)

Compilation

Navigate to the source directory:
```
cd src/
```
Compile the project:
```
make
```

Running the microbenchmark

Single configuration

To test a single configuration of the microbenchmark:

Run the microbenchmark from the src directory:
```
./atomic_rmw_test -w <workgroups> -d <device> -c <contention> -p <padding> -i <rmw_iterations>
```
- -w <workgroups>: (Required) The number of workgroups to use. Defaults to 1.
- -d <device>: (Optional) The index of the device to use. Defaults to 0.
- -c <contention>: (Optional) The number of threads contending on the same machine word. Defaults to 1.
- -p <padding>: (Optional) The number of machine words between those accessed. Defaults to 1.
- -i <rmw_iterations>: (Optional) The number of RMW iterations. Defaults to 128.
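One plausible reading of the contention and padding parameters is sketched below: groups of `contention` consecutive threads share one machine word, and the words in use are spaced `padding` words apart. This mapping is an assumption made for illustration, and `word_index` is a hypothetical helper; the actual kernel may assign indices differently.

```python
def word_index(thread_id: int, contention: int, padding: int) -> int:
    """Map a global thread id to the index of the word it updates.

    Assumption: `contention` consecutive threads target the same word,
    and successive target words are `padding` words apart.
    """
    if contention < 1 or padding < 1:
        raise ValueError("contention and padding must be at least 1")
    return (thread_id // contention) * padding


# With contention=2 and padding=4, threads pair up on words 0, 4, 8, ...
print([word_index(t, contention=2, padding=4) for t in range(6)])
```

Under this reading, the defaults (contention 1, padding 1) give every thread its own word with no gaps between words.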

Multiple configurations

To test multiple configurations of thread contention and padding and produce a heatmap:

Run the bash script from the src directory:

```
./heatmap_results.sh <workgroups> <device> <rmw_iterations>
```

Generate a heatmap displaying results from the src directory:
```
python3 heatmap_generator.py
```
Example heatmap generation:

For an NVIDIA GeForce RTX 4070, the following parameters work well:
- workgroups: 46
- rmw_iterations: 4096

Vulkan (Android)

Compilation

Navigate to the source directory:
```
cd src/
```
Compile the project for Android:
```
make android
```

Running the microbenchmark

Get the serial number of the connected Android device:
```
adb devices
```

Get supported CPU ABIs:

```
adb -s [SERIAL_NUMBER] shell getprop ro.product.cpu.abilist
# If Android is pre-Lollipop version, use:
adb -s [SERIAL_NUMBER] shell getprop ro.product.cpu.abi
```
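The `ro.product.cpu.abilist` property is a comma-separated list ordered from most to least preferred ABI. The sketch below picks the first entry as the value for [SUPPORTED_CPU]; `preferred_abi` is a hypothetical helper, and it assumes the Android build actually produced a directory for that ABI.

```python
def preferred_abi(abilist: str) -> str:
    """Return the first (most preferred) ABI from getprop output."""
    abis = [abi.strip() for abi in abilist.split(",") if abi.strip()]
    if not abis:
        raise ValueError("no ABIs found in property value")
    return abis[0]


# Example property value as a device might report it (illustrative).
print(preferred_abi("arm64-v8a,armeabi-v7a,armeabi\n"))
```

If the preferred ABI has no matching directory under build/android/obj/local/, fall back to the next entry in the list.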

Copy necessary files:

```
cp *.cinit *.sh build/android/obj/local/[SUPPORTED_CPU]
```

Push files to the Android device:

```
adb -s [SERIAL_NUMBER] push build/android/obj/local/[SUPPORTED_CPU]/ /data/local/tmp/rmw
```

Navigate to microbenchmark on the Android device:

```
adb -s [SERIAL_NUMBER] shell
cd /data/local/tmp/rmw/[SUPPORTED_CPU]
```

Single configuration

To test a single configuration of the microbenchmark:

Run the microbenchmark from the [SUPPORTED_CPU] directory:
```
./atomic_rmw_test -w <workgroups> -d <device> -c <contention> -p <padding> -i <rmw_iterations>
```
- -w <workgroups>: (Required) The number of workgroups to use. Defaults to 1.
- -d <device>: (Optional) The index of the device to use. Defaults to 0.
- -c <contention>: (Optional) The number of threads contending on the same machine word. Defaults to 1.
- -p <padding>: (Optional) The number of machine words between those accessed. Defaults to 1.
- -i <rmw_iterations>: (Optional) The number of RMW iterations. Defaults to 128.

Multiple configurations

To test multiple configurations of thread contention and padding and produce a heatmap:

Run the bash script from the [SUPPORTED_CPU] directory:

```
sh heatmap_results.sh <workgroups> <device> <rmw_iterations>
```

Exit the shell and pull the results file from the Android device:

```
exit
adb -s [SERIAL_NUMBER] pull /data/local/tmp/rmw/[SUPPORTED_CPU]/result.txt .
```

Generate a heatmap displaying results from the src directory:
```
python3 heatmap_generator.py
```
Example heatmap generation:

For a Samsung Xclipse 920, the following parameters work well:
- workgroups: 3
- rmw_iterations: 32768
