Commit a44d9c8

cf-herman authored and GitHub Enterprise committed
TDL:2023.1 rebranding and editorial changes
1 parent 82d0118 commit a44d9c8

20 files changed: +655 -697 lines changed

Hardware_Acceleration/Design_Tutorials/01-convolution-tutorial/README.md

Lines changed: 14 additions & 16 deletions
@@ -6,27 +6,26 @@
 </tr>
 </table>

-
 # Accelerating Video Convolution Filtering Application

 ***Version: Vitis 2023.1***

-This tutorial introduces you to a compute-intensive application that is accelerated using the Xilinx Alveo Data Center accelerator card. It goes through the design of a specific kernel that runs on the FPGA and briefly discusses optimization of the host-side application for performance. The kernel is designed to maximize throughput, and the host application is optimized to transfer data in an effective manner that moves in-between the host and FPGA card. The host application essentially eliminates the data movement latency by overlapping data transfers for multiple kernel calls. Another essential purpose of this tutorial is to show **_how one can easily estimate the performance of hardware kernels that can be built using Vitis HLS and how accurate and close these estimates are to actual hardware performance_**
+This tutorial introduces you to a compute-intensive application that is accelerated using the AMD Alveo Data Center accelerator card. It goes through the design of a specific kernel that runs on the field programmable gate array (FPGA) and briefly discusses optimization of the host-side application for performance. The kernel is designed to maximize throughput, and the host application is optimized to transfer data in an effective manner that moves in-between the host and FPGA card. The host application essentially eliminates the data movement latency by overlapping data transfers for multiple kernel calls. Another essential purpose of this tutorial is to show **how one can easily estimate the performance of hardware kernels that can be built using Vitis HLS and how accurate and close these estimates are to actual hardware performance**.

 ## Introduction to Acceleration

-The first lab is designed to let you quickly experience the acceleration performance that can be achieved by porting the video filter to Xilinx's Alveo accelerator card. The Alveo series cards are designed for accelerating data center applications. However, this tutorial can be adapted to other accelerator cards with some simple changes.
+The first lab is designed to let you quickly experience the acceleration performance that can be achieved by porting the video filter to AMD's Alveo accelerator card. The Alveo series cards are designed for accelerating data center applications. However, this tutorial can be adapted to other accelerator cards with some simple changes.

 The steps to be carried out for this first lab include:

 - Setting up the Vitis application acceleration development flow
 - Running the hardware optimized accelerator and comparing its performance with a baseline of the application

-This lab demonstrates the significant performance gain that can be achieved as compared to CPU performance. Whereas the next labs in this tutorial will illustrate and guide how such performance can be achieved using different optimizations and design techniques for 2D convolution kernels and the host side application.
+This lab demonstrates the significant performance gain that can be achieved as compared to the processor performance. The next labs in this tutorial will illustrate and guide how such performance can be achieved using different optimizations and design techniques for 2D convolution kernels and the host side application.

 ### Cloning the GitHub Repository and Setting Up the Vitis Tool

-To run this tutorial you will need to clone a git repo and also download and extract some compressed files, please follow the instruction given below:
+To run this tutorial, you will need to clone a git repo and also download and extract some compressed files. Use the following instructions:

 #### Clone Git Repo

@@ -38,19 +37,19 @@ git clone https://github.com/Xilinx/Vitis-Tutorials.git

 #### Copy and Extract Large Files

-Copy and extract large files in convolution tutorial directory as follows:
+Copy and extract large files in the convolution tutorial directory as follows:

 ```bash
 cd /VITIS_TUTORIAL_REPO_PATH/Hardware_Acceleration/Design_Tutorials/01-convolution-tutorial
 wget https://www.xilinx.com/bin/public/openDownload?filename=conv_tutorial_files.tar.gz -O conv_tutorial_files.tar.gz
 tar -xvzf conv_tutorial_files.tar.gz
 ```

-**NOTE** : VITIS_TUTORIAL_REPO_PATH is the local directory path where git repo is cloned.
+>**NOTE:** VITIS_TUTORIAL_REPO_PATH is the local directory path where the git repo is cloned.

 #### Setting Up the Vitis Tool

-Setup the application build and runtime environment using the following commands as per your local installation:
+Set up the application build and runtime environment using the following commands as per your local installation:

 ```bash
 source <XILINX_VITIS_INSTALL_PATH>/settings64.sh
@@ -59,15 +58,15 @@ source <XRT_INSTALL_PATH>/setup.sh

 ### Baseline the Application Performance

-The software application processes High Definition(HD) video frames/images with 1920x1080 resolution. It performs convolution on a set of images and prints the summary of performance results. It is used for measuring baseline software performance. Please the set the environment variable that points to tutorial direction relative to repo path as follow:
+The software application processes high definition (HD) video frames/images with 1920x1080 resolution. It performs convolution on a set of images and prints the summary of performance results. It is used for measuring baseline software performance. Set the environment variable that points to tutorial direction relative to the repo path as follows:

 ```bash
 export CONV_TUTORIAL_DIR=/VITIS_TUTORIAL_REPO_PATH/Hardware_Acceleration/Design_Tutorials/01-convolution-tutorial
 ```

-where **VITIS_TUTORIAL_REPO_PATH** is the local path where git repo is placed by the user after cloning.
+where **VITIS_TUTORIAL_REPO_PATH** is the local path where the git repo is placed by the user after cloning.

-**NOTE**: Make sure during all of the labs in this tutorial you have set `CONV_TUTORIAL_DIR` variable appropriately
+>**NOTE:** Make sure during all of the labs in this tutorial you have set the `CONV_TUTORIAL_DIR` variable appropriately.

 Run the application to measure performance as follows:

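Editor's sketch, not part of the commit: the NOTE in the hunk above asks readers to keep `CONV_TUTORIAL_DIR` set for every lab. A small guard such as the following could catch a missing or mistyped path before `run.sh` or `make run` is invoked. The function name `check_conv_tutorial_dir` is hypothetical and does not come from the tutorial.

```bash
# Hypothetical helper (not part of the tutorial): fail early if
# CONV_TUTORIAL_DIR is unset, empty, or not an existing directory.
check_conv_tutorial_dir() {
    if [ -z "${CONV_TUTORIAL_DIR:-}" ]; then
        echo "CONV_TUTORIAL_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$CONV_TUTORIAL_DIR" ]; then
        echo "CONV_TUTORIAL_DIR is not a directory: $CONV_TUTORIAL_DIR" >&2
        return 1
    fi
}

# Possible usage: run the baseline only when the check passes.
# check_conv_tutorial_dir && cd "$CONV_TUTORIAL_DIR/sw_run" && ./run.sh
```

Returning a status instead of calling `exit` keeps the helper safe to use in an interactive shell, where `exit` would close the session.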
@@ -76,7 +75,7 @@ cd $CONV_TUTORIAL_DIR/sw_run
 ./run.sh
 ```

-Results similar to the ones shown below will be printed. Note down the CPU throughput.
+Results similar to the following will be printed. Note down the CPU throughput.

 ```bash
 ----------------------------------------------------------------------------
@@ -97,14 +96,14 @@ CPU Throughput : 12.7112 MB/s

 ### Launching the Host Application

-Now launch the application, which uses FPGA accelerated video convolution filter. The application will be run on an actual FPGA card, also called System Run.
+Now launch the application, which uses a FPGA accelerated video convolution filter. The application will be run on an actual FPGA card, also called System Run.

 ```bash
 cd $CONV_TUTORIAL_DIR
 make run
 ```

-The result summary will be similar to the one given below:
+The result summary will be similar to the following:

 ```bash
 ----------------------------------------------------------------------------
@@ -136,15 +135,14 @@ FPGA Speedup : 68.1764 x

 ### Results

-From the host application console output, it is clear that the FPGA accelerated kernel can outperform CPU-only implementation by a factor of 68x. It is a large gain in terms of performance over CPU. The following labs will illustrate how this performance allows processing more than 3 HD video channels with 1080p resolution in parallel. The tutorial describes how to achieve such performance gains by building a kernel and host application written in C++. The host application uses OpenCL APIs and Xilinx Runtime (XRT) underneath it, demonstrating how to unleash this custom-built hardware kernel's computing power effectively.
+From the host application console output, it is clear that the FPGA accelerated kernel can outperform CPU-only implementation by a factor of 68x. It is a large gain in terms of performance over CPU. The following labs will illustrate how this performance allows processing more than three HD video channels with 1080p resolution in parallel. The tutorial describes how to achieve such performance gains by building a kernel and host application written in C++. The host application uses OpenCL APIs and Xilinx Runtime (XRT) underneath it, demonstrating how to unleash this custom-built hardware kernel's computing power effectively.

 ---------------------------------------

 <p align="center"><b>
 Next Lab Module: <a href="./lab1_app_introduction_performance_estimation.md">Video Convolution Filter : Introduction and Performance Estimation</a>
 </b></p>

-
 <p class="sphinxhide" align="center"><sub>Copyright © 2020–2023 Advanced Micro Devices, Inc</sub></p>

 <p class="sphinxhide" align="center"><sup><a href="https://www.amd.com/en/corporate/copyright">Terms and Conditions</a></sup></p>
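Editor's sketch, not part of the commit: the Results paragraph quotes a speedup of about 68x, and the hunk headers above carry the sample figures `CPU Throughput : 12.7112 MB/s` and `FPGA Speedup : 68.1764 x`. Assuming the speedup is reported as the ratio of FPGA throughput to CPU throughput, and that both sample figures come from comparable runs (neither is confirmed by the diff), the implied FPGA throughput can be estimated as follows. The function name `fpga_throughput` is hypothetical.

```bash
# Illustrative helper (assumption: speedup = FPGA throughput / CPU throughput).
# Arguments: CPU throughput in MB/s, speedup factor.
# Prints the implied FPGA throughput in MB/s with one decimal place.
fpga_throughput() {
    awk -v cpu="$1" -v speedup="$2" 'BEGIN { printf "%.1f\n", cpu * speedup }'
}

# Sample figures taken from the hunk headers in this diff.
fpga_throughput 12.7112 68.1764   # prints 866.6
```

Your own numbers will differ; substitute the CPU throughput noted from `run.sh` and the speedup printed by `make run`.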
