Commit 21a4116

Merge pull request #302 from Exabyte-io/docs/SOF-7534
SOF-7534: QE GPU tutorial
2 parents 1c3409f + a4ba707 commit 21a4116

9 files changed

+308
-36
lines changed

.github/workflows/build-tests.yml

Lines changed: 17 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,6 @@ jobs:
1515
matrix:
1616
os: ["ubuntu-24.04"]
1717
python-version:
18-
- "3.8"
1918
- "3.9"
2019
- "3.10"
2120
- "3.11"
@@ -45,22 +44,22 @@ jobs:
4544
if: (github.repository != 'Exabyte-io/template-definitions-js-py') && (github.ref_name == 'master')
4645

4746
steps:
48-
- name: Checkout this repository
49-
uses: actions/checkout@v4
50-
with:
51-
lfs: true
47+
- name: Checkout this repository
48+
uses: actions/checkout@v4
49+
with:
50+
lfs: true
5251

53-
- name: Checkout actions repository
54-
uses: actions/checkout@v4
55-
with:
56-
repository: Exabyte-io/actions
57-
token: ${{ secrets.BOT_GITHUB_TOKEN }}
58-
path: actions
52+
- name: Checkout actions repository
53+
uses: actions/checkout@v4
54+
with:
55+
repository: Exabyte-io/actions
56+
token: ${{ secrets.BOT_GITHUB_TOKEN }}
57+
path: actions
5958

60-
- name: Publish python release
61-
uses: ./actions/py/publish
62-
with:
63-
python-version: 3.9.x
64-
github-token: ${{ secrets.BOT_GITHUB_TOKEN }}
65-
publish-tag: 'true'
66-
publish-to-pypi: 'false'
59+
- name: Publish python release
60+
uses: ./actions/py/publish
61+
with:
62+
python-version: "3.10"
63+
github-token: ${{ secrets.BOT_GITHUB_TOKEN }}
64+
publish-tag: "true"
65+
publish-to-pypi: "false"

.github/workflows/s3-deploy.yml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,9 +3,9 @@ name: Update S3 deploy
33
on:
44
push:
55
branches:
6-
- 'master'
6+
- "master"
77
schedule:
8-
- cron: '0 0 1 1 *'
8+
- cron: "0 0 1 1 *"
99
workflow_dispatch:
1010

1111
jobs:
@@ -26,7 +26,7 @@ jobs:
2626
- name: Set python 3 version
2727
uses: actions/setup-python@v5
2828
with:
29-
python-version: "3.8"
29+
python-version: "3.10"
3030

3131
- name: Build pages
3232
uses: Exabyte-io/action-mkdocs-build@main

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66

77
For a quick installation:
88

9-
1. Install dependencies: python 3 (tested on Python `3.8`-`3.13`), `pip`, `curl`, [`virtualenv`](https://virtualenv.pypa.io/en/latest/installation/), git, [git-lfs](https://git-lfs.github.com/).
9+
1. Install dependencies: python 3 (tested on Python `3.9`-`3.13`), `pip`, `curl`, [`virtualenv`](https://virtualenv.pypa.io/en/latest/installation/), git, [git-lfs](https://git-lfs.github.com/).
1010

1111
2. Clone this repository:
1212

Lines changed: 3 additions & 0 deletions
Lines changed: 174 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,174 @@
1+
{
2+
"descriptionLinks": [
3+
"Accelerate Quantum ESPRESSO simulation with GPUs: https://docs.mat3ra.com/tutorials/jobs-cli/qe-gpu/"
4+
],
5+
"description": "We walk through a step-by-step example of running a Quantum ESPRESSO job on a GPU-enabled node. We see a significant performance improvement by using the CUDA/GPU-enabled version of Quantum ESPRESSO.",
6+
"tags": [
7+
{
8+
"...": "../../metadata/general.json#/tags"
9+
},
10+
{
11+
"...": "../../models-directory/dft.json#/tags"
12+
},
13+
{
14+
"...": "../../software-directory/modeling/quantum-espresso.json#/tags"
15+
},
16+
"CUDA",
17+
"GPU",
18+
"NVIDIA"
19+
],
20+
"title": "Mat3ra Tutorial: Accelerate Quantum ESPRESSO simulation with GPUs",
21+
"youTubeCaptions": [
22+
{
23+
"text": "Hello, and welcome to the matera tutorial series.",
24+
"startTime": "00:00:00.000",
25+
"endTime": "00:00:03.000"
26+
},
27+
{
28+
"text": "In today's tutorial, we will go through a step-by-step example of running a Quantum ESPRESSO simulation on one of our GPU enabled compute nodes.",
29+
"startTime": "00:00:04.000",
30+
"endTime": "00:00:14.000"
31+
},
32+
{
33+
"text": "We will see how we can dramatically improve the performance of our simulation using GPUs.",
34+
"startTime": "00:00:15.000",
35+
"endTime": "00:00:20.000"
36+
},
37+
{
38+
"text": "At the moment, GPU build of Quantum ESPRESSO is only available via our command line interface, and soon it will be made available in the web interface.",
39+
"startTime": "00:00:21.000",
40+
"endTime": "00:00:30.000"
41+
},
42+
{
43+
"text": "Let's connect to the login node using SSH.",
44+
"startTime": "00:00:31.000",
45+
"endTime": "00:00:34.000"
46+
},
47+
{
48+
"text": "You can use your terminal application and type S S H, your username at login dot matera dot com and press enter.",
49+
"startTime": "00:00:35.000",
50+
"endTime": "00:00:41.000"
51+
},
52+
{
53+
"text": "If you need help on how to set up S S H, please visit our documentation site at docs dot matera dot com, and search S S H.",
54+
"startTime": "00:00:42.000",
55+
"endTime": "00:00:51.000"
56+
},
57+
{
58+
"text": "Here you will find step by step guide to setup S S H key for seamless authentication.",
59+
"startTime": "00:00:52.000",
60+
"endTime": "00:00:57.000"
61+
},
62+
{
63+
"text": "Note that it is also possible to connect to the login node from our web platform using the web terminal.",
64+
"startTime": "00:00:58.000",
65+
"endTime": "00:01:04.000"
66+
},
67+
{
68+
"text": "Besides, <break time='0.5'/> it is also possible to run a command line job via bash workflow in our web platform.",
69+
"startTime": "00:01:05.000",
70+
"endTime": "00:01:12.000"
71+
},
72+
{
73+
"text": "Create a new workflow. Select shell script as application.",
74+
"startTime": "00:01:13.000",
75+
"endTime": "00:01:16.000"
76+
},
77+
{
78+
"text": "Add an execution unit and write your job script.",
79+
"startTime": "00:01:17.000",
80+
"endTime": "00:01:20.000"
81+
},
82+
{
83+
"text": "For now, let's focus on the command line part.",
84+
"startTime": "00:01:22.000",
85+
"endTime": "00:01:24.000"
86+
},
87+
{
88+
"text": "The example calculation we are going to demonstrate is available in our github repository C L I job examples.",
89+
"startTime": "00:01:25.000",
90+
"endTime": "00:01:33.000"
91+
},
92+
{
93+
"text": "Please browse under espresso, then gpu, where you will find required input and reference output files.",
94+
"startTime": "00:01:34.000",
95+
"endTime": "00:01:39.000"
96+
},
97+
{
98+
"text": "Once connected to the login node, let's navigate to your working directory, and clone our example repository.",
99+
"startTime": "00:01:40.000",
100+
"endTime": "00:01:47.000"
101+
},
102+
{
103+
"text": "After cloning the repository, we also need to sync the L F S objects with git L F S pull.",
104+
"startTime": "00:01:50.000",
105+
"endTime": "00:01:56.000"
106+
},
107+
{
108+
"text": "Let's navigate to our GPU example.",
109+
"startTime": "00:01:57.000",
110+
"endTime": "00:02:00.000"
111+
},
112+
{
113+
"text": "Let's examine the P B S job script.",
114+
"startTime": "00:02:03.000",
115+
"endTime": "00:02:05.000"
116+
},
117+
{
118+
"text": "We will run our job in GPU enabled G O F queue, we will request one node which has eight CPUs.",
119+
"startTime": "00:02:07.000",
120+
"endTime": "00:02:13.000"
121+
},
122+
{
123+
"text": "To run quantum espresso jobs in GPUs, we need to load the CUDA build of quantum espresso.",
124+
"startTime": "00:02:14.000",
125+
"endTime": "00:02:19.000"
126+
},
127+
{
128+
"text": "We set eight open M P threads and 1 M P I per GPU.",
129+
"startTime": "00:02:20.000",
130+
"endTime": "00:02:24.000"
131+
},
132+
{
133+
"text": "We can also set parallelization options for k point and matrix diagonalization.",
134+
"startTime": "00:02:25.000",
135+
"endTime": "00:02:30.000"
136+
},
137+
{
138+
"text": "Finally, we can submit our job with Q sub command. We can find the status of job with Q stat.",
139+
"startTime": "00:02:31.000",
140+
"endTime": "00:02:37.000"
141+
},
142+
{
143+
"text": "Once the job is completed, we can examine the output file.",
144+
"startTime": "00:02:38.000",
145+
"endTime": "00:02:41.000"
146+
},
147+
{
148+
"text": "We will see that the GPU acceleration was enabled for the calculation.",
149+
"startTime": "00:02:44.000",
150+
"endTime": "00:02:49.000"
151+
},
152+
{
153+
"text": "If we scroll to the bottom of the file, we will see the total time taken by the program. The wall time for this job was slightly less than a minute.",
154+
"startTime": "00:02:50.000",
155+
"endTime": "00:02:58.000"
156+
},
157+
{
158+
"text": "For comparison, we ran the same job using eight CPUs but without GPU acceleration, <break time='0.5'/> it took about 20 times longer.",
159+
"startTime": "00:03:02.000",
160+
"endTime": "00:03:10.000"
161+
},
162+
{
163+
"text": "Now you may test different combination of M P I and open M P threads, different parallelization option, and see what gives you the best performance.",
164+
"startTime": "00:03:11.000",
165+
"endTime": "00:03:20.000"
166+
},
167+
{
168+
"text": "Thank you for watching this tutorial and using our platform.",
169+
"startTime": "00:03:21.000",
170+
"endTime": "00:03:24.000"
171+
}
172+
],
173+
"youTubeId": "trLDEwWc3ho"
174+
}
Lines changed: 90 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,90 @@
1+
---
2+
tags:
3+
- GPU
4+
- CUDA
5+
hide:
6+
- tags
7+
---
8+
# Accelerate Quantum ESPRESSO simulation with GPUs
9+
10+
We will walk through a step-by-step example of running a Quantum ESPRESSO job on
11+
GPUs. As of the time of writing, the GPU (CUDA) build of Quantum ESPRESSO is
12+
only available via the Command Line Interface (CLI). We will see that we can
13+
dramatically speed up our Quantum ESPRESSO simulation by using GPUs.
14+
15+
1. First, connect to the login node via an [SSH client](../../remote-connection/ssh.md),
16+
or the [web terminal](../../remote-connection/web-terminal.md). Note that it is also
17+
possible to run CLI jobs by creating a [bash workflow](
18+
../../software-directory/scripting/shell/overview.md).
19+
20+
![Web Terminal](../../images/jobs-cli/open-web-terminal.webp)
21+
22+
2. The example job that we are going to run is available in the Git repository
23+
[exabyte-io/cli-job-examples](https://github.com/exabyte-io/cli-job-examples).
24+
You may clone the repository to your working directory:
25+
```bash
26+
git clone https://github.com/exabyte-io/cli-job-examples
27+
cd cli-job-examples
28+
git lfs pull
29+
cd espresso/gpu
30+
```
31+
32+
3. You will find all required input files and job script under `espresso/gpu`.
33+
Please review the input files and PBS job script, update the project name, and
34+
other parameters as necessary.
35+
36+
4. We will use the [GOF](../../infrastructure/clusters/aws.md#hardware-specifications)
37+
queue, which comprises 8 CPUs and 1 NVIDIA V100 GPU per node.
38+
39+
5. Since our compute node contains 8 CPUs with 1 GPU, we will run 1 MPI process
40+
with 8 OpenMP threads.
41+
```bash
42+
module load espresso/7.4-cuda-12.4-cc-70
43+
export OMP_NUM_THREADS=8
44+
mpirun -np 1 pw.x -npool 1 -ndiag 1 -in pw.cuo.scf.in > pw.cuo.gpu.scf.out
45+
```
46+
47+
6. Finally, we can submit our job using:
48+
```bash
49+
qsub job.gpu.pbs
50+
```
51+
52+
7. Once the job is completed, we can inspect the output file `pw.cuo.gpu.scf.out`.
53+
We will see that the GPU was used and that the job took about 1 minute of wall time.
54+
```
55+
Parallel version (MPI & OpenMP), running on 8 processor cores
56+
Number of MPI processes: 1
57+
Threads/MPI process: 8
58+
...
59+
60+
GPU acceleration is ACTIVE. 1 visible GPUs per MPI rank
61+
GPU-aware MPI enabled
62+
...
63+
64+
Parallel routines
65+
66+
PWSCF : 37.94s CPU 50.77s WALL
67+
```
68+
69+
8. For comparison, we ran the same calculation using only CPUs, and it took
70+
about 20 times longer.
71+
```
72+
Parallel version (MPI), running on 8 processors
73+
74+
MPI processes distributed on 1 nodes
75+
...
76+
77+
Parallel routines
78+
79+
PWSCF : 18m 0.56s CPU 18m25.33s WALL
80+
```
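
The comparison in steps 7 and 8 can be checked directly from the two output files. The following is a minimal sketch, not part of the example repository: the GPU output name `pw.cuo.gpu.scf.out` comes from the steps above, while `pw.cuo.cpu.scf.out` is a hypothetical name for the CPU-only output, and all helper function names are illustrative.

```bash
# Succeeds if the output reports that GPU acceleration was enabled.
gpu_active() {
    grep -q "GPU acceleration is ACTIVE" "$1"
}

# Print the wall time from the last "PWSCF : ... CPU ... WALL" line,
# for example "50.77s" or "18m25.33s" (any inner spaces are removed).
wall_time() {
    awk '/PWSCF/ && /WALL/ {
             t = $0
             sub(/.*CPU/, "", t)     # drop everything up to the CPU column
             sub(/WALL.*/, "", t)    # drop the trailing WALL label
             gsub(/[ \t]/, "", t)
             last = t
         }
         END { print last }' "$1"
}

# Convert a time string such as "18m25.33s" or "50.77s" to seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        secs = 0
        if (match(t, /[0-9]+h/)) { secs += substr(t, RSTART, RLENGTH - 1) * 3600; t = substr(t, RSTART + RLENGTH) }
        if (match(t, /[0-9]+m/)) { secs += substr(t, RSTART, RLENGTH - 1) * 60;   t = substr(t, RSTART + RLENGTH) }
        if (match(t, /[0-9.]+s/)) { secs += substr(t, RSTART, RLENGTH - 1) }
        printf "%.2f\n", secs
    }'
}

# Ratio of two times in seconds, rounded to one decimal place.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

gpu_out="pw.cuo.gpu.scf.out"
cpu_out="pw.cuo.cpu.scf.out"   # hypothetical name for the CPU-only run

if [ -f "$gpu_out" ] && [ -f "$cpu_out" ]; then
    gpu_active "$gpu_out" && echo "GPU acceleration was enabled"
    gpu_s=$(to_seconds "$(wall_time "$gpu_out")")
    cpu_s=$(to_seconds "$(wall_time "$cpu_out")")
    echo "GPU wall time: ${gpu_s}s, CPU wall time: ${cpu_s}s"
    echo "Speedup: $(speedup "$cpu_s" "$gpu_s")x"
fi
```

With the wall times quoted above (50.77s and 18m25.33s, that is, 1105.33 seconds), the ratio works out to roughly 21.8, consistent with "about 20 times longer".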
81+
82+
You may experiment with different combinations of MPI and OpenMP, various
83+
[parallelization options](https://www.quantum-espresso.org/Doc/user_guide/node20.html),
84+
and find what gives you the best performance.
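
One way to organize such an experiment is to enumerate the MPI/OpenMP splits that fill the 8 cores of the node and print the corresponding commands for review. This is a sketch under assumptions: the per-run output file names are hypothetical, `-npool` and `-ndiag` are kept at 1 as in the example above, and the commands are only printed, not executed. Whether more than one MPI process per GPU helps depends on the build and the system, so treat the list as candidates to benchmark rather than as recommendations.

```bash
# Print every "MPI_processes OpenMP_threads" pair whose product equals the core count.
list_combos() {
    cores="$1"
    np=1
    while [ "$np" -le "$cores" ]; do
        if [ $((cores % np)) -eq 0 ]; then
            echo "$np $((cores / np))"
        fi
        np=$((np + 1))
    done
}

# Dry run: print one candidate command per combination instead of running it.
list_combos 8 | while read -r np threads; do
    # The output file name pattern below is hypothetical.
    echo "OMP_NUM_THREADS=$threads mpirun -np $np pw.x -npool 1 -ndiag 1 -in pw.cuo.scf.in > pw.cuo.np${np}.omp${threads}.out"
done
```

Each printed line can then be placed in a copy of the job script and submitted separately, so that the `WALL` times of the resulting output files can be compared.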
85+
86+
## Step-by-step screenshare video
87+
88+
<div class="video-wrapper">
89+
<iframe class="gifffer" width="100%" height="100%" src="https://www.youtube.com/embed/trLDEwWc3ho" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
90+
</div>

mkdocs.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -153,6 +153,7 @@ nav:
153153
- Overview: tutorials/jobs-cli/overview.md
154154
- Create + run a CLI Job: tutorials/jobs-cli/job-cli-example.md
155155
- Import a CLI Job to Web Interface: tutorials/jobs-cli/cli-job-import.md
156+
- QE GPU Job: tutorials/jobs-cli/qe-gpu.md
156157
- Templating:
157158
- Overview: tutorials/templating/overview.md
158159
- Flags by Elemental Composition: tutorials/templating/set-flag-by-composition.md

netlify.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,5 +3,5 @@
33
publish = "site/"
44

55
[build.environment]
6-
PYTHON_VERSION = "3.8"
6+
PYTHON_VERSION = "3.10"
77
NODE_VERSION = "20"
