forked from abacusmodeling/abacus-develop
Open
Labels
Performance (Issues related to fail running ABACUS)
Description
Details
The main bottleneck is CPU computation: most of the run time is spent in CPU calculations, while the GPU is active for less than 20% of the total time. For multi-GPU runs, I have correctly compiled the CUDA version of ELPA. In practice, however, the speedup seen on multi-GPU systems comes mainly from the proportionally larger number of CPU cores allocated, not from the extra GPUs. Specifically, scaling from [6 cores + 1 V100 SXM2 16GB GPU] to [24 cores + 4 V100 SXM2 16GB GPUs] can give about a 2.5x speedup, but 24 cores with a single V100 SXM2 16GB GPU already gives more than 2x.
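To put the numbers above in perspective, here is a rough back-of-the-envelope sketch using Amdahl's law. It is not a measurement or part of ABACUS. It assumes that the GPU-active share of the run (taken here as the 20% upper bound) is the only part that extra GPUs can accelerate and that CPU and GPU work do not overlap. The function names are hypothetical helpers for illustration.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the runtime is accelerated by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def parallel_efficiency(speedup, resource_ratio):
    """Observed speedup divided by the factor by which resources were scaled."""
    return speedup / resource_ratio


# Assumption: GPU-active share of the run, using the reported "< 20%" as an upper bound.
gpu_fraction = 0.20

# Best case from going 1 -> 4 GPUs if only the GPU part scales (ideal 4x on that part).
four_gpu_gain = amdahl_speedup(gpu_fraction, 4.0)         # about 1.18x

# Ceiling even if the GPU part took no time at all.
gpu_ceiling = amdahl_speedup(gpu_fraction, float("inf"))  # about 1.25x

# Reported: 4x the cores and 4x the GPUs gave about 2.5x overall.
efficiency = parallel_efficiency(2.5, 4.0)                # 0.625

print(f"ideal gain from 4 GPUs alone: {four_gpu_gain:.2f}x")
print(f"ceiling from GPU acceleration: {gpu_ceiling:.2f}x")
print(f"parallel efficiency of 6c+1GPU -> 24c+4GPU: {efficiency:.1%}")
```

Under these assumptions, adding GPUs alone could not give much more than about 1.2x, which is consistent with most of the observed 2.5x already appearing (more than 2x) when only the core count is raised to 24.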