My LCAO HSE calculations are slow outside of the SCF iterations. #6309

Open
@pJahad

Description

Details

I'm running calculations using the input files from the ABACUS Test Report for HSE (#6294), but everything outside the SCF iterations is very slow. For example, with Si, the LOCAL POTENTIAL step alone takes about 31 seconds, as shown here:

 Initial plane wave basis and FFT box
 ---------------------------------------------------------
 DONE(0.260526     SEC) : INIT PLANEWAVE
 DONE(30.8853     SEC) : LOCAL POTENTIAL

Additionally, it takes a long time for files to be output after SCF convergence is achieved. I'm also including the TIME STATISTICS from the stdout:

----------------------------------------------------------------------
      CLASS_NAME             NAME         TIME/s  CALLS   AVG/s  PER/%  
----------------------------------------------------------------------
                     total              214.96 13        16.54  100.00 
 Driver              atomic_world       214.96 1         214.96 100.00 
 ESolver_KS_LCAO     before_all_runners 33.02  1         33.02  15.36  
 NOrbital_Lm         extra_uniform      6.49   1875      0.00   3.02   
 Mathzone_Add1       Uni_Deriv_Phi      6.33   1875      0.00   2.94   
 Exx_LRI             init               32.38  1         32.38  15.06  
 Matrix_Orbs21       init               6.05   2         3.03   2.82   
 Matrix_Orbs21       init_radial_table  18.78  2         9.39   8.74   
 Center2_Orb         cal_ST_Phi12_R     15.67  3439      0.00   7.29   
 LRI_CV              set_orbitals       19.27  1         19.27  8.96   
 Matrix_Orbs11       init_radial_table  5.77   1         5.77   2.69   
 Ions                opt_ions           181.84 1         181.84 84.59  
 ESolver_KS          runner             128.16 1         128.16 59.62  
 ESolver_KS_LCAO     before_scf         3.01   1         3.01   1.40   
 Exx_LRI             cal_exx_ions       2.95   1         2.95   1.37   
 Potential           cal_veff           2.22   35        0.06   1.03   
 PotXC               cal_veff           2.19   35        0.06   1.02   
 XC_Functional       v_xc               52.51  22        2.39   24.43  
 HSolverLCAO         solve              7.82   34        0.23   3.64   
 HamiltLCAO          updateHk           2.31   3536      0.00   1.07   
 HSolverLCAO         hamiltSolvePsiK    3.87   3536      0.00   1.80   
 DiagoElpa           elpa_solve         3.10   3536      0.00   1.44   
 RI_2D_Comm          split_m2D_ktoR     100.91 7         14.42  46.94  
 Exx_LRI             cal_exx_elec       12.43  7         1.78   5.78   
 XC_Functional_Libxc v_xc_libxc         2.16   27        0.08   1.01   
 ESolver_KS_LCAO     cal_force          53.68  1         53.68  24.97  
 Force_Stress_LCAO   getForceStress     53.68  1         53.68  24.97  
 Exx_LRI             cal_exx_force      17.47  1         17.47  8.13   
 Exx_LRI             cal_exx_stress     36.11  1         36.11  16.80  
----------------------------------------------------------------------
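To make the table easier to read, here is a minimal Python sketch that parses rows in this layout and ranks them by wall time. The helper names (`parse_rows`, `top_hotspots`) are hypothetical and not part of ABACUS; the sample rows are copied from the table above. Note that the timers appear to be nested (for example, `atomic_world` accounts for 100% on its own), so the entries should be compared rather than summed.

```python
import re

# A few rows copied verbatim from the TIME STATISTICS table above.
SAMPLE = """\
 Driver              atomic_world       214.96 1         214.96 100.00
 ESolver_KS_LCAO     before_all_runners 33.02  1         33.02  15.36
 Exx_LRI             init               32.38  1         32.38  15.06
 XC_Functional       v_xc               52.51  22        2.39   24.43
 RI_2D_Comm          split_m2D_ktoR     100.91 7         14.42  46.94
 Exx_LRI             cal_exx_stress     36.11  1         36.11  16.80
"""

# CLASS_NAME, NAME, TIME/s, CALLS, AVG/s, PER/% separated by whitespace.
# Header, separator, and the class-less "total" row do not match and are skipped.
ROW = re.compile(
    r"^\s*(\S+)\s+(\S+)\s+([\d.]+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_rows(text):
    """Return one dict per timer row found in `text`."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            cls, name, time_s, calls, avg_s, pct = match.groups()
            rows.append({
                "class": cls,
                "name": name,
                "time": float(time_s),
                "calls": int(calls),
                "avg": float(avg_s),
                "pct": float(pct),
            })
    return rows


def top_hotspots(text, n=3):
    """Return the `n` rows with the largest total time, largest first."""
    return sorted(parse_rows(text), key=lambda r: r["time"], reverse=True)[:n]


if __name__ == "__main__":
    for row in top_hotspots(SAMPLE, n=5):
        print(f'{row["class"]:<20}{row["name"]:<22}{row["time"]:>8.2f} s')
```

Pasting the full table into `SAMPLE` (or reading it from the saved stdout) works the same way. In this run, the largest entries outside the diagonalization are `RI_2D_Comm split_m2D_ktoR`, `ESolver_KS_LCAO cal_force`, and `XC_Functional v_xc`.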

My execution environment is as follows:
ABACUS version: v3.9.0.7
Compilation: Dockerfile.intel with intel-oneapi-mkl set to 2025.1
Command: OMP_NUM_THREADS=16 mpirun -np 2 abacus
CPU: 32 cores of Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz

Is this kind of speed expected? Do you have any advice?

Task list for Issue attackers (only for developers)

  • Reproduce the performance issue on a similar system or environment.
  • Identify the specific section of the code causing the performance issue.
  • Investigate the issue and determine the root cause.
  • Research best practices and potential solutions for the identified performance issue.
  • Implement the chosen solution to address the performance issue.
  • Test the implemented solution to ensure it improves performance without introducing new issues.
  • Optimize the solution if necessary, considering trade-offs between performance and other factors (e.g., code complexity, readability, maintainability).
  • Review and incorporate any relevant feedback from users or developers.
  • Merge the improved solution into the main codebase and notify the issue reporter.

Metadata

Labels

EXX and lr-TDDFT: Related to EXX or lr-TDDFT
Performance: Issues related to fail running ABACUS
