The efficiency of FFT with OpenMP actually decreases in multi-core scenario

### Details

When testing Si16 with OMP=12 and `mpirun -n 1 abacus`, the FFT should in theory run about 12 times faster than on a single core. In practice, the OMP=12 run is much slower than the single-core run. Moreover, the FFT consumes 80% of the total runtime, which makes it the bottleneck of the multi-core plane wave (pw) component. Here is the timing table of the pw run:

![Image](https://github.com/user-attachments/assets/fa63cae7-ae55-40c9-952c-983ede41068b)
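To make the gap between the expected and the observed scaling concrete, speedup and parallel efficiency can be computed from two wall times. The following is a minimal sketch; `parallel_metrics` is a hypothetical helper, and the timings in the example are placeholders, not the values from the table above.

```python
from typing import Tuple


def parallel_metrics(t_serial: float, t_parallel: float, n_threads: int) -> Tuple[float, float]:
    """Return (speedup, efficiency) for a threaded run.

    speedup    = t_serial / t_parallel
    efficiency = speedup / n_threads   (1.0 means ideal linear scaling)
    """
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    speedup = t_serial / t_parallel
    efficiency = speedup / n_threads
    return speedup, efficiency


if __name__ == "__main__":
    # Placeholder timings in seconds (NOT measured values from this issue).
    speedup, efficiency = parallel_metrics(t_serial=10.0, t_parallel=25.0, n_threads=12)
    print(f"speedup = {speedup:.2f}x, efficiency = {efficiency:.1%}")
```

A speedup below 1 means the threaded run is slower than the single-core run, which is the situation reported here.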

Here is the input file used for the test:

```
INPUT_PARAMETERS
#Parameters (1.General)
suffix              autotest
calculation         scf

#nbands             8
symmetry            1

#Parameters (2.Iteration)
ecutwfc             60
scf_thr             1e-8
scf_nmax            100
cal_force           1
cal_stress          1
#Parameters (3.Basis)
basis_type          pw

#Parameters (4.Smearing)
smearing_method     gauss
smearing_sigma      0.002

#Parameters (5.Mixing)
mixing_type         broyden
mixing_beta         0.3
ks_solver           dav
```
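The single-core versus 12-thread comparison could be scripted roughly as follows. This is a sketch under several assumptions: that "OMP=12" refers to the standard `OMP_NUM_THREADS` environment variable, that `abacus` and `mpirun` are on the `PATH`, and that the working directory holds the input file above together with the structure and pseudopotential files, which are not shown in this issue. The helpers `build_env` and `time_run` are hypothetical names. The script measures total wall time only; the per-routine FFT share still has to be read from the timing table that ABACUS prints.

```python
import os
import subprocess
import time


def build_env(n_threads, base_env=None):
    """Copy the environment and set OMP_NUM_THREADS (assumed meaning of 'OMP=12')."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_threads)
    return env


def time_run(n_threads, workdir=".", cmd=("mpirun", "-n", "1", "abacus")):
    """Run the command once in workdir and return the elapsed wall time in seconds."""
    start = time.perf_counter()
    subprocess.run(list(cmd), cwd=workdir, env=build_env(n_threads), check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Compare one OpenMP thread against twelve, each with a single MPI rank.
    for n in (1, 12):
        elapsed = time_run(n)
        print(f"OMP_NUM_THREADS={n}: {elapsed:.1f} s")
```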


### Task list for Issue attackers (only for developers)

- [x] Reproduce the performance issue on a similar system or environment.
- [x] Identify the specific section of the code causing the performance issue.
- [ ] Investigate the issue and determine the root cause.
- [ ] Research best practices and potential solutions for the identified performance issue.
- [ ] Implement the chosen solution to address the performance issue.
- [ ] Test the implemented solution to ensure it improves performance without introducing new issues.
- [ ] Optimize the solution if necessary, considering trade-offs between performance and other factors (e.g., code complexity, readability, maintainability).
- [ ] Review and incorporate any relevant feedback from users or developers.
- [x] Merge the improved solution into the main codebase and notify the issue reporter.
