RegCM first benchmarks ESiWACE2 #16
Hi, I am opening this issue to discuss my preliminary findings for the med22 benchmark case (164 x 288 grid points, 23 vertical levels). I have run a preliminary benchmark on the Dutch national supercomputer Snellius using both the GNU and Intel compilers. The system consists of dual-socket nodes with 64 cores per CPU, and I observe that this case reaches roughly its maximum performance (11 simulated years per day) on 4 nodes.

@graziano-giuliani, is this in line with your own performance numbers, or am I missing crucial flags or options? By the way, the Intel MPI library seems to do a slightly worse job here...
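For reference, a quick back-of-the-envelope sketch (Python; the node counts are illustrative, and 128 cores/node follows from the dual-socket 64-core CPUs) of how the per-rank workload shrinks with node count on this grid, and what 11 SYPD implies in wall-clock terms:

```python
# Per-core workload for the med22 grid on Snellius-like nodes, and the
# wall-clock time implied by the reported 11 SYPD. Node counts are
# illustrative assumptions, not measured configurations.
NX, NY, NLEV = 164, 288, 23      # med22 horizontal grid, vertical levels
CORES_PER_NODE = 128             # dual-socket node, 64 cores per CPU

def points_per_core(nodes: int) -> float:
    """Horizontal grid points owned by each MPI rank (uniform split)."""
    return NX * NY / (nodes * CORES_PER_NODE)

def seconds_per_sim_day(sypd: float) -> float:
    """Wall-clock seconds needed per simulated day at a given SYPD."""
    return 86400.0 / (sypd * 365.0)

for nodes in (1, 2, 4, 8):
    print(f"{nodes} node(s): {points_per_core(nodes):6.1f} points/core")
print(f"11 SYPD -> {seconds_per_sim_day(11):.1f} s wall time per simulated day")
# At 4 nodes each rank owns only ~92 horizontal columns, so there is
# little compute left to amortize the halo exchanges.
```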
Comments

@goord: The behavior is typical of this class of finite-difference stenciled atmospheric codes on distributed-memory machines. At some point the communication overhead becomes greater than the computational work: as we scale out, we increase the number of communications while reducing the data payload per message, and we decrease the amount of computation per core.
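To put a rough number on that argument, a minimal model (Python; the near-square 2D decomposition and one-point halos are assumptions, not necessarily RegCM's actual scheme) of how the halo-to-interior ratio grows with rank count on the med22 grid:

```python
# Back-of-the-envelope strong-scaling model for a 2D halo-exchange
# stencil code. Decomposition and halo width are illustrative assumptions.
import math

NX, NY = 164, 288   # med22 horizontal grid

def halo_to_interior(nranks: int, halo_width: int = 1) -> float:
    """Ratio of halo points communicated to interior points computed
    per rank, for a near-square 2D domain decomposition."""
    px = int(math.sqrt(nranks))          # near-square process grid
    while nranks % px:
        px -= 1
    py = nranks // px
    sx, sy = NX / px, NY / py            # local subdomain size
    halo = 2 * halo_width * (sx + sy)    # points on the four exchanged edges
    interior = sx * sy                   # points each rank actually updates
    return halo / interior

for n in (128, 256, 512, 1024):
    print(f"{n:5d} ranks: halo/interior = {halo_to_interior(n):.2f}")
# The ratio grows roughly like sqrt(nranks): per-rank compute shrinks
# faster than the halo traffic, so messaging eventually dominates.
```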
Hi @graziano-giuliani, thanks for the feedback. If the med22 case is representative of the higher-resolution test cases (in terms of physics, transport schemes, etc.), then I propose we profile the application on 1 or 2 nodes to make sure we are not communication-bound.
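For illustration of the number such a profile should give us, here is a toy mpi4py microbenchmark (a sketch only; the 1D ring exchange and per-rank sizes are made-up stand-ins for RegCM's real decomposition) that reports the fraction of runtime spent in halo exchange versus stencil compute:

```python
# Toy stand-in (mpi4py + numpy, not RegCM code) for the measurement a
# profiler should provide: the fraction of runtime spent in halo
# exchanges versus stencil computation.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
left, right = (rank - 1) % size, (rank + 1) % size

nx, ny, nlev = 10, 9, 23                       # ~per-rank size at 4 nodes (illustrative)
field = np.random.rand(nx + 2, ny + 2, nlev)   # +2 for one-point halos

t_comm = t_comp = 0.0
for _ in range(100):
    t0 = MPI.Wtime()
    # exchange x-direction halos around a 1D ring of ranks
    comm.Sendrecv(field[1], dest=left, recvbuf=field[-1], source=right)
    comm.Sendrecv(field[-2], dest=right, recvbuf=field[0], source=left)
    t1 = MPI.Wtime()
    # 5-point stencil average on the interior as the "compute" part
    field[1:-1, 1:-1] = 0.25 * (field[:-2, 1:-1] + field[2:, 1:-1]
                                + field[1:-1, :-2] + field[1:-1, 2:])
    t_comm += t1 - t0
    t_comp += MPI.Wtime() - t1

if rank == 0:
    print(f"communication fraction: {t_comm / (t_comm + t_comp):.1%}")
```

A large communication fraction already at 1 or 2 nodes would support the communication-bound hypothesis.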