diff --git a/.github/release_steps.md b/.github/release_steps.md
index b08c8e892a6..4ecd05d0433 100644
--- a/.github/release_steps.md
+++ b/.github/release_steps.md
@@ -110,7 +110,7 @@ git clone -b release --depth 10 https://github.com/lammps/lammps.git lammps-rele
 cmake -S lammps-release/cmake -B build-release -G Ninja -D CMAKE_INSTALL_PREFIX=$PWD/lammps-static -D CMAKE_TOOLCHAIN_FILE=/usr/musl/share/cmake/linux-musl.cmake -C lammps-release/cmake/presets/most.cmake -C lammps-release/cmake/presets/kokkos-openmp.cmake -D DOWNLOAD_POTENTIALS=OFF -D BUILD_MPI=OFF -D BUILD_TESTING=OFF -D CMAKE_BUILD_TYPE=Release -D PKG_ATC=ON -D PKG_AWPMD=ON -D PKG_MANIFOLD=ON -D PKG_MESONT=ON -D PKG_MGPT=ON -D PKG_ML-PACE=ON -D PKG_ML-RANN=ON -D PKG_MOLFILE=ON -D PKG_PTM=ON -D PKG_QTB=ON -D PKG_SMTBQ=ON
 cmake --build build-release --target all
 cmake --build build-release --target install
-/usr/musl/bin/x86_64-linux-musl-strip lammps-static/bin/*
+/usr/musl/bin/x86_64-linux-musl-strip -g lammps-static/bin/*
 tar -czvvf ../lammps-linux-x86_64-4Feb2025.tar.gz lammps-static
 exit # fedora 41 container
 cd ..
diff --git a/.github/workflows/check-cpp23.yml b/.github/workflows/check-cpp23.yml index 15b16e71e40..9a634b951f2 100644 --- a/.github/workflows/check-cpp23.yml +++ b/.github/workflows/check-cpp23.yml @@ -55,8 +55,8 @@ jobs: uses: actions/cache@v4 with: path: ${{ env.CCACHE_DIR }} - key: linux-cpp23-ccache-${{ github.sha }} - restore-keys: linux-cpp23-ccache- + key: linux-cpp23-ccache-${{ matrix.idx }}-${{ github.sha }} + restore-keys: linux-cpp23-ccache-${{ matrix.idx }} - name: Building LAMMPS via CMake shell: bash diff --git a/bench/log.15Jul25.chain.fixed.g++.1 b/bench/log.15Jul25.chain.fixed.g++.1 new file mode 100644 index 00000000000..e7abdcb14fc --- /dev/null +++ b/bench/log.15Jul25.chain.fixed.g++.1 @@ -0,0 +1,95 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# FENE beadspring benchmark + +units lj +atom_style bond +special_bonds fene + +read_data data.chain +Reading data file ... + orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) + 1 by 1 by 1 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + scanning bonds ... + 1 = max bonds/atom + orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) + 1 by 1 by 1 MPI processor grid + reading bonds ... + 31680 bonds +Finding 1-2 1-3 1-4 neighbors ... + special bond factors lj: 0 1 1 + special bond factors coul: 0 1 1 + 2 = max # of 1-2 neighbors + 2 = max # of special neighbors + special bonds CPU = 0.002 seconds + read_data CPU = 0.098 seconds + +neighbor 0.4 bin +neigh_modify every 1 delay 1 + +bond_style fene +bond_coeff 1 30.0 1.5 1.0 1.0 + +pair_style lj/cut 1.12 +pair_modify shift yes +pair_coeff 1 1 1.0 1.0 1.12 + +fix 1 all nve +fix 2 all langevin 1.0 1.0 10.0 904297 + +thermo 100 +timestep 0.012 + +run 100 +Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule +WARNING: Communication cutoff 1.52 is shorter than a bond length based estimate of 1.855. 
This may lead to errors. (src/comm.cpp:743) +Neighbor list info ... + update: every = 1 steps, delay = 1 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.52 + ghost atom cutoff = 1.52 + binsize = 0.76, bins = 45 45 45 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/newton + stencil: half/bin/3d + bin: standard +WARNING: Communication cutoff 1.52 is shorter than a bond length based estimate of 1.855. This may lead to errors. (src/comm.cpp:743) +Per MPI rank memory allocation (min/avg/max) = 13.2 | 13.2 | 13.2 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 0.97029772 0.44484087 20.494523 22.394765 4.6721833 + 100 0.9729966 0.4361122 20.507698 22.40326 4.6548819 +Loop time of 0.531 on 1 procs for 100 steps with 32000 atoms + +Performance: 195254.376 tau/day, 188.324 timesteps/s, 6.026 Matom-step/s +99.6% CPU use with 1 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.092788 | 0.092788 | 0.092788 | 0.0 | 17.47 +Bond | 0.021754 | 0.021754 | 0.021754 | 0.0 | 4.10 +Neigh | 0.27771 | 0.27771 | 0.27771 | 0.0 | 52.30 +Comm | 0.0088421 | 0.0088421 | 0.0088421 | 0.0 | 1.67 +Output | 9.1455e-05 | 9.1455e-05 | 9.1455e-05 | 0.0 | 0.02 +Modify | 0.12461 | 0.12461 | 0.12461 | 0.0 | 23.47 +Other | | 0.005205 | | | 0.98 + +Nlocal: 32000 ave 32000 max 32000 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Nghost: 9493 ave 9493 max 9493 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Neighs: 155873 ave 155873 max 155873 min +Histogram: 1 0 0 0 0 0 0 0 0 0 + +Total # of neighbors = 155873 +Ave neighs/atom = 4.8710312 +Ave special neighs/atom = 1.98 +Neighbor list builds = 25 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.chain.fixed.g++.4 b/bench/log.15Jul25.chain.fixed.g++.4 new file mode 100644 
index 00000000000..f412a6c8830 --- /dev/null +++ b/bench/log.15Jul25.chain.fixed.g++.4 @@ -0,0 +1,95 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# FENE beadspring benchmark + +units lj +atom_style bond +special_bonds fene + +read_data data.chain +Reading data file ... + orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) + 1 by 2 by 2 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + scanning bonds ... + 1 = max bonds/atom + orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) + 1 by 2 by 2 MPI processor grid + reading bonds ... + 31680 bonds +Finding 1-2 1-3 1-4 neighbors ... + special bond factors lj: 0 1 1 + special bond factors coul: 0 1 1 + 2 = max # of 1-2 neighbors + 2 = max # of special neighbors + special bonds CPU = 0.001 seconds + read_data CPU = 0.081 seconds + +neighbor 0.4 bin +neigh_modify every 1 delay 1 + +bond_style fene +bond_coeff 1 30.0 1.5 1.0 1.0 + +pair_style lj/cut 1.12 +pair_modify shift yes +pair_coeff 1 1 1.0 1.0 1.12 + +fix 1 all nve +fix 2 all langevin 1.0 1.0 10.0 904297 + +thermo 100 +timestep 0.012 + +run 100 +Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule +WARNING: Communication cutoff 1.52 is shorter than a bond length based estimate of 1.855. This may lead to errors. (src/comm.cpp:743) +Neighbor list info ... + update: every = 1 steps, delay = 1 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.52 + ghost atom cutoff = 1.52 + binsize = 0.76, bins = 45 45 45 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/newton + stencil: half/bin/3d + bin: standard +WARNING: Communication cutoff 1.52 is shorter than a bond length based estimate of 1.855. This may lead to errors. 
(src/comm.cpp:743) +Per MPI rank memory allocation (min/avg/max) = 4.779 | 4.78 | 4.78 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 0.97029772 0.44484087 20.494523 22.394765 4.6721833 + 100 0.97145835 0.43803883 20.502691 22.397872 4.626988 +Loop time of 0.141838 on 4 procs for 100 steps with 32000 atoms + +Performance: 730973.705 tau/day, 705.029 timesteps/s, 22.561 Matom-step/s +99.6% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.023828 | 0.023917 | 0.024042 | 0.1 | 16.86 +Bond | 0.0054718 | 0.0055341 | 0.0056027 | 0.1 | 3.90 +Neigh | 0.072672 | 0.07269 | 0.072704 | 0.0 | 51.25 +Comm | 0.0057901 | 0.00599 | 0.006123 | 0.2 | 4.22 +Output | 3.2532e-05 | 4.1552e-05 | 5.0617e-05 | 0.0 | 0.03 +Modify | 0.031418 | 0.031534 | 0.031657 | 0.0 | 22.23 +Other | | 0.002133 | | | 1.50 + +Nlocal: 8000 ave 8030 max 7974 min +Histogram: 1 0 0 1 0 1 0 0 0 1 +Nghost: 4177 ave 4191 max 4160 min +Histogram: 1 0 0 0 1 0 0 1 0 1 +Neighs: 38995.8 ave 39169 max 38852 min +Histogram: 1 0 0 1 1 0 0 0 0 1 + +Total # of neighbors = 155983 +Ave neighs/atom = 4.8744688 +Ave special neighs/atom = 1.98 +Neighbor list builds = 25 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.chain.scaled.g++.4 b/bench/log.15Jul25.chain.scaled.g++.4 new file mode 100644 index 00000000000..f465efa7f87 --- /dev/null +++ b/bench/log.15Jul25.chain.scaled.g++.4 @@ -0,0 +1,95 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-784-g8c564460e6-modified) + using 1 OpenMP thread(s) per MPI task +# FENE beadspring benchmark + +units lj +atom_style bond +special_bonds fene + +read_data data.chain +Reading data file ... + orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) + 1 by 2 by 2 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + scanning bonds ... 
+ 1 = max bonds/atom + orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) + 1 by 2 by 2 MPI processor grid + reading bonds ... + 31680 bonds +Finding 1-2 1-3 1-4 neighbors ... + special bond factors lj: 0 1 1 + special bond factors coul: 0 1 1 + 2 = max # of 1-2 neighbors + 2 = max # of special neighbors + special bonds CPU = 0.001 seconds + read_data CPU = 0.085 seconds + +neighbor 0.4 bin +neigh_modify every 1 delay 1 + +bond_style fene +bond_coeff 1 30.0 1.5 1.0 1.0 + +pair_style lj/cut 1.12 +pair_modify shift yes +pair_coeff 1 1 1.0 1.0 1.12 + +fix 1 all nve +fix 2 all langevin 1.0 1.0 10.0 904297 + +thermo 100 +timestep 0.012 + +run 100 +Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule +WARNING: Communication cutoff 1.52 is shorter than a bond length based estimate of 1.855. This may lead to errors. (src/comm.cpp:743) +Neighbor list info ... + update: every = 1 steps, delay = 1 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.52 + ghost atom cutoff = 1.52 + binsize = 0.76, bins = 45 45 45 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/newton + stencil: half/bin/3d + bin: standard +WARNING: Communication cutoff 1.52 is shorter than a bond length based estimate of 1.855. This may lead to errors. 
(src/comm.cpp:743) +Per MPI rank memory allocation (min/avg/max) = 4.779 | 4.78 | 4.78 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 0.97029772 0.44484087 20.494523 22.394765 4.6721833 + 100 0.97145835 0.43803883 20.502691 22.397872 4.626988 +Loop time of 0.14297 on 4 procs for 100 steps with 32000 atoms + +Performance: 725185.191 tau/day, 699.446 timesteps/s, 22.382 Matom-step/s +99.5% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.023853 | 0.023965 | 0.02413 | 0.1 | 16.76 +Bond | 0.0054567 | 0.0055557 | 0.0056918 | 0.1 | 3.89 +Neigh | 0.073792 | 0.073802 | 0.073811 | 0.0 | 51.62 +Comm | 0.0055967 | 0.0059012 | 0.0060867 | 0.3 | 4.13 +Output | 3.2282e-05 | 3.7126e-05 | 4.2381e-05 | 0.0 | 0.03 +Modify | 0.031379 | 0.03148 | 0.031647 | 0.1 | 22.02 +Other | | 0.002229 | | | 1.56 + +Nlocal: 8000 ave 8030 max 7974 min +Histogram: 1 0 0 1 0 1 0 0 0 1 +Nghost: 4177 ave 4191 max 4160 min +Histogram: 1 0 0 0 1 0 0 1 0 1 +Neighs: 38995.8 ave 39169 max 38852 min +Histogram: 1 0 0 1 1 0 0 0 0 1 + +Total # of neighbors = 155983 +Ave neighs/atom = 4.8744688 +Ave special neighs/atom = 1.98 +Neighbor list builds = 25 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.chute.fixed.g++.1 b/bench/log.15Jul25.chute.fixed.g++.1 new file mode 100644 index 00000000000..621d4624e12 --- /dev/null +++ b/bench/log.15Jul25.chute.fixed.g++.1 @@ -0,0 +1,89 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# LAMMPS benchmark of granular flow +# chute flow of 32000 atoms with frozen base at 26 degrees + +units lj +atom_style sphere +boundary p p fs +newton off +comm_modify vel yes + +read_data data.chute +Reading data file ... + orthogonal box = (0 0 0) to (40 20 37.2886) + 1 by 1 by 1 MPI processor grid + reading atoms ... 
+ 32000 atoms + reading velocities ... + 32000 velocities + read_data CPU = 0.102 seconds + +pair_style gran/hooke/history 200000.0 NULL 50.0 NULL 0.5 0 +pair_coeff * * + +neighbor 0.1 bin +neigh_modify every 1 delay 0 + +timestep 0.0001 + +group bottom type 2 +912 atoms in group bottom +group active subtract all bottom +31088 atoms in group active +neigh_modify exclude group bottom bottom + +fix 1 all gravity 1.0 chute 26.0 +fix 2 bottom freeze +fix 3 active nve/sphere + +compute 1 all erotate/sphere +thermo_style custom step atoms ke c_1 vol +thermo_modify norm no +thermo 100 + +run 100 +Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... + update: every = 1 steps, delay = 0 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.1 + ghost atom cutoff = 1.1 + binsize = 0.55, bins = 73 37 68 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair gran/hooke/history, perpetual + attributes: half, newton off, size, history + pair build: half/size/bin/atomonly/newtoff + stencil: full/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 23.37 | 23.37 | 23.37 Mbytes + Step Atoms KinEng c_1 Volume + 0 32000 784139.13 1601.1263 29833.783 + 100 32000 784292.08 1571.0968 29834.707 +Loop time of 0.155391 on 1 procs for 100 steps with 32000 atoms + +Performance: 5560.184 tau/day, 643.540 timesteps/s, 20.593 Matom-step/s +99.6% CPU use with 1 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.099082 | 0.099082 | 0.099082 | 0.0 | 63.76 +Neigh | 0.013743 | 0.013743 | 0.013743 | 0.0 | 8.84 +Comm | 0.0042572 | 0.0042572 | 0.0042572 | 0.0 | 2.74 +Output | 0.00017358 | 0.00017358 | 0.00017358 | 0.0 | 0.11 +Modify | 0.033446 | 0.033446 | 0.033446 | 0.0 | 21.52 +Other | | 0.00469 | | | 3.02 + +Nlocal: 32000 ave 32000 max 32000 min 
+Histogram: 1 0 0 0 0 0 0 0 0 0 +Nghost: 5463 ave 5463 max 5463 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Neighs: 115133 ave 115133 max 115133 min +Histogram: 1 0 0 0 0 0 0 0 0 0 + +Total # of neighbors = 115133 +Ave neighs/atom = 3.5979062 +Neighbor list builds = 2 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.chute.fixed.g++.4 b/bench/log.15Jul25.chute.fixed.g++.4 new file mode 100644 index 00000000000..b3295c8bb17 --- /dev/null +++ b/bench/log.15Jul25.chute.fixed.g++.4 @@ -0,0 +1,89 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# LAMMPS benchmark of granular flow +# chute flow of 32000 atoms with frozen base at 26 degrees + +units lj +atom_style sphere +boundary p p fs +newton off +comm_modify vel yes + +read_data data.chute +Reading data file ... + orthogonal box = (0 0 0) to (40 20 37.2886) + 2 by 1 by 2 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + read_data CPU = 0.071 seconds + +pair_style gran/hooke/history 200000.0 NULL 50.0 NULL 0.5 0 +pair_coeff * * + +neighbor 0.1 bin +neigh_modify every 1 delay 0 + +timestep 0.0001 + +group bottom type 2 +912 atoms in group bottom +group active subtract all bottom +31088 atoms in group active +neigh_modify exclude group bottom bottom + +fix 1 all gravity 1.0 chute 26.0 +fix 2 bottom freeze +fix 3 active nve/sphere + +compute 1 all erotate/sphere +thermo_style custom step atoms ke c_1 vol +thermo_modify norm no +thermo 100 + +run 100 +Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... 
+ update: every = 1 steps, delay = 0 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.1 + ghost atom cutoff = 1.1 + binsize = 0.55, bins = 73 37 68 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair gran/hooke/history, perpetual + attributes: half, newton off, size, history + pair build: half/size/bin/atomonly/newtoff + stencil: full/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 10.59 | 10.59 | 10.6 Mbytes + Step Atoms KinEng c_1 Volume + 0 32000 784139.13 1601.1263 29833.783 + 100 32000 784292.08 1571.0968 29834.707 +Loop time of 0.0451259 on 4 procs for 100 steps with 32000 atoms + +Performance: 19146.451 tau/day, 2216.024 timesteps/s, 70.913 Matom-step/s +99.1% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.02312 | 0.023875 | 0.024935 | 0.5 | 52.91 +Neigh | 0.0047685 | 0.0048664 | 0.0049412 | 0.1 | 10.78 +Comm | 0.0039881 | 0.0041343 | 0.0042858 | 0.2 | 9.16 +Output | 5.0517e-05 | 6.8193e-05 | 8.5122e-05 | 0.0 | 0.15 +Modify | 0.0088343 | 0.0089082 | 0.0089893 | 0.1 | 19.74 +Other | | 0.003274 | | | 7.26 + +Nlocal: 8000 ave 8008 max 7992 min +Histogram: 2 0 0 0 0 0 0 0 0 2 +Nghost: 2439 ave 2450 max 2428 min +Histogram: 2 0 0 0 0 0 0 0 0 2 +Neighs: 29500.5 ave 30488 max 28513 min +Histogram: 2 0 0 0 0 0 0 0 0 2 + +Total # of neighbors = 118002 +Ave neighs/atom = 3.6875625 +Neighbor list builds = 2 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.chute.scaled.g++.4 b/bench/log.15Jul25.chute.scaled.g++.4 new file mode 100644 index 00000000000..cb3472af18e --- /dev/null +++ b/bench/log.15Jul25.chute.scaled.g++.4 @@ -0,0 +1,89 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-784-g8c564460e6-modified) + using 1 OpenMP thread(s) per MPI task +# LAMMPS benchmark of granular flow +# 
chute flow of 32000 atoms with frozen base at 26 degrees + +units lj +atom_style sphere +boundary p p fs +newton off +comm_modify vel yes + +read_data data.chute +Reading data file ... + orthogonal box = (0 0 0) to (40 20 37.2886) + 2 by 1 by 2 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + read_data CPU = 0.076 seconds + +pair_style gran/hooke/history 200000.0 NULL 50.0 NULL 0.5 0 +pair_coeff * * + +neighbor 0.1 bin +neigh_modify every 1 delay 0 + +timestep 0.0001 + +group bottom type 2 +912 atoms in group bottom +group active subtract all bottom +31088 atoms in group active +neigh_modify exclude group bottom bottom + +fix 1 all gravity 1.0 chute 26.0 +fix 2 bottom freeze +fix 3 active nve/sphere + +compute 1 all erotate/sphere +thermo_style custom step atoms ke c_1 vol +thermo_modify norm no +thermo 100 + +run 100 +Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... + update: every = 1 steps, delay = 0 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.1 + ghost atom cutoff = 1.1 + binsize = 0.55, bins = 73 37 68 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair gran/hooke/history, perpetual + attributes: half, newton off, size, history + pair build: half/size/bin/atomonly/newtoff + stencil: full/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 10.59 | 10.59 | 10.6 Mbytes + Step Atoms KinEng c_1 Volume + 0 32000 784139.13 1601.1263 29833.783 + 100 32000 784292.08 1571.0968 29834.707 +Loop time of 0.0434391 on 4 procs for 100 steps with 32000 atoms + +Performance: 19889.900 tau/day, 2302.072 timesteps/s, 73.666 Matom-step/s +99.2% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.021299 | 0.022774 | 0.024215 | 0.8 | 52.43 +Neigh | 
0.0046727 | 0.0047926 | 0.0048429 | 0.1 | 11.03 +Comm | 0.0037461 | 0.003889 | 0.0040378 | 0.2 | 8.95 +Output | 5.0456e-05 | 7.1489e-05 | 9.1084e-05 | 0.0 | 0.16 +Modify | 0.0087177 | 0.0087334 | 0.008752 | 0.0 | 20.10 +Other | | 0.003179 | | | 7.32 + +Nlocal: 8000 ave 8008 max 7992 min +Histogram: 2 0 0 0 0 0 0 0 0 2 +Nghost: 2439 ave 2450 max 2428 min +Histogram: 2 0 0 0 0 0 0 0 0 2 +Neighs: 29500.5 ave 30488 max 28513 min +Histogram: 2 0 0 0 0 0 0 0 0 2 + +Total # of neighbors = 118002 +Ave neighs/atom = 3.6875625 +Neighbor list builds = 2 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.eam.fixed.g++.1 b/bench/log.15Jul25.eam.fixed.g++.1 new file mode 100644 index 00000000000..db601ecfeb3 --- /dev/null +++ b/bench/log.15Jul25.eam.fixed.g++.1 @@ -0,0 +1,91 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# bulk Cu lattice + +variable x index 1 +variable y index 1 +variable z index 1 + +variable xx equal 20*$x +variable xx equal 20*1 +variable yy equal 20*$y +variable yy equal 20*1 +variable zz equal 20*$z +variable zz equal 20*1 + +units metal +atom_style atomic + +lattice fcc 3.615 +Lattice spacing in x,y,z = 3.615 3.615 3.615 +region box block 0 ${xx} 0 ${yy} 0 ${zz} +region box block 0 20 0 ${yy} 0 ${zz} +region box block 0 20 0 20 0 ${zz} +region box block 0 20 0 20 0 20 +create_box 1 box +Created orthogonal box = (0 0 0) to (72.3 72.3 72.3) + 1 by 1 by 1 MPI processor grid +create_atoms 1 box +Created 32000 atoms + using lattice units in orthogonal box = (0 0 0) to (72.3 72.3 72.3) + create_atoms CPU = 0.004 seconds + +pair_style eam +pair_coeff 1 1 Cu_u3.eam +Reading eam potential file Cu_u3.eam with DATE: 2007-06-11 + +velocity all create 1600.0 376847 loop geom + +neighbor 1.0 bin +neigh_modify every 1 delay 5 check yes + +fix 1 all nve + +timestep 0.005 +thermo 50 + +run 100 +Neighbor list info ... 
+ update: every = 1 steps, delay = 5 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 5.95 + ghost atom cutoff = 5.95 + binsize = 2.975, bins = 25 25 25 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair eam, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 16.83 | 16.83 | 16.83 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 1600 -113280 0 -106662.09 18703.573 + 50 781.69049 -109873.35 0 -106640.13 52273.088 + 100 801.832 -109957.3 0 -106640.77 51322.821 +Loop time of 2.34899 on 1 procs for 100 steps with 32000 atoms + +Performance: 18.391 ns/day, 1.305 hours/ns, 42.572 timesteps/s, 1.362 Matom-step/s +99.6% CPU use with 1 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 2.0045 | 2.0045 | 2.0045 | 0.0 | 85.33 +Neigh | 0.3191 | 0.3191 | 0.3191 | 0.0 | 13.58 +Comm | 0.0084135 | 0.0084135 | 0.0084135 | 0.0 | 0.36 +Output | 0.00019136 | 0.00019136 | 0.00019136 | 0.0 | 0.01 +Modify | 0.012899 | 0.012899 | 0.012899 | 0.0 | 0.55 +Other | | 0.003925 | | | 0.17 + +Nlocal: 32000 ave 32000 max 32000 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Nghost: 19909 ave 19909 max 19909 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Neighs: 1.20778e+06 ave 1.20778e+06 max 1.20778e+06 min +Histogram: 1 0 0 0 0 0 0 0 0 0 + +Total # of neighbors = 1207784 +Ave neighs/atom = 37.74325 +Neighbor list builds = 13 +Dangerous builds = 0 +Total wall time: 0:00:02 diff --git a/bench/log.15Jul25.eam.fixed.g++.4 b/bench/log.15Jul25.eam.fixed.g++.4 new file mode 100644 index 00000000000..c408513747c --- /dev/null +++ b/bench/log.15Jul25.eam.fixed.g++.4 @@ -0,0 +1,91 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# bulk Cu lattice + 
+variable x index 1 +variable y index 1 +variable z index 1 + +variable xx equal 20*$x +variable xx equal 20*1 +variable yy equal 20*$y +variable yy equal 20*1 +variable zz equal 20*$z +variable zz equal 20*1 + +units metal +atom_style atomic + +lattice fcc 3.615 +Lattice spacing in x,y,z = 3.615 3.615 3.615 +region box block 0 ${xx} 0 ${yy} 0 ${zz} +region box block 0 20 0 ${yy} 0 ${zz} +region box block 0 20 0 20 0 ${zz} +region box block 0 20 0 20 0 20 +create_box 1 box +Created orthogonal box = (0 0 0) to (72.3 72.3 72.3) + 1 by 2 by 2 MPI processor grid +create_atoms 1 box +Created 32000 atoms + using lattice units in orthogonal box = (0 0 0) to (72.3 72.3 72.3) + create_atoms CPU = 0.001 seconds + +pair_style eam +pair_coeff 1 1 Cu_u3.eam +Reading eam potential file Cu_u3.eam with DATE: 2007-06-11 + +velocity all create 1600.0 376847 loop geom + +neighbor 1.0 bin +neigh_modify every 1 delay 5 check yes + +fix 1 all nve + +timestep 0.005 +thermo 50 + +run 100 +Neighbor list info ... 
+ update: every = 1 steps, delay = 5 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 5.95 + ghost atom cutoff = 5.95 + binsize = 2.975, bins = 25 25 25 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair eam, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 7.382 | 7.382 | 7.382 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 1600 -113280 0 -106662.09 18703.573 + 50 781.69049 -109873.35 0 -106640.13 52273.088 + 100 801.832 -109957.3 0 -106640.77 51322.821 +Loop time of 0.632615 on 4 procs for 100 steps with 32000 atoms + +Performance: 68.288 ns/day, 0.351 hours/ns, 158.074 timesteps/s, 5.058 Matom-step/s +99.4% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.53032 | 0.53166 | 0.5323 | 0.1 | 84.04 +Neigh | 0.083688 | 0.083977 | 0.084342 | 0.1 | 13.27 +Comm | 0.010172 | 0.010735 | 0.011792 | 0.6 | 1.70 +Output | 6.7649e-05 | 7.0092e-05 | 7.305e-05 | 0.0 | 0.01 +Modify | 0.0040301 | 0.0041093 | 0.0042195 | 0.1 | 0.65 +Other | | 0.002059 | | | 0.33 + +Nlocal: 8000 ave 8008 max 7993 min +Histogram: 2 0 0 0 0 0 0 0 1 1 +Nghost: 9130.25 ave 9138 max 9122 min +Histogram: 2 0 0 0 0 0 0 0 0 2 +Neighs: 301946 ave 302392 max 301360 min +Histogram: 1 0 0 0 1 0 0 0 1 1 + +Total # of neighbors = 1207784 +Ave neighs/atom = 37.74325 +Neighbor list builds = 13 +Dangerous builds = 0 +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.eam.scaled.g++.4 b/bench/log.15Jul25.eam.scaled.g++.4 new file mode 100644 index 00000000000..f0862e055de --- /dev/null +++ b/bench/log.15Jul25.eam.scaled.g++.4 @@ -0,0 +1,91 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-784-g8c564460e6-modified) + using 1 OpenMP thread(s) per MPI task +# bulk Cu lattice 
+ +variable x index 1 +variable y index 1 +variable z index 1 + +variable xx equal 20*$x +variable xx equal 20*2 +variable yy equal 20*$y +variable yy equal 20*2 +variable zz equal 20*$z +variable zz equal 20*1 + +units metal +atom_style atomic + +lattice fcc 3.615 +Lattice spacing in x,y,z = 3.615 3.615 3.615 +region box block 0 ${xx} 0 ${yy} 0 ${zz} +region box block 0 40 0 ${yy} 0 ${zz} +region box block 0 40 0 40 0 ${zz} +region box block 0 40 0 40 0 20 +create_box 1 box +Created orthogonal box = (0 0 0) to (144.6 144.6 72.3) + 2 by 2 by 1 MPI processor grid +create_atoms 1 box +Created 128000 atoms + using lattice units in orthogonal box = (0 0 0) to (144.6 144.6 72.3) + create_atoms CPU = 0.004 seconds + +pair_style eam +pair_coeff 1 1 Cu_u3.eam +Reading eam potential file Cu_u3.eam with DATE: 2007-06-11 + +velocity all create 1600.0 376847 loop geom + +neighbor 1.0 bin +neigh_modify every 1 delay 5 check yes + +fix 1 all nve + +timestep 0.005 +thermo 50 + +run 100 +Neighbor list info ... 
+ update: every = 1 steps, delay = 5 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 5.95 + ghost atom cutoff = 5.95 + binsize = 2.975, bins = 49 49 25 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair eam, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 17.13 | 17.13 | 17.13 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 1600 -453120 0 -426647.73 18704.012 + 50 779.50001 -439457.02 0 -426560.06 52355.276 + 100 797.97828 -439764.76 0 -426562.07 51474.74 +Loop time of 2.6471 on 4 procs for 100 steps with 128000 atoms + +Performance: 16.320 ns/day, 1.471 hours/ns, 37.777 timesteps/s, 4.835 Matom-step/s +99.5% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 2.1797 | 2.1945 | 2.2029 | 0.6 | 82.90 +Neigh | 0.36186 | 0.36355 | 0.36453 | 0.2 | 13.73 +Comm | 0.038468 | 0.049112 | 0.066931 | 5.0 | 1.86 +Output | 0.00019847 | 0.00021211 | 0.00023473 | 0.0 | 0.01 +Modify | 0.027395 | 0.028328 | 0.029054 | 0.4 | 1.07 +Other | | 0.01136 | | | 0.43 + +Nlocal: 32000 ave 32092 max 31914 min +Histogram: 1 0 0 1 0 1 0 0 0 1 +Nghost: 19910 ave 19997 max 19818 min +Histogram: 1 0 0 0 1 0 1 0 0 1 +Neighs: 1.20728e+06 ave 1.21142e+06 max 1.2036e+06 min +Histogram: 1 0 0 1 1 0 0 0 0 1 + +Total # of neighbors = 4829126 +Ave neighs/atom = 37.727547 +Neighbor list builds = 14 +Dangerous builds = 0 +Total wall time: 0:00:02 diff --git a/bench/log.15Jul25.lj.fixed.g++.1 b/bench/log.15Jul25.lj.fixed.g++.1 new file mode 100644 index 00000000000..31581533aa4 --- /dev/null +++ b/bench/log.15Jul25.lj.fixed.g++.1 @@ -0,0 +1,88 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# 3d Lennard-Jones melt 
+ +variable x index 1 +variable y index 1 +variable z index 1 + +variable xx equal 20*$x +variable xx equal 20*1 +variable yy equal 20*$y +variable yy equal 20*1 +variable zz equal 20*$z +variable zz equal 20*1 + +units lj +atom_style atomic + +lattice fcc 0.8442 +Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962 +region box block 0 ${xx} 0 ${yy} 0 ${zz} +region box block 0 20 0 ${yy} 0 ${zz} +region box block 0 20 0 20 0 ${zz} +region box block 0 20 0 20 0 20 +create_box 1 box +Created orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924) + 1 by 1 by 1 MPI processor grid +create_atoms 1 box +Created 32000 atoms + using lattice units in orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924) + create_atoms CPU = 0.004 seconds +mass 1 1.0 + +velocity all create 1.44 87287 loop geom + +pair_style lj/cut 2.5 +pair_coeff 1 1 1.0 1.0 2.5 + +neighbor 0.3 bin +neigh_modify delay 0 every 20 check no + +fix 1 all nve + +run 100 +Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... 
+ update: every = 20 steps, delay = 0 steps, check = no + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 2.8 + ghost atom cutoff = 2.8 + binsize = 1.4, bins = 24 24 24 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 13.82 | 13.82 | 13.82 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 1.44 -6.7733681 0 -4.6134356 -5.0197073 + 100 0.7574531 -5.7585055 0 -4.6223613 0.20726105 +Loop time of 0.852877 on 1 procs for 100 steps with 32000 atoms + +Performance: 50652.077 tau/day, 117.250 timesteps/s, 3.752 Matom-step/s +99.6% CPU use with 1 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.70475 | 0.70475 | 0.70475 | 0.0 | 82.63 +Neigh | 0.12731 | 0.12731 | 0.12731 | 0.0 | 14.93 +Comm | 0.0062962 | 0.0062962 | 0.0062962 | 0.0 | 0.74 +Output | 9.7908e-05 | 9.7908e-05 | 9.7908e-05 | 0.0 | 0.01 +Modify | 0.012837 | 0.012837 | 0.012837 | 0.0 | 1.51 +Other | | 0.001579 | | | 0.19 + +Nlocal: 32000 ave 32000 max 32000 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Nghost: 19657 ave 19657 max 19657 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Neighs: 1.20283e+06 ave 1.20283e+06 max 1.20283e+06 min +Histogram: 1 0 0 0 0 0 0 0 0 0 + +Total # of neighbors = 1202833 +Ave neighs/atom = 37.588531 +Neighbor list builds = 5 +Dangerous builds not checked +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.lj.fixed.g++.4 b/bench/log.15Jul25.lj.fixed.g++.4 new file mode 100644 index 00000000000..9bf03cb43ae --- /dev/null +++ b/bench/log.15Jul25.lj.fixed.g++.4 @@ -0,0 +1,88 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# 3d Lennard-Jones melt + +variable x index 1 +variable y 
index 1 +variable z index 1 + +variable xx equal 20*$x +variable xx equal 20*1 +variable yy equal 20*$y +variable yy equal 20*1 +variable zz equal 20*$z +variable zz equal 20*1 + +units lj +atom_style atomic + +lattice fcc 0.8442 +Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962 +region box block 0 ${xx} 0 ${yy} 0 ${zz} +region box block 0 20 0 ${yy} 0 ${zz} +region box block 0 20 0 20 0 ${zz} +region box block 0 20 0 20 0 20 +create_box 1 box +Created orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924) + 1 by 2 by 2 MPI processor grid +create_atoms 1 box +Created 32000 atoms + using lattice units in orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924) + create_atoms CPU = 0.001 seconds +mass 1 1.0 + +velocity all create 1.44 87287 loop geom + +pair_style lj/cut 2.5 +pair_coeff 1 1 1.0 1.0 2.5 + +neighbor 0.3 bin +neigh_modify delay 0 every 20 check no + +fix 1 all nve + +run 100 +Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... 
+ update: every = 20 steps, delay = 0 steps, check = no + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 2.8 + ghost atom cutoff = 2.8 + binsize = 1.4, bins = 24 24 24 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 5.881 | 5.881 | 5.881 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 1.44 -6.7733681 0 -4.6134356 -5.0197073 + 100 0.7574531 -5.7585055 0 -4.6223613 0.20726105 +Loop time of 0.230839 on 4 procs for 100 steps with 32000 atoms + +Performance: 187143.043 tau/day, 433.201 timesteps/s, 13.862 Matom-step/s +99.6% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.18211 | 0.18376 | 0.18584 | 0.3 | 79.60 +Neigh | 0.033629 | 0.033787 | 0.03418 | 0.1 | 14.64 +Comm | 0.007259 | 0.0092728 | 0.01113 | 1.5 | 4.02 +Output | 3.5578e-05 | 3.778e-05 | 4.1359e-05 | 0.0 | 0.02 +Modify | 0.0033806 | 0.0034297 | 0.0034832 | 0.1 | 1.49 +Other | | 0.0005558 | | | 0.24 + +Nlocal: 8000 ave 8037 max 7964 min +Histogram: 2 0 0 0 0 0 0 0 1 1 +Nghost: 9007.5 ave 9050 max 8968 min +Histogram: 1 1 0 0 0 0 0 1 0 1 +Neighs: 300708 ave 305113 max 297203 min +Histogram: 1 0 0 1 1 0 0 0 0 1 + +Total # of neighbors = 1202833 +Ave neighs/atom = 37.588531 +Neighbor list builds = 5 +Dangerous builds not checked +Total wall time: 0:00:00 diff --git a/bench/log.15Jul25.lj.scaled.g++.4 b/bench/log.15Jul25.lj.scaled.g++.4 new file mode 100644 index 00000000000..3354c1882c7 --- /dev/null +++ b/bench/log.15Jul25.lj.scaled.g++.4 @@ -0,0 +1,88 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-784-g8c564460e6-modified) + using 1 OpenMP thread(s) per MPI task +# 3d Lennard-Jones melt + +variable x index 1 +variable y index 1 
+variable z index 1 + +variable xx equal 20*$x +variable xx equal 20*2 +variable yy equal 20*$y +variable yy equal 20*2 +variable zz equal 20*$z +variable zz equal 20*1 + +units lj +atom_style atomic + +lattice fcc 0.8442 +Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962 +region box block 0 ${xx} 0 ${yy} 0 ${zz} +region box block 0 40 0 ${yy} 0 ${zz} +region box block 0 40 0 40 0 ${zz} +region box block 0 40 0 40 0 20 +create_box 1 box +Created orthogonal box = (0 0 0) to (67.183848 67.183848 33.591924) + 2 by 2 by 1 MPI processor grid +create_atoms 1 box +Created 128000 atoms + using lattice units in orthogonal box = (0 0 0) to (67.183848 67.183848 33.591924) + create_atoms CPU = 0.004 seconds +mass 1 1.0 + +velocity all create 1.44 87287 loop geom + +pair_style lj/cut 2.5 +pair_coeff 1 1 1.0 1.0 2.5 + +neighbor 0.3 bin +neigh_modify delay 0 every 20 check no + +fix 1 all nve + +run 100 +Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... + update: every = 20 steps, delay = 0 steps, check = no + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 2.8 + ghost atom cutoff = 2.8 + binsize = 1.4, bins = 48 48 24 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 14.12 | 14.12 | 14.12 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 1.44 -6.7733681 0 -4.6133849 -5.0196788 + 100 0.75841891 -5.759957 0 -4.6223375 0.20008866 +Loop time of 0.961718 on 4 procs for 100 steps with 128000 atoms + +Performance: 44919.610 tau/day, 103.981 timesteps/s, 13.310 Matom-step/s +99.5% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.76816 | 0.7701 | 0.77277 | 0.2 | 80.08 
+Neigh | 0.13077 | 0.13132 | 0.13191 | 0.1 | 13.65 +Comm | 0.026232 | 0.02948 | 0.03169 | 1.2 | 3.07 +Output | 0.00011422 | 0.00012437 | 0.0001401 | 0.0 | 0.01 +Modify | 0.026175 | 0.026553 | 0.026857 | 0.2 | 2.76 +Other | | 0.004141 | | | 0.43 + +Nlocal: 32000 ave 32060 max 31939 min +Histogram: 1 0 1 0 0 0 0 1 0 1 +Nghost: 19630.8 ave 19681 max 19562 min +Histogram: 1 0 0 0 1 0 0 0 1 1 +Neighs: 1.20195e+06 ave 1.20354e+06 max 1.19931e+06 min +Histogram: 1 0 0 0 0 0 0 2 0 1 + +Total # of neighbors = 4807797 +Ave neighs/atom = 37.560914 +Neighbor list builds = 5 +Dangerous builds not checked +Total wall time: 0:00:01 diff --git a/bench/log.15Jul25.rhodo.fixed.g++.1 b/bench/log.15Jul25.rhodo.fixed.g++.1 new file mode 100644 index 00000000000..e3e7b29e37c --- /dev/null +++ b/bench/log.15Jul25.rhodo.fixed.g++.1 @@ -0,0 +1,139 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# Rhodopsin model + +units real +neigh_modify delay 5 every 1 + +atom_style full +bond_style harmonic +angle_style charmm +dihedral_style charmm +improper_style harmonic +pair_style lj/charmm/coul/long 8.0 10.0 +pair_modify mix arithmetic +kspace_style pppm 1e-4 + +read_data data.rhodo +Reading data file ... + orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) + 1 by 1 by 1 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + scanning bonds ... + 4 = max bonds/atom + scanning angles ... + 8 = max angles/atom + scanning dihedrals ... + 18 = max dihedrals/atom + scanning impropers ... + 2 = max impropers/atom + orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) + 1 by 1 by 1 MPI processor grid + reading bonds ... + 27723 bonds + reading angles ... + 40467 angles + reading dihedrals ... + 56829 dihedrals + reading impropers ... + 1034 impropers +Finding 1-2 1-3 1-4 neighbors ... 
+ special bond factors lj: 0 0 0 + special bond factors coul: 0 0 0 + 4 = max # of 1-2 neighbors + 12 = max # of 1-3 neighbors + 24 = max # of 1-4 neighbors + 26 = max # of special neighbors + special bonds CPU = 0.009 seconds + read_data CPU = 0.236 seconds + +fix 1 all shake 0.0001 5 0 m 1.0 a 232 +Finding SHAKE clusters ... + 1617 = # of size 2 clusters + 3633 = # of size 3 clusters + 747 = # of size 4 clusters + 4233 = # of frozen angles + find clusters CPU = 0.006 seconds +fix 2 all npt temp 300.0 300.0 100.0 z 0.0 0.0 1000.0 mtk no pchain 0 tchain 1 + +special_bonds charmm + +thermo 50 +thermo_style multi +timestep 2.0 + +run 100 +PPPM initialization ... + using 12-bit tables for long-range coulomb + G vector (1/distance) = 0.24883488 + grid = 25 32 32 + stencil order = 5 + estimated absolute RMS force accuracy = 0.035547797 + estimated relative force accuracy = 0.00010705113 + using double precision FFTW3 + 3d grid and FFT values/proc = 41070 25600 +Generated 2278 of 2278 mixed pair_coeff terms from arithmetic mixing rule +Neighbor list info ... 
+ update: every = 1 steps, delay = 5 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 12 + ghost atom cutoff = 12 + binsize = 6, bins = 10 13 13 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/charmm/coul/long, perpetual + attributes: half, newton on + pair build: half/bin/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 140 | 140 | 140 Mbytes +------------ Step 0 ----- CPU = 0 (sec) ------------- +TotEng = -25356.2057 KinEng = 21444.8313 Temp = 299.0397 +PotEng = -46801.0370 E_bond = 2537.9940 E_angle = 10921.3742 +E_dihed = 5211.7865 E_impro = 213.5116 E_vdwl = -2307.8634 +E_coul = 207025.8934 E_long = -270403.7333 Press = -149.3300 +Volume = 307995.0335 +------------ Step 50 ----- CPU = 6.946771 (sec) ------------- +TotEng = -25330.0307 KinEng = 21501.0009 Temp = 299.8229 +PotEng = -46831.0316 E_bond = 2471.7035 E_angle = 10836.5102 +E_dihed = 5239.6319 E_impro = 227.1218 E_vdwl = -1993.2873 +E_coul = 206797.6807 E_long = -270410.3925 Press = 237.6572 +Volume = 308031.6762 +------------ Step 100 ----- CPU = 14.20402 (sec) ------------- +TotEng = -25290.7364 KinEng = 21591.9089 Temp = 301.0906 +PotEng = -46882.6454 E_bond = 2567.9807 E_angle = 10781.9571 +E_dihed = 5198.7492 E_impro = 216.7864 E_vdwl = -1902.6616 +E_coul = 206659.5159 E_long = -270404.9730 Press = 6.7352 +Volume = 308134.2286 +Loop time of 14.2041 on 1 procs for 100 steps with 32000 atoms + +Performance: 1.217 ns/day, 19.728 hours/ns, 7.040 timesteps/s, 225.288 katom-step/s +99.6% CPU use with 1 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 11.001 | 11.001 | 11.001 | 0.0 | 77.45 +Bond | 0.38231 | 0.38231 | 0.38231 | 0.0 | 2.69 +Kspace | 0.6111 | 0.6111 | 0.6111 | 0.0 | 4.30 +Neigh | 1.891 | 1.891 | 1.891 | 0.0 | 13.31 +Comm | 0.021749 | 
0.021749 | 0.021749 | 0.0 | 0.15 +Output | 0.00021602 | 0.00021602 | 0.00021602 | 0.0 | 0.00 +Modify | 0.29015 | 0.29015 | 0.29015 | 0.0 | 2.04 +Other | | 0.006949 | | | 0.05 + +Nlocal: 32000 ave 32000 max 32000 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Nghost: 47958 ave 47958 max 47958 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Neighs: 1.20281e+07 ave 1.20281e+07 max 1.20281e+07 min +Histogram: 1 0 0 0 0 0 0 0 0 0 + +Total # of neighbors = 12028093 +Ave neighs/atom = 375.87791 +Ave special neighs/atom = 7.431875 +Neighbor list builds = 11 +Dangerous builds = 0 +Total wall time: 0:00:14 diff --git a/bench/log.15Jul25.rhodo.fixed.g++.4 b/bench/log.15Jul25.rhodo.fixed.g++.4 new file mode 100644 index 00000000000..4defa96dbe5 --- /dev/null +++ b/bench/log.15Jul25.rhodo.fixed.g++.4 @@ -0,0 +1,139 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-808-g67067cbc80) + using 1 OpenMP thread(s) per MPI task +# Rhodopsin model + +units real +neigh_modify delay 5 every 1 + +atom_style full +bond_style harmonic +angle_style charmm +dihedral_style charmm +improper_style harmonic +pair_style lj/charmm/coul/long 8.0 10.0 +pair_modify mix arithmetic +kspace_style pppm 1e-4 + +read_data data.rhodo +Reading data file ... + orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) + 1 by 2 by 2 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + scanning bonds ... + 4 = max bonds/atom + scanning angles ... + 8 = max angles/atom + scanning dihedrals ... + 18 = max dihedrals/atom + scanning impropers ... + 2 = max impropers/atom + orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) + 1 by 2 by 2 MPI processor grid + reading bonds ... + 27723 bonds + reading angles ... + 40467 angles + reading dihedrals ... + 56829 dihedrals + reading impropers ... + 1034 impropers +Finding 1-2 1-3 1-4 neighbors ... 
+ special bond factors lj: 0 0 0 + special bond factors coul: 0 0 0 + 4 = max # of 1-2 neighbors + 12 = max # of 1-3 neighbors + 24 = max # of 1-4 neighbors + 26 = max # of special neighbors + special bonds CPU = 0.004 seconds + read_data CPU = 0.218 seconds + +fix 1 all shake 0.0001 5 0 m 1.0 a 232 +Finding SHAKE clusters ... + 1617 = # of size 2 clusters + 3633 = # of size 3 clusters + 747 = # of size 4 clusters + 4233 = # of frozen angles + find clusters CPU = 0.002 seconds +fix 2 all npt temp 300.0 300.0 100.0 z 0.0 0.0 1000.0 mtk no pchain 0 tchain 1 + +special_bonds charmm + +thermo 50 +thermo_style multi +timestep 2.0 + +run 100 +PPPM initialization ... + using 12-bit tables for long-range coulomb + G vector (1/distance) = 0.24883488 + grid = 25 32 32 + stencil order = 5 + estimated absolute RMS force accuracy = 0.035547797 + estimated relative force accuracy = 0.00010705113 + using double precision FFTW3 + 3d grid and FFT values/proc = 13230 6400 +Generated 2278 of 2278 mixed pair_coeff terms from arithmetic mixing rule +Neighbor list info ... 
+ update: every = 1 steps, delay = 5 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 12 + ghost atom cutoff = 12 + binsize = 6, bins = 10 13 13 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/charmm/coul/long, perpetual + attributes: half, newton on + pair build: half/bin/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 49.25 | 49.35 | 49.64 Mbytes +------------ Step 0 ----- CPU = 0 (sec) ------------- +TotEng = -25356.2057 KinEng = 21444.8313 Temp = 299.0397 +PotEng = -46801.0370 E_bond = 2537.9940 E_angle = 10921.3742 +E_dihed = 5211.7865 E_impro = 213.5116 E_vdwl = -2307.8634 +E_coul = 207025.8934 E_long = -270403.7333 Press = -149.3300 +Volume = 307995.0335 +------------ Step 50 ----- CPU = 1.894152 (sec) ------------- +TotEng = -25330.0307 KinEng = 21501.0009 Temp = 299.8229 +PotEng = -46831.0316 E_bond = 2471.7035 E_angle = 10836.5102 +E_dihed = 5239.6319 E_impro = 227.1218 E_vdwl = -1993.2873 +E_coul = 206797.6807 E_long = -270410.3925 Press = 237.6572 +Volume = 308031.6762 +------------ Step 100 ----- CPU = 3.886163 (sec) ------------- +TotEng = -25290.7364 KinEng = 21591.9089 Temp = 301.0906 +PotEng = -46882.6453 E_bond = 2567.9807 E_angle = 10781.9571 +E_dihed = 5198.7492 E_impro = 216.7864 E_vdwl = -1902.6616 +E_coul = 206659.5159 E_long = -270404.9730 Press = 6.7352 +Volume = 308134.2286 +Loop time of 3.8862 on 4 procs for 100 steps with 32000 atoms + +Performance: 4.447 ns/day, 5.397 hours/ns, 25.732 timesteps/s, 823.427 katom-step/s +99.3% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 2.8294 | 2.8755 | 2.9708 | 3.3 | 73.99 +Bond | 0.092202 | 0.096409 | 0.10326 | 1.3 | 2.48 +Kspace | 0.18009 | 0.27454 | 0.32256 | 10.6 | 7.06 +Neigh | 0.5049 | 0.50495 | 0.50501 | 0.0 | 12.99 +Comm | 
0.029701 | 0.029904 | 0.030144 | 0.1 | 0.77 +Output | 0.00010524 | 0.00010923 | 0.00012092 | 0.0 | 0.00 +Modify | 0.098097 | 0.09846 | 0.098727 | 0.1 | 2.53 +Other | | 0.006314 | | | 0.16 + +Nlocal: 8000 ave 8143 max 7933 min +Histogram: 1 2 0 0 0 0 0 0 0 1 +Nghost: 22733.5 ave 22769 max 22693 min +Histogram: 1 0 0 0 0 2 0 0 0 1 +Neighs: 3.00702e+06 ave 3.0975e+06 max 2.96492e+06 min +Histogram: 1 2 0 0 0 0 0 0 0 1 + +Total # of neighbors = 12028093 +Ave neighs/atom = 375.87791 +Ave special neighs/atom = 7.431875 +Neighbor list builds = 11 +Dangerous builds = 0 +Total wall time: 0:00:04 diff --git a/bench/log.15Jul25.rhodo.scaled.g++.4 b/bench/log.15Jul25.rhodo.scaled.g++.4 new file mode 100644 index 00000000000..37fbce6468f --- /dev/null +++ b/bench/log.15Jul25.rhodo.scaled.g++.4 @@ -0,0 +1,139 @@ +LAMMPS (12 Jun 2025 - Development - patch_12Jun2025-784-g8c564460e6-modified) + using 1 OpenMP thread(s) per MPI task +# Rhodopsin model + +units real +neigh_modify delay 5 every 1 + +atom_style full +bond_style harmonic +angle_style charmm +dihedral_style charmm +improper_style harmonic +pair_style lj/charmm/coul/long 8.0 10.0 +pair_modify mix arithmetic +kspace_style pppm 1e-4 + +read_data data.rhodo +Reading data file ... + orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) + 1 by 2 by 2 MPI processor grid + reading atoms ... + 32000 atoms + reading velocities ... + 32000 velocities + scanning bonds ... + 4 = max bonds/atom + scanning angles ... + 8 = max angles/atom + scanning dihedrals ... + 18 = max dihedrals/atom + scanning impropers ... + 2 = max impropers/atom + orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) + 1 by 2 by 2 MPI processor grid + reading bonds ... + 27723 bonds + reading angles ... + 40467 angles + reading dihedrals ... + 56829 dihedrals + reading impropers ... + 1034 impropers +Finding 1-2 1-3 1-4 neighbors ... 
+ special bond factors lj: 0 0 0 + special bond factors coul: 0 0 0 + 4 = max # of 1-2 neighbors + 12 = max # of 1-3 neighbors + 24 = max # of 1-4 neighbors + 26 = max # of special neighbors + special bonds CPU = 0.003 seconds + read_data CPU = 0.221 seconds + +fix 1 all shake 0.0001 5 0 m 1.0 a 232 +Finding SHAKE clusters ... + 1617 = # of size 2 clusters + 3633 = # of size 3 clusters + 747 = # of size 4 clusters + 4233 = # of frozen angles + find clusters CPU = 0.002 seconds +fix 2 all npt temp 300.0 300.0 100.0 z 0.0 0.0 1000.0 mtk no pchain 0 tchain 1 + +special_bonds charmm + +thermo 50 +thermo_style multi +timestep 2.0 + +run 100 +PPPM initialization ... + using 12-bit tables for long-range coulomb + G vector (1/distance) = 0.24883488 + grid = 25 32 32 + stencil order = 5 + estimated absolute RMS force accuracy = 0.035547797 + estimated relative force accuracy = 0.00010705113 + using double precision FFTW3 + 3d grid and FFT values/proc = 13230 6400 +Generated 2278 of 2278 mixed pair_coeff terms from arithmetic mixing rule +Neighbor list info ... 
+ update: every = 1 steps, delay = 5 steps, check = yes + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 12 + ghost atom cutoff = 12 + binsize = 6, bins = 10 13 13 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/charmm/coul/long, perpetual + attributes: half, newton on + pair build: half/bin/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 49.25 | 49.35 | 49.64 Mbytes +------------ Step 0 ----- CPU = 0 (sec) ------------- +TotEng = -25356.2057 KinEng = 21444.8313 Temp = 299.0397 +PotEng = -46801.0370 E_bond = 2537.9940 E_angle = 10921.3742 +E_dihed = 5211.7865 E_impro = 213.5116 E_vdwl = -2307.8634 +E_coul = 207025.8934 E_long = -270403.7333 Press = -149.3300 +Volume = 307995.0335 +------------ Step 50 ----- CPU = 1.889623 (sec) ------------- +TotEng = -25330.0307 KinEng = 21501.0009 Temp = 299.8229 +PotEng = -46831.0316 E_bond = 2471.7035 E_angle = 10836.5102 +E_dihed = 5239.6319 E_impro = 227.1218 E_vdwl = -1993.2873 +E_coul = 206797.6807 E_long = -270410.3925 Press = 237.6572 +Volume = 308031.6762 +------------ Step 100 ----- CPU = 3.870869 (sec) ------------- +TotEng = -25290.7364 KinEng = 21591.9089 Temp = 301.0906 +PotEng = -46882.6453 E_bond = 2567.9807 E_angle = 10781.9571 +E_dihed = 5198.7492 E_impro = 216.7864 E_vdwl = -1902.6616 +E_coul = 206659.5159 E_long = -270404.9730 Press = 6.7352 +Volume = 308134.2286 +Loop time of 3.8709 on 4 procs for 100 steps with 32000 atoms + +Performance: 4.464 ns/day, 5.376 hours/ns, 25.834 timesteps/s, 826.680 katom-step/s +99.3% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 2.8153 | 2.8576 | 2.9543 | 3.4 | 73.82 +Bond | 0.092732 | 0.096503 | 0.10317 | 1.3 | 2.49 +Kspace | 0.17705 | 0.27418 | 0.31978 | 10.9 | 7.08 +Neigh | 0.50993 | 0.51001 | 0.51006 | 0.0 | 13.18 +Comm | 
0.028631 | 0.028776 | 0.028899 | 0.1 | 0.74 +Output | 9.7056e-05 | 0.00010098 | 0.00011123 | 0.0 | 0.00 +Modify | 0.09746 | 0.097676 | 0.098001 | 0.1 | 2.52 +Other | | 0.006022 | | | 0.16 + +Nlocal: 8000 ave 8143 max 7933 min +Histogram: 1 2 0 0 0 0 0 0 0 1 +Nghost: 22733.5 ave 22769 max 22693 min +Histogram: 1 0 0 0 0 2 0 0 0 1 +Neighs: 3.00702e+06 ave 3.0975e+06 max 2.96492e+06 min +Histogram: 1 2 0 0 0 0 0 0 0 1 + +Total # of neighbors = 12028093 +Ave neighs/atom = 375.87791 +Ave special neighs/atom = 7.431875 +Neighbor list builds = 11 +Dangerous builds = 0 +Total wall time: 0:00:04 diff --git a/bench/log.6Oct16.chain.fixed.icc.1 b/bench/log.6Oct16.chain.fixed.icc.1 deleted file mode 100644 index d1279b8ca1a..00000000000 --- a/bench/log.6Oct16.chain.fixed.icc.1 +++ /dev/null @@ -1,78 +0,0 @@ -LAMMPS (6 Oct 2016) -# FENE beadspring benchmark - -units lj -atom_style bond -special_bonds fene - -read_data data.chain - orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) - 1 by 1 by 1 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... - 32000 velocities - scanning bonds ... - 1 = max bonds/atom - reading bonds ... - 31680 bonds - 2 = max # of 1-2 neighbors - 2 = max # of special neighbors - -neighbor 0.4 bin -neigh_modify every 1 delay 1 - -bond_style fene -bond_coeff 1 30.0 1.5 1.0 1.0 - -pair_style lj/cut 1.12 -pair_modify shift yes -pair_coeff 1 1 1.0 1.0 1.12 - -fix 1 all nve -fix 2 all langevin 1.0 1.0 10.0 904297 - -thermo 100 -timestep 0.012 - -run 100 -Neighbor list info ... 
- 1 neighbor list requests - update every 1 steps, delay 1 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 1.52 - ghost atom cutoff = 1.52 - binsize = 0.76 -> bins = 45 45 45 -Memory usage per processor = 12.0423 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 0.97029772 0.44484087 20.494523 22.394765 4.6721833 - 100 0.9729966 0.4361122 20.507698 22.40326 4.6548819 -Loop time of 0.977647 on 1 procs for 100 steps with 32000 atoms - -Performance: 106050.541 tau/day, 102.286 timesteps/s -99.9% CPU use with 1 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.19421 | 0.19421 | 0.19421 | 0.0 | 19.86 -Bond | 0.08741 | 0.08741 | 0.08741 | 0.0 | 8.94 -Neigh | 0.45791 | 0.45791 | 0.45791 | 0.0 | 46.84 -Comm | 0.032649 | 0.032649 | 0.032649 | 0.0 | 3.34 -Output | 0.00012207 | 0.00012207 | 0.00012207 | 0.0 | 0.01 -Modify | 0.18071 | 0.18071 | 0.18071 | 0.0 | 18.48 -Other | | 0.02464 | | | 2.52 - -Nlocal: 32000 ave 32000 max 32000 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Nghost: 9493 ave 9493 max 9493 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Neighs: 155873 ave 155873 max 155873 min -Histogram: 1 0 0 0 0 0 0 0 0 0 - -Total # of neighbors = 155873 -Ave neighs/atom = 4.87103 -Ave special neighs/atom = 1.98 -Neighbor list builds = 25 -Dangerous builds = 0 -Total wall time: 0:00:01 diff --git a/bench/log.6Oct16.chain.fixed.icc.4 b/bench/log.6Oct16.chain.fixed.icc.4 deleted file mode 100644 index ce088d20a68..00000000000 --- a/bench/log.6Oct16.chain.fixed.icc.4 +++ /dev/null @@ -1,78 +0,0 @@ -LAMMPS (6 Oct 2016) -# FENE beadspring benchmark - -units lj -atom_style bond -special_bonds fene - -read_data data.chain - orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) - 1 by 2 by 2 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... 
- 32000 velocities - scanning bonds ... - 1 = max bonds/atom - reading bonds ... - 31680 bonds - 2 = max # of 1-2 neighbors - 2 = max # of special neighbors - -neighbor 0.4 bin -neigh_modify every 1 delay 1 - -bond_style fene -bond_coeff 1 30.0 1.5 1.0 1.0 - -pair_style lj/cut 1.12 -pair_modify shift yes -pair_coeff 1 1 1.0 1.0 1.12 - -fix 1 all nve -fix 2 all langevin 1.0 1.0 10.0 904297 - -thermo 100 -timestep 0.012 - -run 100 -Neighbor list info ... - 1 neighbor list requests - update every 1 steps, delay 1 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 1.52 - ghost atom cutoff = 1.52 - binsize = 0.76 -> bins = 45 45 45 -Memory usage per processor = 4.14663 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 0.97029772 0.44484087 20.494523 22.394765 4.6721833 - 100 0.97145835 0.43803883 20.502691 22.397872 4.626988 -Loop time of 0.269205 on 4 procs for 100 steps with 32000 atoms - -Performance: 385133.446 tau/day, 371.464 timesteps/s -99.8% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.049383 | 0.049756 | 0.049988 | 0.1 | 18.48 -Bond | 0.022701 | 0.022813 | 0.022872 | 0.0 | 8.47 -Neigh | 0.11982 | 0.12002 | 0.12018 | 0.0 | 44.58 -Comm | 0.020274 | 0.021077 | 0.022348 | 0.5 | 7.83 -Output | 5.3167e-05 | 5.6148e-05 | 6.3181e-05 | 0.1 | 0.02 -Modify | 0.046276 | 0.046809 | 0.047016 | 0.1 | 17.39 -Other | | 0.008669 | | | 3.22 - -Nlocal: 8000 ave 8030 max 7974 min -Histogram: 1 0 0 1 0 1 0 0 0 1 -Nghost: 4177 ave 4191 max 4160 min -Histogram: 1 0 0 0 1 0 0 1 0 1 -Neighs: 38995.8 ave 39169 max 38852 min -Histogram: 1 0 0 1 1 0 0 0 0 1 - -Total # of neighbors = 155983 -Ave neighs/atom = 4.87447 -Ave special neighs/atom = 1.98 -Neighbor list builds = 25 -Dangerous builds = 0 -Total wall time: 0:00:00 diff --git a/bench/log.6Oct16.chain.scaled.icc.4 
b/bench/log.6Oct16.chain.scaled.icc.4 deleted file mode 100644 index 2f2d47d78b2..00000000000 --- a/bench/log.6Oct16.chain.scaled.icc.4 +++ /dev/null @@ -1,94 +0,0 @@ -LAMMPS (6 Oct 2016) -# FENE beadspring benchmark - -variable x index 1 -variable y index 1 -variable z index 1 - -units lj -atom_style bond -atom_modify map hash -special_bonds fene - -read_data data.chain - orthogonal box = (-16.796 -16.796 -16.796) to (16.796 16.796 16.796) - 1 by 2 by 2 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... - 32000 velocities - scanning bonds ... - 1 = max bonds/atom - reading bonds ... - 31680 bonds - 2 = max # of 1-2 neighbors - 2 = max # of special neighbors - -replicate $x $y $z -replicate 2 $y $z -replicate 2 2 $z -replicate 2 2 1 - orthogonal box = (-16.796 -16.796 -16.796) to (50.388 50.388 16.796) - 2 by 2 by 1 MPI processor grid - 128000 atoms - 126720 bonds - 2 = max # of 1-2 neighbors - 2 = max # of special neighbors - -neighbor 0.4 bin -neigh_modify every 1 delay 1 - -bond_style fene -bond_coeff 1 30.0 1.5 1.0 1.0 - -pair_style lj/cut 1.12 -pair_modify shift yes -pair_coeff 1 1 1.0 1.0 1.12 - -fix 1 all nve -fix 2 all langevin 1.0 1.0 10.0 904297 - -thermo 100 -timestep 0.012 - -run 100 -Neighbor list info ... 
- 1 neighbor list requests - update every 1 steps, delay 1 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 1.52 - ghost atom cutoff = 1.52 - binsize = 0.76 -> bins = 89 89 45 -Memory usage per processor = 13.2993 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 0.97027498 0.44484087 20.494523 22.394765 4.6721833 - 100 0.97682955 0.44239968 20.500229 22.407862 4.6527025 -Loop time of 1.14845 on 4 procs for 100 steps with 128000 atoms - -Performance: 90277.919 tau/day, 87.074 timesteps/s -99.9% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.2203 | 0.22207 | 0.22386 | 0.3 | 19.34 -Bond | 0.094861 | 0.095302 | 0.095988 | 0.1 | 8.30 -Neigh | 0.52127 | 0.5216 | 0.52189 | 0.0 | 45.42 -Comm | 0.079585 | 0.082159 | 0.084366 | 0.7 | 7.15 -Output | 0.00013304 | 0.00015306 | 0.00018501 | 0.2 | 0.01 -Modify | 0.18351 | 0.18419 | 0.1856 | 0.2 | 16.04 -Other | | 0.04298 | | | 3.74 - -Nlocal: 32000 ave 32015 max 31983 min -Histogram: 1 0 1 0 0 0 0 0 1 1 -Nghost: 9492 ave 9522 max 9432 min -Histogram: 1 0 0 0 0 0 1 0 0 2 -Neighs: 155837 ave 156079 max 155506 min -Histogram: 1 0 0 0 0 1 0 0 1 1 - -Total # of neighbors = 623349 -Ave neighs/atom = 4.86991 -Ave special neighs/atom = 1.98 -Neighbor list builds = 25 -Dangerous builds = 0 -Total wall time: 0:00:01 diff --git a/bench/log.6Oct16.chute.fixed.icc.1 b/bench/log.6Oct16.chute.fixed.icc.1 deleted file mode 100644 index 9f53d44092a..00000000000 --- a/bench/log.6Oct16.chute.fixed.icc.1 +++ /dev/null @@ -1,80 +0,0 @@ -LAMMPS (6 Oct 2016) -# LAMMPS benchmark of granular flow -# chute flow of 32000 atoms with frozen base at 26 degrees - -units lj -atom_style sphere -boundary p p fs -newton off -comm_modify vel yes - -read_data data.chute - orthogonal box = (0 0 0) to (40 20 37.2886) - 1 by 1 by 1 MPI processor grid - reading atoms 
... - 32000 atoms - reading velocities ... - 32000 velocities - -pair_style gran/hooke/history 200000.0 NULL 50.0 NULL 0.5 0 -pair_coeff * * - -neighbor 0.1 bin -neigh_modify every 1 delay 0 - -timestep 0.0001 - -group bottom type 2 -912 atoms in group bottom -group active subtract all bottom -31088 atoms in group active -neigh_modify exclude group bottom bottom - -fix 1 all gravity 1.0 chute 26.0 -fix 2 bottom freeze -fix 3 active nve/sphere - -compute 1 all erotate/sphere -thermo_style custom step atoms ke c_1 vol -thermo_modify norm no -thermo 100 - -run 100 -Neighbor list info ... - 2 neighbor list requests - update every 1 steps, delay 0 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 1.1 - ghost atom cutoff = 1.1 - binsize = 0.55 -> bins = 73 37 68 -Memory usage per processor = 16.0904 Mbytes -Step Atoms KinEng c_1 Volume - 0 32000 784139.13 1601.1263 29833.783 - 100 32000 784292.08 1571.0968 29834.707 -Loop time of 0.534174 on 1 procs for 100 steps with 32000 atoms - -Performance: 1617.451 tau/day, 187.205 timesteps/s -99.8% CPU use with 1 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.33346 | 0.33346 | 0.33346 | 0.0 | 62.43 -Neigh | 0.043902 | 0.043902 | 0.043902 | 0.0 | 8.22 -Comm | 0.018391 | 0.018391 | 0.018391 | 0.0 | 3.44 -Output | 0.00022411 | 0.00022411 | 0.00022411 | 0.0 | 0.04 -Modify | 0.11666 | 0.11666 | 0.11666 | 0.0 | 21.84 -Other | | 0.02153 | | | 4.03 - -Nlocal: 32000 ave 32000 max 32000 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Nghost: 5463 ave 5463 max 5463 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Neighs: 115133 ave 115133 max 115133 min -Histogram: 1 0 0 0 0 0 0 0 0 0 - -Total # of neighbors = 115133 -Ave neighs/atom = 3.59791 -Neighbor list builds = 2 -Dangerous builds = 0 -Total wall time: 0:00:00 diff --git a/bench/log.6Oct16.chute.fixed.icc.4 
b/bench/log.6Oct16.chute.fixed.icc.4 deleted file mode 100644 index a75a7c1f014..00000000000 --- a/bench/log.6Oct16.chute.fixed.icc.4 +++ /dev/null @@ -1,80 +0,0 @@ -LAMMPS (6 Oct 2016) -# LAMMPS benchmark of granular flow -# chute flow of 32000 atoms with frozen base at 26 degrees - -units lj -atom_style sphere -boundary p p fs -newton off -comm_modify vel yes - -read_data data.chute - orthogonal box = (0 0 0) to (40 20 37.2886) - 2 by 1 by 2 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... - 32000 velocities - -pair_style gran/hooke/history 200000.0 NULL 50.0 NULL 0.5 0 -pair_coeff * * - -neighbor 0.1 bin -neigh_modify every 1 delay 0 - -timestep 0.0001 - -group bottom type 2 -912 atoms in group bottom -group active subtract all bottom -31088 atoms in group active -neigh_modify exclude group bottom bottom - -fix 1 all gravity 1.0 chute 26.0 -fix 2 bottom freeze -fix 3 active nve/sphere - -compute 1 all erotate/sphere -thermo_style custom step atoms ke c_1 vol -thermo_modify norm no -thermo 100 - -run 100 -Neighbor list info ... 
- 2 neighbor list requests - update every 1 steps, delay 0 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 1.1 - ghost atom cutoff = 1.1 - binsize = 0.55 -> bins = 73 37 68 -Memory usage per processor = 7.04927 Mbytes -Step Atoms KinEng c_1 Volume - 0 32000 784139.13 1601.1263 29833.783 - 100 32000 784292.08 1571.0968 29834.707 -Loop time of 0.171815 on 4 procs for 100 steps with 32000 atoms - -Performance: 5028.653 tau/day, 582.020 timesteps/s -99.7% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.093691 | 0.096898 | 0.10005 | 0.8 | 56.40 -Neigh | 0.011976 | 0.012059 | 0.012146 | 0.1 | 7.02 -Comm | 0.016384 | 0.017418 | 0.018465 | 0.8 | 10.14 -Output | 7.7963e-05 | 0.00010747 | 0.00013304 | 0.2 | 0.06 -Modify | 0.031744 | 0.031943 | 0.032167 | 0.1 | 18.59 -Other | | 0.01339 | | | 7.79 - -Nlocal: 8000 ave 8008 max 7992 min -Histogram: 2 0 0 0 0 0 0 0 0 2 -Nghost: 2439 ave 2450 max 2428 min -Histogram: 2 0 0 0 0 0 0 0 0 2 -Neighs: 29500.5 ave 30488 max 28513 min -Histogram: 2 0 0 0 0 0 0 0 0 2 - -Total # of neighbors = 118002 -Ave neighs/atom = 3.68756 -Neighbor list builds = 2 -Dangerous builds = 0 -Total wall time: 0:00:00 diff --git a/bench/log.6Oct16.chute.scaled.icc.4 b/bench/log.6Oct16.chute.scaled.icc.4 deleted file mode 100644 index 0538e9fbe54..00000000000 --- a/bench/log.6Oct16.chute.scaled.icc.4 +++ /dev/null @@ -1,90 +0,0 @@ -LAMMPS (6 Oct 2016) -# LAMMPS benchmark of granular flow -# chute flow of 32000 atoms with frozen base at 26 degrees - -variable x index 1 -variable y index 1 - -units lj -atom_style sphere -boundary p p fs -newton off -comm_modify vel yes - -read_data data.chute - orthogonal box = (0 0 0) to (40 20 37.2886) - 2 by 1 by 2 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... 
- 32000 velocities - -replicate $x $y 1 -replicate 2 $y 1 -replicate 2 2 1 - orthogonal box = (0 0 0) to (80 40 37.2922) - 2 by 2 by 1 MPI processor grid - 128000 atoms - -pair_style gran/hooke/history 200000.0 NULL 50.0 NULL 0.5 0 -pair_coeff * * - -neighbor 0.1 bin -neigh_modify every 1 delay 0 - -timestep 0.0001 - -group bottom type 2 -3648 atoms in group bottom -group active subtract all bottom -124352 atoms in group active -neigh_modify exclude group bottom bottom - -fix 1 all gravity 1.0 chute 26.0 -fix 2 bottom freeze -fix 3 active nve/sphere - -compute 1 all erotate/sphere -thermo_style custom step atoms ke c_1 vol -thermo_modify norm no -thermo 100 - -run 100 -Neighbor list info ... - 2 neighbor list requests - update every 1 steps, delay 0 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 1.1 - ghost atom cutoff = 1.1 - binsize = 0.55 -> bins = 146 73 68 -Memory usage per processor = 16.1265 Mbytes -Step Atoms KinEng c_1 Volume - 0 128000 3136556.5 6404.5051 119335.13 - 100 128000 3137168.3 6284.3873 119338.83 -Loop time of 0.832365 on 4 procs for 100 steps with 128000 atoms - -Performance: 1038.006 tau/day, 120.140 timesteps/s -99.8% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.5178 | 0.52208 | 0.52793 | 0.5 | 62.72 -Neigh | 0.047003 | 0.047113 | 0.047224 | 0.0 | 5.66 -Comm | 0.05233 | 0.052988 | 0.053722 | 0.2 | 6.37 -Output | 0.00024986 | 0.00032717 | 0.00036693 | 0.3 | 0.04 -Modify | 0.15517 | 0.15627 | 0.15808 | 0.3 | 18.77 -Other | | 0.0536 | | | 6.44 - -Nlocal: 32000 ave 32000 max 32000 min -Histogram: 4 0 0 0 0 0 0 0 0 0 -Nghost: 5463 ave 5463 max 5463 min -Histogram: 4 0 0 0 0 0 0 0 0 0 -Neighs: 115133 ave 115133 max 115133 min -Histogram: 4 0 0 0 0 0 0 0 0 0 - -Total # of neighbors = 460532 -Ave neighs/atom = 3.59791 -Neighbor list builds = 
2 -Dangerous builds = 0 -Total wall time: 0:00:00 diff --git a/bench/log.6Oct16.eam.fixed.icc.1 b/bench/log.6Oct16.eam.fixed.icc.1 deleted file mode 100644 index f5ddfcde0dc..00000000000 --- a/bench/log.6Oct16.eam.fixed.icc.1 +++ /dev/null @@ -1,83 +0,0 @@ -LAMMPS (6 Oct 2016) -# bulk Cu lattice - -variable x index 1 -variable y index 1 -variable z index 1 - -variable xx equal 20*$x -variable xx equal 20*1 -variable yy equal 20*$y -variable yy equal 20*1 -variable zz equal 20*$z -variable zz equal 20*1 - -units metal -atom_style atomic - -lattice fcc 3.615 -Lattice spacing in x,y,z = 3.615 3.615 3.615 -region box block 0 ${xx} 0 ${yy} 0 ${zz} -region box block 0 20 0 ${yy} 0 ${zz} -region box block 0 20 0 20 0 ${zz} -region box block 0 20 0 20 0 20 -create_box 1 box -Created orthogonal box = (0 0 0) to (72.3 72.3 72.3) - 1 by 1 by 1 MPI processor grid -create_atoms 1 box -Created 32000 atoms - -pair_style eam -pair_coeff 1 1 Cu_u3.eam -Reading potential file Cu_u3.eam with DATE: 2007-06-11 - -velocity all create 1600.0 376847 loop geom - -neighbor 1.0 bin -neigh_modify every 1 delay 5 check yes - -fix 1 all nve - -timestep 0.005 -thermo 50 - -run 100 -Neighbor list info ... 
- 1 neighbor list requests - update every 1 steps, delay 5 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 5.95 - ghost atom cutoff = 5.95 - binsize = 2.975 -> bins = 25 25 25 -Memory usage per processor = 11.2238 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 1600 -113280 0 -106662.09 18703.573 - 50 781.69049 -109873.35 0 -106640.13 52273.088 - 100 801.832 -109957.3 0 -106640.77 51322.821 -Loop time of 5.96529 on 1 procs for 100 steps with 32000 atoms - -Performance: 7.242 ns/day, 3.314 hours/ns, 16.764 timesteps/s -99.9% CPU use with 1 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 5.2743 | 5.2743 | 5.2743 | 0.0 | 88.42 -Neigh | 0.59212 | 0.59212 | 0.59212 | 0.0 | 9.93 -Comm | 0.030399 | 0.030399 | 0.030399 | 0.0 | 0.51 -Output | 0.00026202 | 0.00026202 | 0.00026202 | 0.0 | 0.00 -Modify | 0.050487 | 0.050487 | 0.050487 | 0.0 | 0.85 -Other | | 0.01776 | | | 0.30 - -Nlocal: 32000 ave 32000 max 32000 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Nghost: 19909 ave 19909 max 19909 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Neighs: 1.20778e+06 ave 1.20778e+06 max 1.20778e+06 min -Histogram: 1 0 0 0 0 0 0 0 0 0 - -Total # of neighbors = 1207784 -Ave neighs/atom = 37.7433 -Neighbor list builds = 13 -Dangerous builds = 0 -Total wall time: 0:00:06 diff --git a/bench/log.6Oct16.eam.fixed.icc.4 b/bench/log.6Oct16.eam.fixed.icc.4 deleted file mode 100644 index 3414210acfd..00000000000 --- a/bench/log.6Oct16.eam.fixed.icc.4 +++ /dev/null @@ -1,83 +0,0 @@ -LAMMPS (6 Oct 2016) -# bulk Cu lattice - -variable x index 1 -variable y index 1 -variable z index 1 - -variable xx equal 20*$x -variable xx equal 20*1 -variable yy equal 20*$y -variable yy equal 20*1 -variable zz equal 20*$z -variable zz equal 20*1 - -units metal -atom_style atomic - -lattice fcc 3.615 -Lattice spacing in x,y,z = 3.615 3.615 3.615 
-region box block 0 ${xx} 0 ${yy} 0 ${zz} -region box block 0 20 0 ${yy} 0 ${zz} -region box block 0 20 0 20 0 ${zz} -region box block 0 20 0 20 0 20 -create_box 1 box -Created orthogonal box = (0 0 0) to (72.3 72.3 72.3) - 1 by 2 by 2 MPI processor grid -create_atoms 1 box -Created 32000 atoms - -pair_style eam -pair_coeff 1 1 Cu_u3.eam -Reading potential file Cu_u3.eam with DATE: 2007-06-11 - -velocity all create 1600.0 376847 loop geom - -neighbor 1.0 bin -neigh_modify every 1 delay 5 check yes - -fix 1 all nve - -timestep 0.005 -thermo 50 - -run 100 -Neighbor list info ... - 1 neighbor list requests - update every 1 steps, delay 5 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 5.95 - ghost atom cutoff = 5.95 - binsize = 2.975 -> bins = 25 25 25 -Memory usage per processor = 5.59629 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 1600 -113280 0 -106662.09 18703.573 - 50 781.69049 -109873.35 0 -106640.13 52273.088 - 100 801.832 -109957.3 0 -106640.77 51322.821 -Loop time of 1.64562 on 4 procs for 100 steps with 32000 atoms - -Performance: 26.252 ns/day, 0.914 hours/ns, 60.767 timesteps/s -99.8% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 1.408 | 1.4175 | 1.4341 | 0.9 | 86.14 -Neigh | 0.15512 | 0.15722 | 0.16112 | 0.6 | 9.55 -Comm | 0.029105 | 0.049986 | 0.061822 | 5.8 | 3.04 -Output | 0.00010991 | 0.00011539 | 0.00012302 | 0.0 | 0.01 -Modify | 0.013383 | 0.013573 | 0.013883 | 0.2 | 0.82 -Other | | 0.007264 | | | 0.44 - -Nlocal: 8000 ave 8008 max 7993 min -Histogram: 2 0 0 0 0 0 0 0 1 1 -Nghost: 9130.25 ave 9138 max 9122 min -Histogram: 2 0 0 0 0 0 0 0 0 2 -Neighs: 301946 ave 302392 max 301360 min -Histogram: 1 0 0 0 1 0 0 0 1 1 - -Total # of neighbors = 1207784 -Ave neighs/atom = 37.7433 -Neighbor list builds = 13 -Dangerous builds = 0 -Total wall time: 
0:00:01 diff --git a/bench/log.6Oct16.eam.scaled.icc.4 b/bench/log.6Oct16.eam.scaled.icc.4 deleted file mode 100644 index 8a2ec90b783..00000000000 --- a/bench/log.6Oct16.eam.scaled.icc.4 +++ /dev/null @@ -1,83 +0,0 @@ -LAMMPS (6 Oct 2016) -# bulk Cu lattice - -variable x index 1 -variable y index 1 -variable z index 1 - -variable xx equal 20*$x -variable xx equal 20*2 -variable yy equal 20*$y -variable yy equal 20*2 -variable zz equal 20*$z -variable zz equal 20*1 - -units metal -atom_style atomic - -lattice fcc 3.615 -Lattice spacing in x,y,z = 3.615 3.615 3.615 -region box block 0 ${xx} 0 ${yy} 0 ${zz} -region box block 0 40 0 ${yy} 0 ${zz} -region box block 0 40 0 40 0 ${zz} -region box block 0 40 0 40 0 20 -create_box 1 box -Created orthogonal box = (0 0 0) to (144.6 144.6 72.3) - 2 by 2 by 1 MPI processor grid -create_atoms 1 box -Created 128000 atoms - -pair_style eam -pair_coeff 1 1 Cu_u3.eam -Reading potential file Cu_u3.eam with DATE: 2007-06-11 - -velocity all create 1600.0 376847 loop geom - -neighbor 1.0 bin -neigh_modify every 1 delay 5 check yes - -fix 1 all nve - -timestep 0.005 -thermo 50 - -run 100 -Neighbor list info ... 
- 1 neighbor list requests - update every 1 steps, delay 5 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 5.95 - ghost atom cutoff = 5.95 - binsize = 2.975 -> bins = 49 49 25 -Memory usage per processor = 11.1402 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 1600 -453120 0 -426647.73 18704.012 - 50 779.50001 -439457.02 0 -426560.06 52355.276 - 100 797.97828 -439764.76 0 -426562.07 51474.74 -Loop time of 6.60121 on 4 procs for 100 steps with 128000 atoms - -Performance: 6.544 ns/day, 3.667 hours/ns, 15.149 timesteps/s -99.9% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 5.6676 | 5.7011 | 5.7469 | 1.3 | 86.36 -Neigh | 0.66423 | 0.67119 | 0.68082 | 0.7 | 10.17 -Comm | 0.079367 | 0.13668 | 0.1791 | 10.5 | 2.07 -Output | 0.00026989 | 0.00028622 | 0.00031209 | 0.1 | 0.00 -Modify | 0.060046 | 0.062203 | 0.065009 | 0.9 | 0.94 -Other | | 0.02974 | | | 0.45 - -Nlocal: 32000 ave 32092 max 31914 min -Histogram: 1 0 0 1 0 1 0 0 0 1 -Nghost: 19910 ave 19997 max 19818 min -Histogram: 1 0 0 0 1 0 1 0 0 1 -Neighs: 1.20728e+06 ave 1.21142e+06 max 1.2036e+06 min -Histogram: 1 0 0 1 1 0 0 0 0 1 - -Total # of neighbors = 4829126 -Ave neighs/atom = 37.7275 -Neighbor list builds = 14 -Dangerous builds = 0 -Total wall time: 0:00:06 diff --git a/bench/log.6Oct16.lj.fixed.icc.1 b/bench/log.6Oct16.lj.fixed.icc.1 deleted file mode 100644 index b08ca3b6b8d..00000000000 --- a/bench/log.6Oct16.lj.fixed.icc.1 +++ /dev/null @@ -1,79 +0,0 @@ -LAMMPS (6 Oct 2016) -# 3d Lennard-Jones melt - -variable x index 1 -variable y index 1 -variable z index 1 - -variable xx equal 20*$x -variable xx equal 20*1 -variable yy equal 20*$y -variable yy equal 20*1 -variable zz equal 20*$z -variable zz equal 20*1 - -units lj -atom_style atomic - -lattice fcc 0.8442 -Lattice spacing in x,y,z = 1.6796 1.6796 
1.6796 -region box block 0 ${xx} 0 ${yy} 0 ${zz} -region box block 0 20 0 ${yy} 0 ${zz} -region box block 0 20 0 20 0 ${zz} -region box block 0 20 0 20 0 20 -create_box 1 box -Created orthogonal box = (0 0 0) to (33.5919 33.5919 33.5919) - 1 by 1 by 1 MPI processor grid -create_atoms 1 box -Created 32000 atoms -mass 1 1.0 - -velocity all create 1.44 87287 loop geom - -pair_style lj/cut 2.5 -pair_coeff 1 1 1.0 1.0 2.5 - -neighbor 0.3 bin -neigh_modify delay 0 every 20 check no - -fix 1 all nve - -run 100 -Neighbor list info ... - 1 neighbor list requests - update every 20 steps, delay 0 steps, check no - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 2.8 - ghost atom cutoff = 2.8 - binsize = 1.4 -> bins = 24 24 24 -Memory usage per processor = 8.21387 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 1.44 -6.7733681 0 -4.6134356 -5.0197073 - 100 0.7574531 -5.7585055 0 -4.6223613 0.20726105 -Loop time of 2.26185 on 1 procs for 100 steps with 32000 atoms - -Performance: 19099.377 tau/day, 44.212 timesteps/s -99.9% CPU use with 1 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 1.9328 | 1.9328 | 1.9328 | 0.0 | 85.45 -Neigh | 0.2558 | 0.2558 | 0.2558 | 0.0 | 11.31 -Comm | 0.024061 | 0.024061 | 0.024061 | 0.0 | 1.06 -Output | 0.00012612 | 0.00012612 | 0.00012612 | 0.0 | 0.01 -Modify | 0.040887 | 0.040887 | 0.040887 | 0.0 | 1.81 -Other | | 0.008214 | | | 0.36 - -Nlocal: 32000 ave 32000 max 32000 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Nghost: 19657 ave 19657 max 19657 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Neighs: 1.20283e+06 ave 1.20283e+06 max 1.20283e+06 min -Histogram: 1 0 0 0 0 0 0 0 0 0 - -Total # of neighbors = 1202833 -Ave neighs/atom = 37.5885 -Neighbor list builds = 5 -Dangerous builds not checked -Total wall time: 0:00:02 diff --git a/bench/log.6Oct16.lj.fixed.icc.4 b/bench/log.6Oct16.lj.fixed.icc.4 
deleted file mode 100644 index 9eee300a94d..00000000000 --- a/bench/log.6Oct16.lj.fixed.icc.4 +++ /dev/null @@ -1,79 +0,0 @@ -LAMMPS (6 Oct 2016) -# 3d Lennard-Jones melt - -variable x index 1 -variable y index 1 -variable z index 1 - -variable xx equal 20*$x -variable xx equal 20*1 -variable yy equal 20*$y -variable yy equal 20*1 -variable zz equal 20*$z -variable zz equal 20*1 - -units lj -atom_style atomic - -lattice fcc 0.8442 -Lattice spacing in x,y,z = 1.6796 1.6796 1.6796 -region box block 0 ${xx} 0 ${yy} 0 ${zz} -region box block 0 20 0 ${yy} 0 ${zz} -region box block 0 20 0 20 0 ${zz} -region box block 0 20 0 20 0 20 -create_box 1 box -Created orthogonal box = (0 0 0) to (33.5919 33.5919 33.5919) - 1 by 2 by 2 MPI processor grid -create_atoms 1 box -Created 32000 atoms -mass 1 1.0 - -velocity all create 1.44 87287 loop geom - -pair_style lj/cut 2.5 -pair_coeff 1 1 1.0 1.0 2.5 - -neighbor 0.3 bin -neigh_modify delay 0 every 20 check no - -fix 1 all nve - -run 100 -Neighbor list info ... 
- 1 neighbor list requests - update every 20 steps, delay 0 steps, check no - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 2.8 - ghost atom cutoff = 2.8 - binsize = 1.4 -> bins = 24 24 24 -Memory usage per processor = 4.09506 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 1.44 -6.7733681 0 -4.6134356 -5.0197073 - 100 0.7574531 -5.7585055 0 -4.6223613 0.20726105 -Loop time of 0.635957 on 4 procs for 100 steps with 32000 atoms - -Performance: 67929.172 tau/day, 157.243 timesteps/s -99.9% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 0.51335 | 0.51822 | 0.52569 | 0.7 | 81.49 -Neigh | 0.063695 | 0.064309 | 0.065397 | 0.3 | 10.11 -Comm | 0.027525 | 0.03629 | 0.041959 | 3.1 | 5.71 -Output | 6.3896e-05 | 6.6698e-05 | 7.081e-05 | 0.0 | 0.01 -Modify | 0.012472 | 0.01254 | 0.012618 | 0.1 | 1.97 -Other | | 0.004529 | | | 0.71 - -Nlocal: 8000 ave 8037 max 7964 min -Histogram: 2 0 0 0 0 0 0 0 1 1 -Nghost: 9007.5 ave 9050 max 8968 min -Histogram: 1 1 0 0 0 0 0 1 0 1 -Neighs: 300708 ave 305113 max 297203 min -Histogram: 1 0 0 1 1 0 0 0 0 1 - -Total # of neighbors = 1202833 -Ave neighs/atom = 37.5885 -Neighbor list builds = 5 -Dangerous builds not checked -Total wall time: 0:00:00 diff --git a/bench/log.6Oct16.lj.scaled.icc.4 b/bench/log.6Oct16.lj.scaled.icc.4 deleted file mode 100644 index 4599879e596..00000000000 --- a/bench/log.6Oct16.lj.scaled.icc.4 +++ /dev/null @@ -1,79 +0,0 @@ -LAMMPS (6 Oct 2016) -# 3d Lennard-Jones melt - -variable x index 1 -variable y index 1 -variable z index 1 - -variable xx equal 20*$x -variable xx equal 20*2 -variable yy equal 20*$y -variable yy equal 20*2 -variable zz equal 20*$z -variable zz equal 20*1 - -units lj -atom_style atomic - -lattice fcc 0.8442 -Lattice spacing in x,y,z = 1.6796 1.6796 1.6796 -region box block 0 ${xx} 0 ${yy} 0 ${zz} -region box 
block 0 40 0 ${yy} 0 ${zz} -region box block 0 40 0 40 0 ${zz} -region box block 0 40 0 40 0 20 -create_box 1 box -Created orthogonal box = (0 0 0) to (67.1838 67.1838 33.5919) - 2 by 2 by 1 MPI processor grid -create_atoms 1 box -Created 128000 atoms -mass 1 1.0 - -velocity all create 1.44 87287 loop geom - -pair_style lj/cut 2.5 -pair_coeff 1 1 1.0 1.0 2.5 - -neighbor 0.3 bin -neigh_modify delay 0 every 20 check no - -fix 1 all nve - -run 100 -Neighbor list info ... - 1 neighbor list requests - update every 20 steps, delay 0 steps, check no - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 2.8 - ghost atom cutoff = 2.8 - binsize = 1.4 -> bins = 48 48 24 -Memory usage per processor = 8.13678 Mbytes -Step Temp E_pair E_mol TotEng Press - 0 1.44 -6.7733681 0 -4.6133849 -5.0196788 - 100 0.75841891 -5.759957 0 -4.6223375 0.20008866 -Loop time of 2.55762 on 4 procs for 100 steps with 128000 atoms - -Performance: 16890.677 tau/day, 39.099 timesteps/s -99.8% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 2.0583 | 2.0988 | 2.1594 | 2.6 | 82.06 -Neigh | 0.24411 | 0.24838 | 0.25585 | 0.9 | 9.71 -Comm | 0.066397 | 0.13872 | 0.1863 | 11.9 | 5.42 -Output | 0.00012994 | 0.00021023 | 0.00025702 | 0.3 | 0.01 -Modify | 0.055533 | 0.058343 | 0.061791 | 1.2 | 2.28 -Other | | 0.0132 | | | 0.52 - -Nlocal: 32000 ave 32060 max 31939 min -Histogram: 1 0 1 0 0 0 0 1 0 1 -Nghost: 19630.8 ave 19681 max 19562 min -Histogram: 1 0 0 0 1 0 0 0 1 1 -Neighs: 1.20195e+06 ave 1.20354e+06 max 1.19931e+06 min -Histogram: 1 0 0 0 0 0 0 2 0 1 - -Total # of neighbors = 4807797 -Ave neighs/atom = 37.5609 -Neighbor list builds = 5 -Dangerous builds not checked -Total wall time: 0:00:02 diff --git a/bench/log.6Oct16.rhodo.fixed.icc.1 b/bench/log.6Oct16.rhodo.fixed.icc.1 deleted file mode 100644 index 
65596d32850..00000000000 --- a/bench/log.6Oct16.rhodo.fixed.icc.1 +++ /dev/null @@ -1,122 +0,0 @@ -LAMMPS (6 Oct 2016) -# Rhodopsin model - -units real -neigh_modify delay 5 every 1 - -atom_style full -bond_style harmonic -angle_style charmm -dihedral_style charmm -improper_style harmonic -pair_style lj/charmm/coul/long 8.0 10.0 -pair_modify mix arithmetic -kspace_style pppm 1e-4 - -read_data data.rhodo - orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) - 1 by 1 by 1 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... - 32000 velocities - scanning bonds ... - 4 = max bonds/atom - scanning angles ... - 8 = max angles/atom - scanning dihedrals ... - 18 = max dihedrals/atom - scanning impropers ... - 2 = max impropers/atom - reading bonds ... - 27723 bonds - reading angles ... - 40467 angles - reading dihedrals ... - 56829 dihedrals - reading impropers ... - 1034 impropers - 4 = max # of 1-2 neighbors - 12 = max # of 1-3 neighbors - 24 = max # of 1-4 neighbors - 26 = max # of special neighbors - -fix 1 all shake 0.0001 5 0 m 1.0 a 232 - 1617 = # of size 2 clusters - 3633 = # of size 3 clusters - 747 = # of size 4 clusters - 4233 = # of frozen angles -fix 2 all npt temp 300.0 300.0 100.0 z 0.0 0.0 1000.0 mtk no pchain 0 tchain 1 - -special_bonds charmm - -thermo 50 -thermo_style multi -timestep 2.0 - -run 100 -PPPM initialization ... -WARNING: Using 12-bit tables for long-range coulomb (../kspace.cpp:316) - G vector (1/distance) = 0.248835 - grid = 25 32 32 - stencil order = 5 - estimated absolute RMS force accuracy = 0.0355478 - estimated relative force accuracy = 0.000107051 - using double precision FFTs - 3d grid and FFT values/proc = 41070 25600 -Neighbor list info ... 
- 1 neighbor list requests - update every 1 steps, delay 5 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 12 - ghost atom cutoff = 12 - binsize = 6 -> bins = 10 13 13 -Memory usage per processor = 93.2721 Mbytes ----------------- Step 0 ----- CPU = 0.0000 (sec) ---------------- -TotEng = -25356.2064 KinEng = 21444.8313 Temp = 299.0397 -PotEng = -46801.0377 E_bond = 2537.9940 E_angle = 10921.3742 -E_dihed = 5211.7865 E_impro = 213.5116 E_vdwl = -2307.8634 -E_coul = 207025.8927 E_long = -270403.7333 Press = -149.3301 -Volume = 307995.0335 ----------------- Step 50 ----- CPU = 17.2007 (sec) ---------------- -TotEng = -25330.0321 KinEng = 21501.0036 Temp = 299.8230 -PotEng = -46831.0357 E_bond = 2471.7033 E_angle = 10836.5108 -E_dihed = 5239.6316 E_impro = 227.1219 E_vdwl = -1993.2763 -E_coul = 206797.6655 E_long = -270410.3927 Press = 237.6866 -Volume = 308031.5640 ----------------- Step 100 ----- CPU = 35.0315 (sec) ---------------- -TotEng = -25290.7387 KinEng = 21591.9096 Temp = 301.0906 -PotEng = -46882.6484 E_bond = 2567.9789 E_angle = 10781.9556 -E_dihed = 5198.7493 E_impro = 216.7863 E_vdwl = -1902.6458 -E_coul = 206659.5006 E_long = -270404.9733 Press = 6.7898 -Volume = 308133.9933 -Loop time of 35.0316 on 1 procs for 100 steps with 32000 atoms - -Performance: 0.493 ns/day, 48.655 hours/ns, 2.855 timesteps/s -99.9% CPU use with 1 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 25.021 | 25.021 | 25.021 | 0.0 | 71.42 -Bond | 1.2834 | 1.2834 | 1.2834 | 0.0 | 3.66 -Kspace | 3.2116 | 3.2116 | 3.2116 | 0.0 | 9.17 -Neigh | 4.2767 | 4.2767 | 4.2767 | 0.0 | 12.21 -Comm | 0.069283 | 0.069283 | 0.069283 | 0.0 | 0.20 -Output | 0.00028205 | 0.00028205 | 0.00028205 | 0.0 | 0.00 -Modify | 1.14 | 1.14 | 1.14 | 0.0 | 3.25 -Other | | 0.02938 | | | 0.08 - -Nlocal: 32000 ave 32000 max 32000 
min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Nghost: 47958 ave 47958 max 47958 min -Histogram: 1 0 0 0 0 0 0 0 0 0 -Neighs: 1.20281e+07 ave 1.20281e+07 max 1.20281e+07 min -Histogram: 1 0 0 0 0 0 0 0 0 0 - -Total # of neighbors = 12028098 -Ave neighs/atom = 375.878 -Ave special neighs/atom = 7.43187 -Neighbor list builds = 11 -Dangerous builds = 0 -Total wall time: 0:00:36 diff --git a/bench/log.6Oct16.rhodo.fixed.icc.4 b/bench/log.6Oct16.rhodo.fixed.icc.4 deleted file mode 100644 index 50526063f11..00000000000 --- a/bench/log.6Oct16.rhodo.fixed.icc.4 +++ /dev/null @@ -1,122 +0,0 @@ -LAMMPS (6 Oct 2016) -# Rhodopsin model - -units real -neigh_modify delay 5 every 1 - -atom_style full -bond_style harmonic -angle_style charmm -dihedral_style charmm -improper_style harmonic -pair_style lj/charmm/coul/long 8.0 10.0 -pair_modify mix arithmetic -kspace_style pppm 1e-4 - -read_data data.rhodo - orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) - 1 by 2 by 2 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... - 32000 velocities - scanning bonds ... - 4 = max bonds/atom - scanning angles ... - 8 = max angles/atom - scanning dihedrals ... - 18 = max dihedrals/atom - scanning impropers ... - 2 = max impropers/atom - reading bonds ... - 27723 bonds - reading angles ... - 40467 angles - reading dihedrals ... - 56829 dihedrals - reading impropers ... - 1034 impropers - 4 = max # of 1-2 neighbors - 12 = max # of 1-3 neighbors - 24 = max # of 1-4 neighbors - 26 = max # of special neighbors - -fix 1 all shake 0.0001 5 0 m 1.0 a 232 - 1617 = # of size 2 clusters - 3633 = # of size 3 clusters - 747 = # of size 4 clusters - 4233 = # of frozen angles -fix 2 all npt temp 300.0 300.0 100.0 z 0.0 0.0 1000.0 mtk no pchain 0 tchain 1 - -special_bonds charmm - -thermo 50 -thermo_style multi -timestep 2.0 - -run 100 -PPPM initialization ... 
-WARNING: Using 12-bit tables for long-range coulomb (../kspace.cpp:316) - G vector (1/distance) = 0.248835 - grid = 25 32 32 - stencil order = 5 - estimated absolute RMS force accuracy = 0.0355478 - estimated relative force accuracy = 0.000107051 - using double precision FFTs - 3d grid and FFT values/proc = 13230 6400 -Neighbor list info ... - 1 neighbor list requests - update every 1 steps, delay 5 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 12 - ghost atom cutoff = 12 - binsize = 6 -> bins = 10 13 13 -Memory usage per processor = 37.3604 Mbytes ----------------- Step 0 ----- CPU = 0.0000 (sec) ---------------- -TotEng = -25356.2064 KinEng = 21444.8313 Temp = 299.0397 -PotEng = -46801.0377 E_bond = 2537.9940 E_angle = 10921.3742 -E_dihed = 5211.7865 E_impro = 213.5116 E_vdwl = -2307.8634 -E_coul = 207025.8927 E_long = -270403.7333 Press = -149.3301 -Volume = 307995.0335 ----------------- Step 50 ----- CPU = 4.6056 (sec) ---------------- -TotEng = -25330.0321 KinEng = 21501.0036 Temp = 299.8230 -PotEng = -46831.0357 E_bond = 2471.7033 E_angle = 10836.5108 -E_dihed = 5239.6316 E_impro = 227.1219 E_vdwl = -1993.2763 -E_coul = 206797.6655 E_long = -270410.3927 Press = 237.6866 -Volume = 308031.5640 ----------------- Step 100 ----- CPU = 9.3910 (sec) ---------------- -TotEng = -25290.7386 KinEng = 21591.9096 Temp = 301.0906 -PotEng = -46882.6482 E_bond = 2567.9789 E_angle = 10781.9556 -E_dihed = 5198.7493 E_impro = 216.7863 E_vdwl = -1902.6458 -E_coul = 206659.5007 E_long = -270404.9733 Press = 6.7898 -Volume = 308133.9933 -Loop time of 9.39107 on 4 procs for 100 steps with 32000 atoms - -Performance: 1.840 ns/day, 13.043 hours/ns, 10.648 timesteps/s -99.8% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 6.2189 | 6.3266 | 6.6072 | 6.5 | 67.37 -Bond | 0.30793 | 
0.32122 | 0.3414 | 2.4 | 3.42 -Kspace | 0.87994 | 1.1644 | 1.2855 | 15.3 | 12.40 -Neigh | 1.1358 | 1.136 | 1.1362 | 0.0 | 12.10 -Comm | 0.08292 | 0.084935 | 0.087077 | 0.5 | 0.90 -Output | 0.00015712 | 0.00016558 | 0.00018501 | 0.1 | 0.00 -Modify | 0.33717 | 0.34246 | 0.34794 | 0.7 | 3.65 -Other | | 0.01526 | | | 0.16 - -Nlocal: 8000 ave 8143 max 7933 min -Histogram: 1 2 0 0 0 0 0 0 0 1 -Nghost: 22733.5 ave 22769 max 22693 min -Histogram: 1 0 0 0 0 2 0 0 0 1 -Neighs: 3.00702e+06 ave 3.0975e+06 max 2.96492e+06 min -Histogram: 1 2 0 0 0 0 0 0 0 1 - -Total # of neighbors = 12028098 -Ave neighs/atom = 375.878 -Ave special neighs/atom = 7.43187 -Neighbor list builds = 11 -Dangerous builds = 0 -Total wall time: 0:00:09 diff --git a/bench/log.6Oct16.rhodo.scaled.icc.4 b/bench/log.6Oct16.rhodo.scaled.icc.4 deleted file mode 100644 index db445ca72c4..00000000000 --- a/bench/log.6Oct16.rhodo.scaled.icc.4 +++ /dev/null @@ -1,143 +0,0 @@ -LAMMPS (6 Oct 2016) -# Rhodopsin model - -variable x index 1 -variable y index 1 -variable z index 1 - -units real -neigh_modify delay 5 every 1 - -atom_style full -atom_modify map hash -bond_style harmonic -angle_style charmm -dihedral_style charmm -improper_style harmonic -pair_style lj/charmm/coul/long 8.0 10.0 -pair_modify mix arithmetic -kspace_style pppm 1e-4 - -read_data data.rhodo - orthogonal box = (-27.5 -38.5 -36.3646) to (27.5 38.5 36.3615) - 1 by 2 by 2 MPI processor grid - reading atoms ... - 32000 atoms - reading velocities ... - 32000 velocities - scanning bonds ... - 4 = max bonds/atom - scanning angles ... - 8 = max angles/atom - scanning dihedrals ... - 18 = max dihedrals/atom - scanning impropers ... - 2 = max impropers/atom - reading bonds ... - 27723 bonds - reading angles ... - 40467 angles - reading dihedrals ... - 56829 dihedrals - reading impropers ... 
- 1034 impropers - 4 = max # of 1-2 neighbors - 12 = max # of 1-3 neighbors - 24 = max # of 1-4 neighbors - 26 = max # of special neighbors - -replicate $x $y $z -replicate 2 $y $z -replicate 2 2 $z -replicate 2 2 1 - orthogonal box = (-27.5 -38.5 -36.3646) to (82.5 115.5 36.3615) - 2 by 2 by 1 MPI processor grid - 128000 atoms - 110892 bonds - 161868 angles - 227316 dihedrals - 4136 impropers - 4 = max # of 1-2 neighbors - 12 = max # of 1-3 neighbors - 24 = max # of 1-4 neighbors - 26 = max # of special neighbors - -fix 1 all shake 0.0001 5 0 m 1.0 a 232 - 6468 = # of size 2 clusters - 14532 = # of size 3 clusters - 2988 = # of size 4 clusters - 16932 = # of frozen angles -fix 2 all npt temp 300.0 300.0 100.0 z 0.0 0.0 1000.0 mtk no pchain 0 tchain 1 - -special_bonds charmm - -thermo 50 -thermo_style multi -timestep 2.0 - -run 100 -PPPM initialization ... -WARNING: Using 12-bit tables for long-range coulomb (../kspace.cpp:316) - G vector (1/distance) = 0.248593 - grid = 48 60 36 - stencil order = 5 - estimated absolute RMS force accuracy = 0.0359793 - estimated relative force accuracy = 0.00010835 - using double precision FFTs - 3d grid and FFT values/proc = 41615 25920 -Neighbor list info ... 
- 1 neighbor list requests - update every 1 steps, delay 5 steps, check yes - max neighbors/atom: 2000, page size: 100000 - master list distance cutoff = 12 - ghost atom cutoff = 12 - binsize = 6 -> bins = 19 26 13 -Memory usage per processor = 96.9597 Mbytes ----------------- Step 0 ----- CPU = 0.0000 (sec) ---------------- -TotEng = -101425.4887 KinEng = 85779.3251 Temp = 299.0304 -PotEng = -187204.8138 E_bond = 10151.9760 E_angle = 43685.4968 -E_dihed = 20847.1460 E_impro = 854.0463 E_vdwl = -9231.4537 -E_coul = 827053.5824 E_long = -1080565.6077 Press = -149.0358 -Volume = 1231980.1340 ----------------- Step 50 ----- CPU = 18.1689 (sec) ---------------- -TotEng = -101320.0211 KinEng = 86003.4933 Temp = 299.8118 -PotEng = -187323.5144 E_bond = 9887.1189 E_angle = 43346.8448 -E_dihed = 20958.7108 E_impro = 908.4721 E_vdwl = -7973.4486 -E_coul = 826141.5493 E_long = -1080592.7617 Press = 238.0404 -Volume = 1232126.1814 ----------------- Step 100 ----- CPU = 37.2027 (sec) ---------------- -TotEng = -101157.9546 KinEng = 86355.7413 Temp = 301.0398 -PotEng = -187513.6959 E_bond = 10272.0456 E_angle = 43128.7018 -E_dihed = 20794.0107 E_impro = 867.0928 E_vdwl = -7587.2409 -E_coul = 825584.2416 E_long = -1080572.5474 Press = 15.1729 -Volume = 1232535.8440 -Loop time of 37.2028 on 4 procs for 100 steps with 128000 atoms - -Performance: 0.464 ns/day, 51.671 hours/ns, 2.688 timesteps/s -99.9% CPU use with 4 MPI tasks x no OpenMP threads - -MPI task timing breakdown: -Section | min time | avg time | max time |%varavg| %total ---------------------------------------------------------------- -Pair | 25.431 | 25.738 | 25.984 | 4.0 | 69.18 -Bond | 1.2966 | 1.3131 | 1.3226 | 0.9 | 3.53 -Kspace | 3.7563 | 4.0123 | 4.3127 | 10.0 | 10.79 -Neigh | 4.3778 | 4.378 | 4.3782 | 0.0 | 11.77 -Comm | 0.1903 | 0.19549 | 0.20485 | 1.3 | 0.53 -Output | 0.00031805 | 0.00037521 | 0.00039601 | 0.2 | 0.00 -Modify | 1.4861 | 1.5051 | 1.5122 | 0.9 | 4.05 -Other | | 0.05992 | | | 0.16 - -Nlocal: 
32000 ave 32000 max 32000 min -Histogram: 4 0 0 0 0 0 0 0 0 0 -Nghost: 47957 ave 47957 max 47957 min -Histogram: 4 0 0 0 0 0 0 0 0 0 -Neighs: 1.20281e+07 ave 1.20572e+07 max 1.19991e+07 min -Histogram: 2 0 0 0 0 0 0 0 0 2 - -Total # of neighbors = 48112540 -Ave neighs/atom = 375.879 -Ave special neighs/atom = 7.43187 -Neighbor list builds = 11 -Dangerous builds = 0 -Total wall time: 0:00:38 diff --git a/cmake/CMakeLists.txt b/cmake/CMakeLists.txt index 954bd7b4ac3..da65dbb5ff8 100644 --- a/cmake/CMakeLists.txt +++ b/cmake/CMakeLists.txt @@ -3,9 +3,6 @@ # CMake build system # This file is part of LAMMPS cmake_minimum_required(VERSION 3.16) -if(CMAKE_VERSION VERSION_LESS 3.20) - message(WARNING "LAMMPS is planning to require at least CMake version 3.20 by Summer 2025. Please upgrade!") -endif() ######################################## # initialize version variables with project command if(POLICY CMP0048) @@ -156,9 +153,6 @@ endif() if(CMAKE_CXX_STANDARD LESS 11) message(FATAL_ERROR "C++ standard must be set to at least 11") endif() -if(CMAKE_CXX_STANDARD LESS 17) - message(WARNING "Selecting C++17 standard is preferred over C++${CMAKE_CXX_STANDARD}") -endif() if(PKG_KOKKOS AND (CMAKE_CXX_STANDARD LESS 17)) set(CMAKE_CXX_STANDARD 17) endif() @@ -244,15 +238,6 @@ option(CMAKE_POSITION_INDEPENDENT_CODE "Create object compatible with shared lib option(BUILD_TOOLS "Build and install LAMMPS tools (msi2lmp, binary2txt, chain)" OFF) option(BUILD_LAMMPS_GUI "Build and install the LAMMPS GUI" OFF) -# Support using clang-tidy for C++ files with selected options -set(ENABLE_CLANG_TIDY OFF CACHE BOOL "Include clang-tidy processing when compiling") -if(ENABLE_CLANG_TIDY) - set(CMAKE_CXX_CLANG_TIDY 
"clang-tidy;-checks=-*,performance-trivially-destructible,performance-unnecessary-copy-initialization,performance-unnecessary-value-param,readability-redundant-control-flow,readability-redundant-declaration,readability-redundant-function-ptr-dereference,readability-redundant-member-init,readability-redundant-string-cstr,readability-redundant-string-init,readability-simplify-boolean-expr,readability-static-accessed-through-instance,readability-static-definition-in-anonymous-namespace,readability-qualified-auto,misc-unused-parameters,modernize-deprecated-ios-base-aliases,modernize-loop-convert,modernize-shrink-to-fit,modernize-use-auto,modernize-use-using,modernize-use-override,modernize-use-bool-literals,modernize-use-emplace,modernize-return-braced-init-list,modernize-use-equals-default,modernize-use-equals-delete,modernize-replace-random-shuffle,modernize-deprecated-headers,modernize-use-nullptr,modernize-use-noexcept,modernize-redundant-void-arg;-fix;-header-filter=.*,header-filter=library.h,header-filter=fmt/*.h" CACHE STRING "clang-tidy settings") -else() - unset(CMAKE_CXX_CLANG_TIDY CACHE) -endif() - - file(GLOB ALL_SOURCES CONFIGURE_DEPENDS ${LAMMPS_SOURCE_DIR}/[^.]*.cpp) file(GLOB MAIN_SOURCES CONFIGURE_DEPENDS ${LAMMPS_SOURCE_DIR}/main.cpp) list(REMOVE_ITEM ALL_SOURCES ${MAIN_SOURCES}) @@ -277,6 +262,7 @@ option(CMAKE_VERBOSE_MAKEFILE "Generate verbose Makefiles" OFF) set(STANDARD_PACKAGES ADIOS AMOEBA + APIP ASPHERE ATC AWPMD @@ -369,17 +355,6 @@ foreach(PKG ${STANDARD_PACKAGES} ${SUFFIX_PACKAGES}) option(PKG_${PKG} "Build ${PKG} Package" OFF) endforeach() -set(DEPRECATED_PACKAGES AWPMD ATC POEMS) -foreach(PKG ${DEPRECATED_PACKAGES}) - if(PKG_${PKG}) - message(WARNING - "The ${PKG} package will be removed from LAMMPS in Summer 2025 due to lack of " - "maintenance and use of code constructs that conflict with modern C++ compilers " - "and standards. 
Please contact developers@lammps.org if you have any concerns " - "about this step.") - endif() -endforeach() - ###################################################### # packages with special compiler needs or external libs ###################################################### @@ -476,6 +451,7 @@ pkg_depends(ELECTRODE KSPACE) pkg_depends(EXTRA-MOLECULE MOLECULE) pkg_depends(MESONT MOLECULE) pkg_depends(RHEO BPM) +pkg_depends(APIP ML-PACE) # detect if we may enable OpenMP support by default set(BUILD_OMP_DEFAULT OFF) diff --git a/cmake/Modules/CodeCoverage.cmake b/cmake/Modules/CodeCoverage.cmake index 885b5cba6d3..530a3c6366d 100644 --- a/cmake/Modules/CodeCoverage.cmake +++ b/cmake/Modules/CodeCoverage.cmake @@ -30,7 +30,7 @@ if(ENABLE_COVERAGE) add_custom_target( gen_coverage_html - COMMAND ${GCOVR_BINARY} -s --html --html-details -r ${ABSOLUTE_LAMMPS_SOURCE_DIR} --object-directory=${CMAKE_BINARY_DIR} -o ${COVERAGE_HTML_DIR}/index.html + COMMAND ${GCOVR_BINARY} -s --html --html-nested --html-self-contained -r ${ABSOLUTE_LAMMPS_SOURCE_DIR} --object-directory=${CMAKE_BINARY_DIR} -o ${COVERAGE_HTML_DIR}/index.html WORKING_DIRECTORY ${CMAKE_BINARY_DIR} COMMENT "Generating HTML coverage report..." 
) diff --git a/cmake/Modules/Packages/KOKKOS.cmake b/cmake/Modules/Packages/KOKKOS.cmake index f878db654cc..4bd97e62b1e 100644 --- a/cmake/Modules/Packages/KOKKOS.cmake +++ b/cmake/Modules/Packages/KOKKOS.cmake @@ -57,8 +57,8 @@ if(DOWNLOAD_KOKKOS) list(APPEND KOKKOS_LIB_BUILD_ARGS "-DCMAKE_CXX_EXTENSIONS=${CMAKE_CXX_EXTENSIONS}") list(APPEND KOKKOS_LIB_BUILD_ARGS "-DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE}") include(ExternalProject) - set(KOKKOS_URL "https://github.com/kokkos/kokkos/archive/4.6.00.tar.gz" CACHE STRING "URL for KOKKOS tarball") - set(KOKKOS_MD5 "61b2b69ae50d83eedcc7d47a3fa3d6cb" CACHE STRING "MD5 checksum of KOKKOS tarball") + set(KOKKOS_URL "https://github.com/kokkos/kokkos/archive/4.6.02.tar.gz" CACHE STRING "URL for KOKKOS tarball") + set(KOKKOS_MD5 "14c02fac07bfcec48a1654f88ddee9c6" CACHE STRING "MD5 checksum of KOKKOS tarball") mark_as_advanced(KOKKOS_URL) mark_as_advanced(KOKKOS_MD5) GetFallbackURL(KOKKOS_URL KOKKOS_FALLBACK) @@ -83,7 +83,7 @@ if(DOWNLOAD_KOKKOS) add_dependencies(LAMMPS::KOKKOSCORE kokkos_build) add_dependencies(LAMMPS::KOKKOSCONTAINERS kokkos_build) elseif(EXTERNAL_KOKKOS) - find_package(Kokkos 4.6.00 REQUIRED CONFIG) + find_package(Kokkos 4.6.02 REQUIRED CONFIG) target_link_libraries(lammps PRIVATE Kokkos::kokkos) else() set(LAMMPS_LIB_KOKKOS_SRC_DIR ${LAMMPS_LIB_SOURCE_DIR}/kokkos) diff --git a/cmake/Modules/Packages/MC.cmake b/cmake/Modules/Packages/MC.cmake index 2a72a895cf8..a39a630da3f 100644 --- a/cmake/Modules/Packages/MC.cmake +++ b/cmake/Modules/Packages/MC.cmake @@ -8,6 +8,16 @@ if(NOT PKG_MANYBODY) set_property(TARGET lammps PROPERTY SOURCES "${LAMMPS_SOURCES}") endif() +# fix hmc may only be installed if also fix rigid/small from RIGID is installed +if(NOT PKG_RIGID) + get_property(LAMMPS_FIX_HEADERS GLOBAL PROPERTY FIX) + list(REMOVE_ITEM LAMMPS_FIX_HEADERS ${LAMMPS_SOURCE_DIR}/MC/fix_hmc.h) + set_property(GLOBAL PROPERTY FIX "${LAMMPS_FIX_HEADERS}") + get_target_property(LAMMPS_SOURCES lammps SOURCES) + 
list(REMOVE_ITEM LAMMPS_SOURCES ${LAMMPS_SOURCE_DIR}/MC/fix_hmc.cpp) + set_property(TARGET lammps PROPERTY SOURCES "${LAMMPS_SOURCES}") +endif() + # fix neighbor/swap may only be installed if also the VORONOI package is installed if(NOT PKG_VORONOI) get_property(LAMMPS_FIX_HEADERS GLOBAL PROPERTY FIX) diff --git a/cmake/Modules/Packages/ML-PACE.cmake b/cmake/Modules/Packages/ML-PACE.cmake index b30c61b8e46..7d3d1a452e6 100644 --- a/cmake/Modules/Packages/ML-PACE.cmake +++ b/cmake/Modules/Packages/ML-PACE.cmake @@ -53,7 +53,13 @@ else() add_library(yaml-cpp::yaml-cpp ALIAS yaml-cpp) endif() - add_subdirectory(${lib-pace} build-pace) + # fixup yaml-cpp/emitterutils.cpp for GCC 15+ until patch is applied + file(READ ${lib-pace}/yaml-cpp/src/emitterutils.cpp yaml_emitterutils) + string(REPLACE "#include " "#include \n#include " yaml_tmp_emitterutils "${yaml_emitterutils}") + string(REPLACE "#include \n#include " "#include " yaml_emitterutils "${yaml_tmp_emitterutils}") + file(WRITE ${lib-pace}/yaml-cpp/src/emitterutils.cpp "${yaml_emitterutils}") + + add_subdirectory(${lib-pace} build-pace EXCLUDE_FROM_ALL) set_target_properties(pace PROPERTIES CXX_EXTENSIONS ON OUTPUT_NAME lammps_pace${LAMMPS_MACHINE}) if(CMAKE_PROJECT_NAME STREQUAL "lammps") diff --git a/cmake/packaging/linux_wrapper.sh b/cmake/packaging/linux_wrapper.sh index b777c09eb12..44c9f814273 100755 --- a/cmake/packaging/linux_wrapper.sh +++ b/cmake/packaging/linux_wrapper.sh @@ -7,6 +7,11 @@ export LC_ALL=C BASEDIR="$(dirname "$0")" EXENAME="$(basename "$0")" +# save old settings (for restoring them later) +OLDPATH="${PATH}" +OLDLDLIB="${LD_LIBRARY_PATH}" + +# prepend path to find our custom executables PATH="${BASEDIR}/bin:${PATH}" # append to LD_LIBRARY_PATH to prefer local (newer) libs @@ -15,6 +20,8 @@ LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${BASEDIR}/lib" # set some environment variables for LAMMPS etc. 
LAMMPS_POTENTIALS="${BASEDIR}/share/lammps/potentials" MSI2LMP_LIBRARY="${BASEDIR}/share/lammps/frc_files" -export LD_LIBRARY_PATH LAMMPS_POTENTIALS MSI2LMP_LIBRARY PATH + +# export everything +export LD_LIBRARY_PATH LAMMPS_POTENTIALS MSI2LMP_LIBRARY PATH OLDPATH OLDLDLIB exec "${BASEDIR}/bin/${EXENAME}" "$@" diff --git a/cmake/packaging/xdg-open b/cmake/packaging/xdg-open index d282bb3d11a..298919a44ab 100755 --- a/cmake/packaging/xdg-open +++ b/cmake/packaging/xdg-open @@ -33,6 +33,14 @@ # #--------------------------------------------- +# restore previously saved environment variables, if available +if [ -n "${OLDPATH}" ] +then + PATH="${OLDPATH}" + LD_LIBRARY_PATH="${OLDLDLIB}" + export PATH LD_LIBRARY_PATH +fi + NEW_LIBRARY_PATH="/usr/local/lib64" for s in $(echo $LD_LIBRARY_PATH | sed -e 's/:/ /g') do \ diff --git a/cmake/presets/all_off.cmake b/cmake/presets/all_off.cmake index f2f57824804..9c76e892fe8 100644 --- a/cmake/presets/all_off.cmake +++ b/cmake/presets/all_off.cmake @@ -4,6 +4,7 @@ set(ALL_PACKAGES ADIOS AMOEBA + APIP ASPHERE ATC AWPMD diff --git a/cmake/presets/all_on.cmake b/cmake/presets/all_on.cmake index 8dc4632138f..ba9474840af 100644 --- a/cmake/presets/all_on.cmake +++ b/cmake/presets/all_on.cmake @@ -6,6 +6,7 @@ set(ALL_PACKAGES ADIOS AMOEBA + APIP ASPHERE ATC AWPMD diff --git a/cmake/presets/clang.cmake b/cmake/presets/clang.cmake index f55c5be44a9..a5d0fc98207 100644 --- a/cmake/presets/clang.cmake +++ b/cmake/presets/clang.cmake @@ -14,14 +14,14 @@ endif() set(CMAKE_CXX_COMPILER "clang++" CACHE STRING "" FORCE) set(CMAKE_C_COMPILER "clang" CACHE STRING "" FORCE) set(CMAKE_Fortran_COMPILER ${CLANG_FORTRAN} CACHE STRING "" FORCE) -set(CMAKE_CXX_FLAGS_DEBUG "-Wall -Wextra -g" CACHE STRING "" FORCE) -set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "-Wall -Wextra -g -O2 -DNDEBUG" CACHE STRING "" FORCE) +set(CMAKE_CXX_FLAGS_DEBUG "-Wall -Wextra -Wno-bitwise-instead-of-logical -g" CACHE STRING "" FORCE) +set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "-Wall -Wextra 
-Wno-bitwise-instead-of-logical -g -O2 -DNDEBUG" CACHE STRING "" FORCE) set(CMAKE_CXX_FLAGS_RELEASE "-O3 -DNDEBUG" CACHE STRING "" FORCE) set(CMAKE_Fortran_FLAGS_DEBUG "-Wall -Wextra -g ${FC_STD_VERSION}" CACHE STRING "" FORCE) set(CMAKE_Fortran_FLAGS_RELWITHDEBINFO "-Wall -Wextra -g -O2 -DNDEBUG ${FC_STD_VERSION}" CACHE STRING "" FORCE) set(CMAKE_Fortran_FLAGS_RELEASE "-O3 -DNDEBUG ${FC_STD_VERSION}" CACHE STRING "" FORCE) -set(CMAKE_C_FLAGS_DEBUG "-Wall -Wextra -g" CACHE STRING "" FORCE) -set(CMAKE_C_FLAGS_RELWITHDEBINFO "-Wall -Wextra -g -O2 -DNDEBUG" CACHE STRING "" FORCE) +set(CMAKE_C_FLAGS_DEBUG "-Wall -Wextra -Wno-bitwise-instead-of-logical -g" CACHE STRING "" FORCE) +set(CMAKE_C_FLAGS_RELWITHDEBINFO "-Wall -Wextra -Wno-bitwise-instead-of-logical -g -O2 -DNDEBUG" CACHE STRING "" FORCE) set(CMAKE_C_FLAGS_RELEASE "-O3 -DNDEBUG" CACHE STRING "" FORCE) set(MPI_CXX "clang++" CACHE STRING "" FORCE) diff --git a/cmake/presets/nolib.cmake b/cmake/presets/nolib.cmake index 4a4a5575055..269aed33ed8 100644 --- a/cmake/presets/nolib.cmake +++ b/cmake/presets/nolib.cmake @@ -3,6 +3,7 @@ set(PACKAGES_WITH_LIB ADIOS + APIP ATC AWPMD COMPRESS diff --git a/doc/.gitignore b/doc/.gitignore index 28e583fa0b6..58a6b9129bf 100644 --- a/doc/.gitignore +++ b/doc/.gitignore @@ -17,6 +17,7 @@ *.el /utils/sphinx-config/_static/mathjax /utils/sphinx-config/_static/polyfill.js +/utils/sphinx-config/_themes/lammps_theme/search.html /src/pairs.rst /src/bonds.rst /src/angles.rst diff --git a/doc/Makefile b/doc/Makefile index 92132e7d8cc..58800ec633a 100644 --- a/doc/Makefile +++ b/doc/Makefile @@ -22,6 +22,7 @@ HAS_PYTHON3 = NO HAS_DOXYGEN = NO HAS_PDFLATEX = NO HAS_PANDOC = NO +WEB_SEARCH = NO ifeq ($(shell type python3 >/dev/null 2>&1; echo $$?), 0) HAS_PYTHON3 = YES @@ -95,12 +96,17 @@ globbed-tocs: html: xmlgen globbed-tocs $(VENV) $(SPHINXCONFIG)/conf.py $(ANCHORCHECK) $(MATHJAX) @if [ "$(HAS_BASH)" == "NO" ] ; then echo "bash was not found at $(OSHELL)! 
Please use: $(MAKE) SHELL=/path/to/bash" 1>&2; exit 1; fi + @if [ "$(WEB_SEARCH)" == "YES" ] ; then \ + cp -v utils/sphinx-config/_themes/lammps_theme/google_search.html \ + utils/sphinx-config/_themes/lammps_theme/search.html; \ + else \ + cp -v utils/sphinx-config/_themes/lammps_theme/local_search.html \ + utils/sphinx-config/_themes/lammps_theme/search.html; \ + fi @$(MAKE) $(MFLAGS) -C graphviz all @(\ . $(VENV)/bin/activate ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ sphinx-build -E $(SPHINXEXTRA) -b html -c $(SPHINXCONFIG) -d $(BUILDDIR)/doctrees $(RSTDIR) html ;\ - touch $(RSTDIR)/Fortran.rst ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ - sphinx-build $(SPHINXEXTRA) -b html -c $(SPHINXCONFIG) -d $(BUILDDIR)/doctrees $(RSTDIR) html ;\ ln -sf Manual.html html/index.html;\ rm -f $(BUILDDIR)/doxygen/xml/run.stamp;\ echo "############################################" ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ @@ -162,8 +168,6 @@ epub: xmlgen globbed-tocs $(VENV) $(SPHINXCONFIG)/conf.py $(ANCHORCHECK) @(\ . $(VENV)/bin/activate ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ sphinx-build -E $(SPHINXEXTRA) -b epub -c $(SPHINXCONFIG) -d $(BUILDDIR)/doctrees $(RSTDIR) epub ;\ - touch $(RSTDIR)/Fortran.rst ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ - sphinx-build $(SPHINXEXTRA) -b epub -c $(SPHINXCONFIG) -d $(BUILDDIR)/doctrees $(RSTDIR) epub ;\ rm -f $(BUILDDIR)/doxygen/xml/run.stamp;\ deactivate ;\ ) @@ -183,8 +187,6 @@ pdf: xmlgen globbed-tocs $(VENV) $(SPHINXCONFIG)/conf.py $(ANCHORCHECK) @(\ . 
$(VENV)/bin/activate ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ sphinx-build -E $(SPHINXEXTRA) -b latex -c $(SPHINXCONFIG) -d $(BUILDDIR)/doctrees $(RSTDIR) latex ;\ - touch $(RSTDIR)/Fortran.rst ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ - sphinx-build $(SPHINXEXTRA) -b latex -c $(SPHINXCONFIG) -d $(BUILDDIR)/doctrees $(RSTDIR) latex ;\ rm -f $(BUILDDIR)/doxygen/xml/run.stamp;\ echo "############################################" ; env PYTHONWARNINGS= PYTHONDONTWRITEBYTECODE=1 \ rst_anchor_check src/*.rst ;\ @@ -256,6 +258,15 @@ link_check : $(VENV) html deactivate ;\ ) +upgrade: $(VENV) + @(\ + . $(VENV)/bin/activate; \ + pip $(PIP_OPTIONS) install --upgrade pip; \ + pip $(PIP_OPTIONS) install --upgrade wheel; \ + pip $(PIP_OPTIONS) install --upgrade -r $(BUILDDIR)/utils/requirements.txt; \ + deactivate;\ + ) + xmlgen : doxygen/xml/index.xml doxygen/Doxyfile: doxygen/Doxyfile.in diff --git a/doc/lammps.1 b/doc/lammps.1 index 2901fe96852..f74f57351f5 100644 --- a/doc/lammps.1 +++ b/doc/lammps.1 @@ -1,7 +1,7 @@ -.TH LAMMPS "1" "12 June 2025" "2025-06-12" +.TH LAMMPS "1" "22 July 2025" "2025-07-22" .SH NAME .B LAMMPS -\- Molecular Dynamics Simulator. Version 12 June 2025 +\- Molecular Dynamics Simulator. Version 22 July 2025 .SH SYNOPSIS .B lmp diff --git a/doc/src/Build_development.rst b/doc/src/Build_development.rst index 5c6475c7fa1..6845079f8fe 100644 --- a/doc/src/Build_development.rst +++ b/doc/src/Build_development.rst @@ -28,28 +28,6 @@ variable VERBOSE set to 1: ---------- -.. _clang-tidy: - -Enable static code analysis with clang-tidy (CMake only) --------------------------------------------------------- - -The `clang-tidy tool `_ is a -static code analysis tool to diagnose (and potentially fix) typical -programming errors or coding style violations. It has a modular framework -of tests that can be adjusted to help identifying problems before they -become bugs and also assist in modernizing large code bases (like LAMMPS). 
-It can be enabled for all C++ code with the following CMake flag - -.. code-block:: bash - - -D ENABLE_CLANG_TIDY=value # value = no (default) or yes - -With this flag enabled all source files will be processed twice, first to -be compiled and then to be analyzed. Please note that the analysis can be -significantly more time-consuming than the compilation itself. - ----------- - .. _iwyu_processing: Report missing and unneeded '#include' statements (CMake only) @@ -523,7 +501,7 @@ to do this to install it via pip: .. code-block:: bash - pip install git+https://github.com/gcovr/gcovr.git + python3 -m pip install gcovr After post-processing with ``gen_coverage_html`` the results are in a folder ``coverage_html`` and can be viewed with a web browser. diff --git a/doc/src/Build_extras.rst b/doc/src/Build_extras.rst index 26cf776f4d3..39760b043f1 100644 --- a/doc/src/Build_extras.rst +++ b/doc/src/Build_extras.rst @@ -35,6 +35,7 @@ This is the list of packages that may require additional steps. :columns: 6 * :ref:`ADIOS ` + * :ref:`APIP ` * :ref:`ATC ` * :ref:`AWPMD ` * :ref:`COLVARS ` @@ -614,6 +615,9 @@ They must be specified in uppercase. * - ZEN4 - HOST - AMD Zen4 architecture + * - ZEN5 + - HOST + - AMD Zen5 architecture * - RISCV_SG2042 - HOST - SG2042 (RISC-V) CPUs @@ -668,6 +672,12 @@ They must be specified in uppercase. * - HOPPER90 - GPU - NVIDIA Hopper generation CC 9.0 + * - BLACKWELL100 + - GPU + - NVIDIA Blackwell generation CC 10.0 + * - BLACKWELL120 + - GPU + - NVIDIA Blackwell generation CC 12.0 * - AMD_GFX906 - GPU - AMD GPU MI50/60 @@ -716,8 +726,11 @@ They must be specified in uppercase. * - INTEL_PVC - GPU - Intel GPU Ponte Vecchio + * - INTEL_DG2 + - GPU + - Intel GPU DG2 -This list was last updated for version 4.6.0 of the Kokkos library. +This list was last updated for version 4.6.2 of the Kokkos library. .. tabs:: @@ -1272,6 +1285,34 @@ systems. ---------- +.. 
_apip: + +APIP package +----------------------------- + +The APIP package depends on the library of the +:ref:`ML-PACE ` package. +The code for the library can be found +at: `https://github.com/ICAMS/lammps-user-pace/ `_ + +.. tabs:: + + .. tab:: CMake build + + No additional settings are needed besides ``-D PKG_APIP=yes`` + and ``-D PKG_ML-PACE=yes``. + One can use a local version of the ML-PACE library instead of + automatically downloading the library as described :ref:`here `. + + + .. tab:: Traditional make + + You need to install the ML-PACE package *first* and follow + the instructions :ref:`here ` before installing + the APIP package. + +---------- + .. _atc: ATC package diff --git a/doc/src/Build_manual.rst b/doc/src/Build_manual.rst index 2fc29f584b9..25ca91ad7b2 100644 --- a/doc/src/Build_manual.rst +++ b/doc/src/Build_manual.rst @@ -57,6 +57,8 @@ Python interpreter version 3.8 or later, the ``doxygen`` tools and internet access to download additional files and tools are required. This download is usually only required once or after the documentation folder is returned to a pristine state with ``make clean-all``. +You can also upgrade those packages to their latest available versions +with ``make upgrade``. For the documentation build a python virtual environment is set up in the folder ``doc/docenv`` and various python packages are installed into @@ -82,6 +84,7 @@ folder. 
The following ``make`` commands are available: make clean # remove intermediate RST files created by HTML build make clean-all # remove entire build folder and any cached data + make upgrade # upgrade the python packages in the virtual environment make anchor_check # check for duplicate anchor labels make style_check # check for complete and consistent style lists diff --git a/doc/src/Build_prerequisites.rst b/doc/src/Build_prerequisites.rst index 105de35102a..e5733cf67a4 100644 --- a/doc/src/Build_prerequisites.rst +++ b/doc/src/Build_prerequisites.rst @@ -3,7 +3,7 @@ Prerequisites Which software you need to compile and use LAMMPS strongly depends on which :doc:`features and settings ` and which -:doc:`optional packages ` you are trying to include. +:doc:`optional packages ` you are trying to include. Common to all is that you need a C++ and C compiler, where the C++ compiler has to support at least the C++11 standard (note that some compilers require command-line flag to activate C++11 support). diff --git a/doc/src/Build_settings.rst b/doc/src/Build_settings.rst index 7c164099952..5b6b7f3b743 100644 --- a/doc/src/Build_settings.rst +++ b/doc/src/Build_settings.rst @@ -565,7 +565,7 @@ folder as examples of how those kinds of potential files look like and for use with the provided input examples in the ``examples`` tree. To keep the size of the distributed LAMMPS source package small, very large potential files (> 5 MBytes) are not bundled, but only downloaded on -demand when the :doc:`corresponding package ` is +demand when the :doc:`corresponding package ` is installed. This automatic download can be prevented when :doc:`building LAMMPS with CMake ` by adding the setting `-D DOWNLOAD_POTENTIALS=off` when configuring. diff --git a/doc/src/Commands_fix.rst b/doc/src/Commands_fix.rst index 4bdb3c3bc8c..d008808ea3d 100644 --- a/doc/src/Commands_fix.rst +++ b/doc/src/Commands_fix.rst @@ -22,6 +22,7 @@ OPT. 
* :doc:`append/atoms ` * :doc:`atc ` * :doc:`atom/swap ` + * :doc:`atom_weight/apip ` * :doc:`ave/atom ` * :doc:`ave/chunk ` * :doc:`ave/correlate ` @@ -65,7 +66,7 @@ OPT. * :doc:`electrode/conp (i) ` * :doc:`electrode/conq (i) ` * :doc:`electrode/thermo (i) ` - * :doc:`electron/stopping ` + * :doc:`electron/stopping (k) ` * :doc:`electron/stopping/fit ` * :doc:`enforce2d (k) ` * :doc:`eos/cv ` @@ -86,11 +87,14 @@ OPT. * :doc:`halt ` * :doc:`heat ` * :doc:`heat/flow ` + * :doc:`hmc ` * :doc:`hyper/global ` * :doc:`hyper/local ` * :doc:`imd ` * :doc:`indent ` * :doc:`ipi ` + * :doc:`lambda/apip ` + * :doc:`lambda_thermostat/apip ` * :doc:`langevin (k) ` * :doc:`langevin/drude ` * :doc:`langevin/eff ` @@ -113,6 +117,7 @@ OPT. * :doc:`mvv/tdpd ` * :doc:`neb ` * :doc:`neb/spin ` + * :doc:`neighbor/swap ` * :doc:`nonaffine/displacement ` * :doc:`nph (ko) ` * :doc:`nph/asphere (o) ` diff --git a/doc/src/Commands_pair.rst b/doc/src/Commands_pair.rst index 362bccb9e4e..48acf3b4995 100644 --- a/doc/src/Commands_pair.rst +++ b/doc/src/Commands_pair.rst @@ -96,7 +96,9 @@ OPT. * :doc:`eam/cd ` * :doc:`eam/cd/old ` * :doc:`eam/fs (gikot) ` + * :doc:`eam/fs/apip ` * :doc:`eam/he ` + * :doc:`eam/apip ` * :doc:`edip (o) ` * :doc:`edip/multi ` * :doc:`edpd (g) ` @@ -124,6 +126,9 @@ OPT. * :doc:`ilp/tmd (t) ` * :doc:`kolmogorov/crespi/full ` * :doc:`kolmogorov/crespi/z ` + * :doc:`lambda/input/apip ` + * :doc:`lambda/input/csp/apip ` + * :doc:`lambda/zone/apip ` * :doc:`lcbop ` * :doc:`lebedeva/z ` * :doc:`lennard/mdf ` @@ -237,6 +242,9 @@ OPT. 
* :doc:`oxrna2/coaxstk ` * :doc:`pace (k) ` * :doc:`pace/extrapolation (k) ` + * :doc:`pace/apip ` + * :doc:`pace/fast/apip ` + * :doc:`pace/precise/apip ` * :doc:`pedone (o) ` * :doc:`pod (k) ` * :doc:`peri/eps ` diff --git a/doc/src/Developer_updating.rst b/doc/src/Developer_updating.rst index d5700959870..6524347810a 100644 --- a/doc/src/Developer_updating.rst +++ b/doc/src/Developer_updating.rst @@ -30,6 +30,7 @@ Available topics in mostly chronological order are: - `Use Output::get_dump_by_id() instead of Output::find_dump()`_ - `Refactored grid communication using Grid3d/Grid2d classes instead of GridComm`_ - `FLERR as first argument to minimum image functions in Domain class`_ +- `Use utils::logmesg() instead of error->message()`_ ---- @@ -655,3 +656,27 @@ New: double r2 = sqrt(delx2 * delx2 + dely2 * dely2 + delz2 * delz2); This change is **required** or else the code will not compile. + +Use utils::logmesg() instead of error->message() +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. versionchanged:: 22Jul2025 + +The ``Error::message()`` method has been removed since its functionality +has been superseded by the :cpp:func:`utils::logmesg` function. + +Old: + +.. code-block:: c++ + + if (comm->me == 0) { + error->message(FLERR, "INFO: About to read data file: {}", filename); + } + +New: + +.. code-block:: c++ + + if (comm->me == 0) utils::logmesg(lmp, "INFO: About to read data file: {}\n", filename); + +This change is **required** or else the code will not compile. diff --git a/doc/src/Errors_common.rst b/doc/src/Errors_common.rst index 3229181d001..23c450c4fce 100644 --- a/doc/src/Errors_common.rst +++ b/doc/src/Errors_common.rst @@ -1,124 +1,149 @@ -Common problems -=============== - -If two LAMMPS runs do not produce the exact same answer on different -machines or different numbers of processors, this is typically not a -bug. In theory you should get identical answers on any number of -processors and on any machine.
In practice, numerical round-off can -cause slight differences and eventual divergence of molecular dynamics -phase space trajectories within a few 100s or few 1000s of timesteps. -However, the statistical properties of the two runs (e.g. average -energy or temperature) should still be the same. - -If the :doc:`velocity ` command is used to set initial atom -velocities, a particular atom can be assigned a different velocity -when the problem is run on a different number of processors or on -different machines. If this happens, the phase space trajectories of -the two simulations will rapidly diverge. See the discussion of the -*loop* option in the :doc:`velocity ` command for details and -options that avoid this issue. - -Similarly, the :doc:`create_atoms ` command generates a -lattice of atoms. For the same physical system, the ordering and -numbering of atoms by atom ID may be different depending on the number -of processors. - -Some commands use random number generators which may be setup to -produce different random number streams on each processor and hence -will produce different effects when run on different numbers of -processors. A commonly-used example is the :doc:`fix langevin ` command for thermostatting. - -A LAMMPS simulation typically has two stages, setup and run. Most -LAMMPS errors are detected at setup time; others like a bond -stretching too far may not occur until the middle of a run. - -LAMMPS tries to flag errors and print informative error messages so -you can fix the problem. For most errors it will also print the last -input script command that it was processing. Of course, LAMMPS cannot -figure out your physics or numerical mistakes, like choosing too big a -timestep, specifying erroneous force field coefficients, or putting 2 -atoms on top of each other! 
If you run into errors that LAMMPS -does not catch that you think it should flag, please send an email to -the `developers `_ or create an new -topic on the dedicated `MatSci forum section `_. - -If you get an error message about an invalid command in your input -script, you can determine what command is causing the problem by -looking in the log.lammps file or using the :doc:`echo command ` -to see it on the screen. If you get an error like "Invalid ... -style", with ... being fix, compute, pair, etc, it means that you -mistyped the style name or that the command is part of an optional -package which was not compiled into your executable. The list of -available styles in your executable can be listed by using -:doc:`the -h command-line switch `. The installation and -compilation of optional packages is explained on the -:doc:`Build packages ` doc page. - -For a given command, LAMMPS expects certain arguments in a specified -order. If you mess this up, LAMMPS will often flag the error, but it -may also simply read a bogus argument and assign a value that is -valid, but not what you wanted. E.g. trying to read the string "abc" -as an integer value of 0. Careful reading of the associated doc page -for the command should allow you to fix these problems. In most cases, -where LAMMPS expects to read a number, either integer or floating point, -it performs a stringent test on whether the provided input actually -is an integer or floating-point number, respectively, and reject the -input with an error message (for instance, when an integer is required, -but a floating-point number 1.0 is provided): - -.. parsed-literal:: - - ERROR: Expected integer parameter instead of '1.0' in input script or data file - -Some commands allow for using variable references in place of numeric -constants so that the value can be evaluated and may change over the -course of a run. This is typically done with the syntax *v_name* for a -parameter, where name is the name of the variable. 
On the other hand, -immediate variable expansion with the syntax ${name} is performed while -reading the input and before parsing commands, - -.. note:: - - Using a variable reference (i.e. *v_name*) is only allowed if - the documentation of the corresponding command explicitly says it is. - Otherwise, you will receive an error message of this kind: - -.. parsed-literal:: - - ERROR: Expected floating point parameter instead of 'v_name' in input script or data file - -Generally, LAMMPS will print a message to the screen and logfile and -exit gracefully when it encounters a fatal error. Sometimes it will -print a WARNING to the screen and logfile and continue on; you can -decide if the WARNING is important or not. A WARNING message that is -generated in the middle of a run is only printed to the screen, not to -the logfile, to avoid cluttering up thermodynamic output. If LAMMPS -crashes or hangs without spitting out an error message first then it -could be a bug (see :doc:`this section `) or one of the following -cases: - -LAMMPS runs in the available memory a processor allows to be -allocated. Most reasonable MD runs are compute limited, not memory -limited, so this should not be a bottleneck on most platforms. Almost -all large memory allocations in the code are done via C-style malloc's -which will generate an error message if you run out of memory. -Smaller chunks of memory are allocated via C++ "new" statements. If -you are unlucky you could run out of memory just when one of these -small requests is made, in which case the code will crash or hang (in -parallel), since LAMMPS does not trap on those errors. - -Illegal arithmetic can cause LAMMPS to run slow or crash. This is -typically due to invalid physics and numerics that your simulation is -computing. If you see wild thermodynamic values or NaN values in your -LAMMPS output, something is wrong with your simulation. 
If you -suspect this is happening, it is a good idea to print out -thermodynamic info frequently (e.g. every timestep) via the -:doc:`thermo ` so you can monitor what is happening. -Visualizing the atom movement is also a good idea to ensure your model -is behaving as you expect. - -In parallel, one way LAMMPS can hang is due to how different MPI -implementations handle buffering of messages. If the code hangs -without an error message, it may be that you need to specify an MPI -setting or two (usually via an environment variable) to enable -buffering or boost the sizes of messages that can be buffered. +Common issues that are often regarded as bugs +============================================= + +The list below contains some random notes on behavior of LAMMPS that is +sometimes unexpected or even considered a bug. Most of the time, these +are just issues of understanding how LAMMPS is implemented and +parallelized. Please also have a look at the :doc:`Error details +discussions page ` that contains recommendations for +tracking down issues and explanations for error messages that may +sometimes be confusing or need additional explanations. + +- A LAMMPS simulation typically has two stages, 1) issuing commands + and 2) run or minimize. Most LAMMPS errors are detected in stage 1), + others at the beginning of stage 2), and finally others like a bond + stretching too far or lost atoms or bonds may not occur until the + middle of a run. + +- If two LAMMPS runs do not produce the exact same answer on different + machines or different numbers of processors, this is typically not a + bug. In theory you should get identical answers on any number of + processors and on any machine. In practice, numerical round-off can + cause slight differences and eventual divergence of molecular dynamics + phase space trajectories within a few 100s or few 1000s of timesteps.
+ This can be triggered by different ordering of atoms due to different + domain decompositions, but also through different CPU architectures, + different operating systems, different compilers or compiler versions, + different compiler optimization levels, different FFT libraries. + However, the statistical properties of the two runs (e.g. average + energy or temperature) should still be the same. + +- If the :doc:`velocity ` command is used to set initial atom + velocities, a particular atom can be assigned a different velocity + when the problem is run on a different number of processors or on + different machines. If this happens, the phase space trajectories of + the two simulations will rapidly diverge. See the discussion of the + *loop* option in the :doc:`velocity ` command for details + and options that avoid this issue. + +- Similarly, the :doc:`create_atoms ` command generates a + lattice of atoms. For the same physical system, the ordering and + numbering of atoms by atom ID may be different depending on the number + of processors. + +- Some commands use random number generators which may be setup to + produce different random number streams on each processor and hence + will produce different effects when run on different numbers of + processors. A commonly-used example is the :doc:`fix langevin + ` command for thermostatting. + +- LAMMPS tries to flag errors and print informative error messages so + you can fix the problem. For most errors it will also print the last + input script command that it was processing or even point to the + keyword that is causing troubles. Of course, LAMMPS cannot figure out + your physics or numerical mistakes, like choosing too big a timestep, + specifying erroneous force field coefficients, or putting 2 atoms on + top of each other! Also, LAMMPS does not know what you *intend* to + do, but very strictly applies the syntax as described in the + documentation. 
If you run into errors that LAMMPS does not catch that + you think it should flag, please send an email to the `developers + `_ or create a new topic on the + dedicated `MatSci forum section `_. + +- If you get an error message about an invalid command in your input + script, you can determine what command is causing the problem by + looking in the log.lammps file or using the :doc:`echo command ` + to see it on the screen. If you get an error like "Invalid ... + style", with ... being fix, compute, pair, etc, it means that you + mistyped the style name or that the command is part of an optional + package which was not compiled into your executable. The list of + available styles in your executable can be listed by using + :doc:`the -h command-line switch `. The installation and + compilation of optional packages is explained on the :doc:`Build + packages ` doc page. + +- For a given command, LAMMPS expects certain arguments in a specified + order. If you mess this up, LAMMPS will often flag the error, but it + may also simply read a bogus argument and assign a value that is + valid, but not what you wanted. E.g. trying to read the string "abc" + as an integer value of 0. Careful reading of the associated doc page + for the command should allow you to fix these problems. In most cases, + where LAMMPS expects to read a number, either integer or floating + point, it performs a stringent test on whether the provided input + actually is an integer or floating-point number, respectively, and + rejects the input with an error message (for instance, when an integer + is required, but a floating-point number 1.0 is provided): + + .. parsed-literal:: + + ERROR: Expected integer parameter instead of '1.0' in input script or data file + +- Some commands allow for using variable references in place of numeric + constants so that the value can be evaluated and may change over the + course of a run.
This is typically done with the syntax *v_name* for + a parameter, where name is the name of the variable. On the other + hand, immediate variable expansion with the syntax ${name} is + performed while reading the input and before parsing commands. + + .. note:: + + Using a variable reference (i.e. *v_name*) is only allowed if + the documentation of the corresponding command explicitly says it is. + Otherwise, you will receive an error message of this kind: + + .. parsed-literal:: + + ERROR: Expected floating point parameter instead of 'v_name' in input script or data file + +- Generally, LAMMPS will print a message to the screen and logfile and + exit gracefully when it encounters a fatal error. When running in + parallel this message may be stuck in an I/O buffer and LAMMPS will be + terminated before that buffer is printed. In that case you can try + adding the ``-nonblock`` or ``-nb`` command-line flag to turn off that + buffering. Please note that this should not be used for production + runs, since turning off buffering usually has a significant negative + impact on performance (even worse than :doc:`thermo_modify flush yes + `). Sometimes LAMMPS will print a WARNING to the + screen and logfile and continue on; you can decide if the WARNING is + important or not, but as a general rule do not ignore warnings that + you do not understand. A WARNING message that is generated in the middle + of a run is only printed to the screen, not to the logfile, to avoid + cluttering up thermodynamic output. If LAMMPS crashes or hangs + without generating an error message first then it could be a bug + (see :doc:`this section `). + +- LAMMPS runs in the available memory a processor allows to be + allocated. Most reasonable MD runs are compute limited, not memory + limited, so this should not be a bottleneck on most platforms. Almost + all large memory allocations in the code are done via C-style malloc's + which will generate an error message if you run out of memory.
+ Smaller chunks of memory are allocated via C++ "new" statements. If + you are unlucky you could run out of memory just when one of these + small requests is made, in which case the code will crash or hang (in + parallel). + +- Illegal arithmetic can cause LAMMPS to run slow or crash. This is + typically due to invalid physics and numerics that your simulation is + computing. If you see wild thermodynamic values or NaN values in your + LAMMPS output, something is wrong with your simulation. If you + suspect this is happening, it is a good idea to print out + thermodynamic info frequently (e.g. every timestep) via the + :doc:`thermo ` so you can monitor what is happening. + Visualizing the atom movement is also a good idea to ensure your model + is behaving as you expect. + +- When running in parallel with MPI, one way LAMMPS can hang is because + LAMMPS has come across an error condition, but only on one or a few + MPI processes and not all of them. LAMMPS has two different "stop + with an error message" functions and the correct one has to be called + or else it will hang. diff --git a/doc/src/Errors_details.rst b/doc/src/Errors_details.rst index 4b510f4902b..db75259c7ba 100644 --- a/doc/src/Errors_details.rst +++ b/doc/src/Errors_details.rst @@ -51,8 +51,11 @@ Parallel versus serial ^^^^^^^^^^^^^^^^^^^^^^ Issues where something is "lost" or "missing" often exhibit that issue -only when running in parallel. That doesn't mean there is no problem, -only the symptoms are not triggering an error quickly. Correspondingly, +*only* when running in parallel. That doesn't mean there is no problem +when running in serial, only the symptoms are not triggering an error. +This may be because there is no domain decomposition with just one +processor and thus all atoms are accessible, or it may be because the +problem will manifest faster with smaller subdomains. Correspondingly, errors may be triggered faster with more processors and thus smaller sub-domains. 
@@ -244,6 +247,25 @@ equal style (or similar) variables can only be expanded before the box is defined if they do not reference anything that cannot be defined before the box (e.g. a compute or fix reference or a thermo keyword). +.. _hint13: + +Illegal ... command +^^^^^^^^^^^^^^^^^^^ + +These are catchall error messages that used to be used a lot in LAMMPS +(also programmers are sometimes lazy). They usually include the name of +the source file and the line where the error happened. This can be used +to track down what caused the error (most often some form of syntax error) +by looking at the source code. However, this has a drawback: one +has to check the source file from the exact same LAMMPS version, or else +the line number may be different or the code may have been rewritten and +that specific error may not exist anymore. + +The LAMMPS developers are committed to replacing these overly generic error +messages with more descriptive errors, e.g. listing *which* keyword was +causing the error, so that it will be much simpler to look up the +correct syntax in the manual (and without referring to the source code). + ------ .. _err0001: @@ -1029,13 +1051,15 @@ Even though the LAMMPS error message recommends to increase the "one" parameter, this may not always be the correct solution. The neighbor list overflow can also be a symptom for some other error that cannot be easily detected. For example, a frequent reason for an (unexpected) -high density are incorrect box boundaries (since LAMMPS wraps atoms back +high density are incorrect box dimensions (since LAMMPS wraps atoms back into the principal box with periodic boundaries) or coordinates provided -as fractional coordinates. In both cases, LAMMPS cannot easily know -whether the input geometry has such a high density (and thus requiring -more neighbor list storage per atom) by intention.
Rather than blindly -increasing the "one" parameter, it is thus worth checking if this is -justified by the combination of density and cutoff. +as fractional coordinates (LAMMPS does not support this for data files). +In both cases, LAMMPS cannot easily know whether the input geometry has +such a high density (and thus requiring more neighbor list storage per +atom) on purpose or by accident. Rather than blindly increasing the +"one" parameter, it is thus worth checking if this is justified by the +combination of density and cutoff. This is particularly recommended +when using some tool(s) to convert input or data files. When boosting (= increasing) the "one" parameter, it is recommended to also increase the value for the "page" parameter to maintain the ratio diff --git a/doc/src/Fortran.rst b/doc/src/Fortran.rst index 3d614730680..ddd1c9f9da7 100644 --- a/doc/src/Fortran.rst +++ b/doc/src/Fortran.rst @@ -2099,7 +2099,7 @@ Procedures Bound to the :f:type:`lammps` Derived Type -------- -.. f:subroutine:: create_atoms([id,] type, x, [v,] [image,] [bexpand]) +.. f:function:: create_atoms([id,] type, x, [v,] [image,] [bexpand]) This method calls :cpp:func:`lammps_create_atoms` to create additional atoms from a given list of coordinates and a list of atom types. Additionally, @@ -2128,6 +2128,8 @@ Procedures Bound to the :f:type:`lammps` Derived Type will be created, not dropped, and the box dimensions will be extended. Default is ``.FALSE.`` :otype bexpand: logical,optional + :r atoms: number of created atoms + :rtype atoms: integer(c_int) :to: :cpp:func:`lammps_create_atoms` .. note:: @@ -2152,6 +2154,18 @@ Procedures Bound to the :f:type:`lammps` Derived Type -------- +.. f:subroutine:: create_molecule(id, jsonstr) + + Add molecule template from string with JSON data + + .. 
versionadded:: 22Jul2025 + + :p character(len=\*) id: desired molecule-ID + :p character(len=\*) jsonstr: string with JSON data defining the molecule template + :to: :cpp:func:`lammps_create_molecule` + +-------- + .. f:function:: find_pair_neighlist(style[, exact][, nsub][, reqid]) Find index of a neighbor list requested by a pair style. diff --git a/doc/src/Howto.rst b/doc/src/Howto.rst index cdc4efd7373..5b7d991e6bd 100644 --- a/doc/src/Howto.rst +++ b/doc/src/Howto.rst @@ -93,6 +93,7 @@ Packages howto Howto_manifold Howto_rheo Howto_spins + Howto_apip Tutorials howto =============== diff --git a/doc/src/Howto_apip.rst b/doc/src/Howto_apip.rst new file mode 100644 index 00000000000..7f47c7cf259 --- /dev/null +++ b/doc/src/Howto_apip.rst @@ -0,0 +1,225 @@ +Adaptive-precision interatomic potentials (APIP) +================================================ + +The :ref:`PKG-APIP ` enables use of adaptive-precision potentials +as described in :ref:`(Immel) `. +In the context of this package, precision refers to the accuracy of an interatomic +potential. + +Modern machine-learning (ML) potentials translate the accuracy of DFT +simulations into MD simulations, i.e., ML potentials are more accurate +compared to traditional empirical potentials. +However, this accuracy comes at a cost: there is a considerable performance +gap between the evaluation of classical and ML potentials, e.g., the force +calculation of a classical EAM potential is 100-1000 times faster compared +to the ML-based ACE method. +The evaluation time difference results in a conflict between large time and +length scales on the one hand and accuracy on the other. +This conflict is resolved by an APIP model for simulations, in which the highest precision +is required only locally but not globally. + +An APIP model uses a precise but +expensive ML potential only for a subset of atoms, while a fast +potential is used for the remaining atoms. 
+Whether the precise or the fast potential is used is determined +by a continuous switching parameter :math:`\lambda_i` that can be defined for each +atom :math:`i`. +The switching parameter can be adjusted dynamically during a simulation or +kept constant as explained below. + +The potential energy :math:`E_i` of an atom :math:`i` described by an +adaptive-precision +interatomic potential is given by :ref:`(Immel) ` + +.. math:: + + E_i = \lambda_i E_i^\text{(fast)} + (1-\lambda_i) E_i^\text{(precise)}, + +whereas :math:`E_i^\text{(fast)}` is the potential energy of atom :math:`i` +according to a fast interatomic potential, +:math:`E_i^\text{(precise)}` is the potential energy according to a precise +interatomic potential and :math:`\lambda_i\in[0,1]` is the +switching parameter that decides how the potential energies are weighted. + +Adaptive-precision saves computation time when the computation of the +precise potential is not required for many atoms, i.e., when +:math:`\lambda_i=1` applies for many atoms. + +The currently implemented potentials are: + +.. list-table:: + :header-rows: 1 + + * - Fast potential + - Precise potential + * - :doc:`ACE ` + - :doc:`ACE ` + * - :doc:`EAM ` + - + +In theory, any short-range potential can be used for an adaptive-precision +interatomic potential. How to implement a new (fast or precise) +adaptive-precision +potential is explained in :ref:`here `. + +The switching parameter :math:`\lambda_i` that combines the two potentials +can be dynamically calculated during a +simulation. +Alternatively, one can set a constant switching parameter before the start +of a simulation. +To run a simulation with an adaptive-precision potential, one needs the +following components: + +.. tabs:: + + .. tab:: dynamic switching parameter + + #. :doc:`atom_style apip ` so that the switching parameter :math:`\lambda_i` can be stored. + #. A fast potential: :doc:`eam/apip ` or :doc:`pace/fast/apip `. + #. 
A precise potential: :doc:`pace/precise/apip `. + #. :doc:`pair_style lambda/input/apip ` to calculate :math:`\lambda_i^\text{input}`, from which :math:`\lambda_i` is calculated. + #. :doc:`fix lambda/apip ` to calculate the switching parameter :math:`\lambda_i`. + #. :doc:`pair_style lambda/zone/apip ` to calculate the spatial transition zone of the switching parameter. + #. :doc:`pair_style hybrid/overlay ` to combine the previously mentioned pair_styles. + #. :doc:`fix lambda_thermostat/apip ` to conserve the energy when switching parameters change. + #. :doc:`fix atom_weight/apip ` to approximate the load caused by every atom, as the computations of the pair_styles are only required for a subset of atoms. + #. :doc:`fix balance ` to perform dynamic load balancing with the calculated load. + + .. tab:: constant switching parameter + + #. :doc:`atom_style apip ` so that the switching parameter :math:`\lambda_i` can be stored. + #. A fast potential: :doc:`eam/apip ` or :doc:`pace/fast/apip `. + #. A precise potential: :doc:`pace/precise/apip `. + #. :doc:`set ` command to set the switching parameter :math:`\lambda_i`. + #. :doc:`pair_style hybrid/overlay ` to combine the previously mentioned pair_styles. + #. :doc:`fix atom_weight/apip ` to approximate the load caused by every atom, as the computations of the pair_styles are only required for a subset of atoms. + #. :doc:`fix balance ` to perform dynamic load balancing with the calculated load. + +---------- + +Example +""""""" +.. note:: + + How to select the values of the parameters of an adaptive-precision + interatomic potential is discussed in detail in :ref:`(Immel) `. + + +.. tabs:: + + .. tab:: dynamic switching parameter + + Lines like these would appear in the input script: + + + .. 
code-block:: LAMMPS + + atom_style apip + comm_style tiled + + pair_style hybrid/overlay eam/fs/apip pace/precise/apip lambda/input/csp/apip fcc cutoff 5.0 lambda/zone/apip 12.0 + pair_coeff * * eam/fs/apip Cu.eam.fs Cu + pair_coeff * * pace/precise/apip Cu.yace Cu + pair_coeff * * lambda/input/csp/apip + pair_coeff * * lambda/zone/apip + + fix 2 all lambda/apip 2.5 3.0 time_averaged_zone 4.0 12.0 110 110 min_delta_lambda 0.01 + fix 3 all lambda_thermostat/apip N_rescaling 200 + fix 4 all atom_weight/apip 100 eam ace lambda/input lambda/zone all + + variable myweight atom f_4 + + fix 5 all balance 100 1.1 rcb weight var myweight + + First, the :doc:`atom_style apip ` and the communication style are set. + + .. note:: + Note, that :doc:`comm_style ` *tiled* is required for the style *rcb* of + :doc:`fix balance `, but not for APIP. + However, the flexibility offered by the balancing style *rcb*, compared to the + balancing style *shift*, is advantageous for APIP. + + An adaptive-precision EAM-ACE potential, for which the switching parameter + :math:`\lambda` is calculated from the CSP, is defined via + :doc:`pair_style hybrid/overlay `. + The fixes ensure that the switching parameter is calculated, the energy conserved, + the weight for the load balancing calculated and the load-balancing itself is done. + + .. tab:: constant switching parameter + + Lines like these would appear in the input script: + + .. code-block:: LAMMPS + + atom_style apip + comm_style tiled + + pair_style hybrid/overlay eam/fs/apip pace/precise/apip + pair_coeff * * eam/fs/apip Cu.eam.fs Cu + pair_coeff * * pace/precise/apip Cu.yace Cu + + # calculate lambda somehow + variable lambda atom ... + set group all apip/lambda v_lambda + + fix 4 all atom_weight/apip 100 eam ace lambda/input lambda/zone all + + variable myweight atom f_4 + + fix 5 all balance 100 1.1 rcb weight var myweight + + First, the :doc:`atom_style apip ` and the communication style are set. + + .. 
note:: + Note that :doc:`comm_style ` *tiled* is required for the style *rcb* of + :doc:`fix balance `, but not for APIP. + However, the flexibility offered by the balancing style *rcb*, compared to the + balancing style *shift*, is advantageous for APIP. + + An adaptive-precision EAM-ACE potential is defined via + :doc:`pair_style hybrid/overlay `. + The switching parameter :math:`\lambda_i` of the adaptive-precision + EAM-ACE potential is set via the :doc:`set command `. + The parameter is not updated during the simulation. + Therefore, the potential is conservative. + The fixes ensure that the weight for the load balancing is calculated + and the load-balancing itself is done. + +---------- + +.. _implementing_new_apip_styles: + +Implementing new APIP pair styles +""""""""""""""""""""""""""""""""" + +One can introduce adaptive-precision to an existing pair style by modifying +the original pair style. +One should calculate the force +:math:`F_i = - \nabla_i \sum_j \lambda_j E_j^\text{original}` for a fast potential or +:math:`F_i = - \nabla_i \sum_j (1-\lambda_j) E_j^\text{original}` for a precise +potential from the original potential +energy :math:`E_j^\text{original}` to see where the switching parameter +:math:`\lambda_i` needs to be introduced in the force calculation. +The switching parameter :math:`\lambda_i` is known for all atoms :math:`i` +in the force calculation routine. +One needs to introduce an early-exit criterion based on :math:`\lambda_i` to +ensure that all calculations that are not required are skipped and compute time can +be saved. +Furthermore, one needs to provide the number of calculations and measure the +computation time. +Communication within the force calculation needs to be prevented to allow +effective load-balancing. +With communication, the load balancer cannot balance few calculations of the +precise potential on one processor with many computations of the fast +potential on another processor.
+ +All changes in the pair_style pace/apip compared to the pair_style pace +are annotated and commented. +Thus, the pair_style pace/apip can serve as an example for the implementation +of new adaptive-precision potentials. + +---------- + +.. _Immel2025_1: + +**(Immel)** Immel, Drautz and Sutmann, J Chem Phys, 162, 114119 (2025) diff --git a/doc/src/Howto_lammps_gui.rst b/doc/src/Howto_lammps_gui.rst index f6cfdefc81e..079a48c73f6 100644 --- a/doc/src/Howto_lammps_gui.rst +++ b/doc/src/Howto_lammps_gui.rst @@ -1,10 +1,15 @@ Using LAMMPS-GUI ================ +.. image:: JPG/lammps-gui-banner.png + :align: center + :scale: 75% + LAMMPS-GUI is a graphical text editor programmed using the `Qt Framework -`_ and customized for editing LAMMPS input files. It -is linked to the :ref:`LAMMPS library ` and thus can run -LAMMPS directly using the contents of the editor's text buffer as input. +`_ and customized for editing and running LAMMPS +input files. It is linked to the :ref:`LAMMPS library ` +and thus can run LAMMPS directly using the contents of the editor's text +buffer as input and without having to launch the LAMMPS executable. It *differs* from other known interfaces to LAMMPS in that it can retrieve and display information from LAMMPS *while it is running*, @@ -13,7 +18,7 @@ display visualizations created with the :doc:`dump image command LAMMPS commands and styles, and directly integrates with a collection of LAMMPS tutorials (:ref:`Gravelle1 `). -This document describes **LAMMPS-GUI version 1.6**. +This document describes **LAMMPS-GUI version 1.7**. ----- @@ -21,17 +26,20 @@ This document describes **LAMMPS-GUI version 1.6**. 
---- -LAMMPS-GUI tries to provide an experience similar to what people -traditionally would have running LAMMPS using a command-line window and -the console LAMMPS executable but just rolled into a single executable: +LAMMPS-GUI aims to provide the traditional experience of running LAMMPS +using a text editor, a command-line window, and launching the LAMMPS +text-mode executable printing output to the screen, but just integrated +into a single application: -- writing & editing LAMMPS input files with a text editor -- run LAMMPS on those input file with selected command-line flags -- extract data from the created files and visualize it with and - external software +- Write and edit LAMMPS input files using the built-in text editor. +- Run LAMMPS on those input files with command-line flags to enable a + specific accelerator package (or none). +- Extract data from the created files (like trajectory files, log files + with thermodynamic data, or images) and visualize it using external + software. -That procedure is quite effective for people proficient in using the -command-line, as that allows them to use tools for the individual steps +That traditional procedure is effective for people proficient in using the +command line, as it allows them to use the tools for the individual steps that they are most comfortable with. In fact, it is often *required* to adopt this workflow when running LAMMPS simulations on high-performance computing facilities. @@ -42,35 +50,46 @@ window or using external programs, let alone writing scripts to extract data from the generated output. It also integrates well with graphical desktop environments where the `.lmp` filename extension can be registered with LAMMPS-GUI as the executable to launch when double -clicking on such files. Also, LAMMPS-GUI has support for drag-n-drop, -i.e. an input file can be selected and then moved and dropped on the -LAMMPS-GUI executable, and LAMMPS-GUI will launch and read the file into -its buffer.
In many cases LAMMPS-GUI will be integrated into the -graphical desktop environment and can be launched like other -applications. +clicking on such files using a file manager. LAMMPS-GUI also has +support for 'drag and drop' for opening inputs: an input file can +be selected and then moved and dropped on the LAMMPS-GUI executable; +LAMMPS-GUI will launch and read the file into its buffer. Input files +can also be dropped into the editor window of the running LAMMPS-GUI +application, which will close the current file and open the new file. +In many cases LAMMPS-GUI will be integrated into the graphical desktop +environment and can be launched just like any other application from +the graphical interface. LAMMPS-GUI thus makes it easier for beginners to get started running -simple LAMMPS simulations. It is very suitable for tutorials on LAMMPS -since you only need to learn how to use a single program for most tasks -and thus time can be saved and people can focus on learning LAMMPS. +LAMMPS and is well-suited for LAMMPS tutorials, since you only need to +work with a single, ready-to-use program for most of the tasks. Plus it +is available for download as a pre-compiled package for popular operating +systems (Linux, macOS, Windows). This saves time and allows users to +focus on learning LAMMPS itself, without the need to learn how to +compile LAMMPS, learn how to use the command line, or learn how to use a +separate text editor. + The tutorials at https://lammpstutorials.github.io/ are specifically -updated for use with LAMMPS-GUI and their tutorial materials can -be downloaded and edited directly from the GUI. +updated for use with LAMMPS-GUI and their tutorial materials can be +downloaded and edited directly from within the GUI while automatically +loading the matching tutorial instructions into a web browser. -Another design goal is to keep the barrier low when replacing part of -the functionality of LAMMPS-GUI with external tools.
That said, LAMMPS-GUI -has some unique functionality that is not found elsewhere: +Yet the basic control flow remains similar to running LAMMPS from the +command line, so the barrier for replacing parts of the functionality of +LAMMPS-GUI with external tools is low. That said, LAMMPS-GUI offers some +unique features that are not easily found elsewhere: - auto-adapting to features available in the integrated LAMMPS library -- auto-completion for LAMMPS commands and options -- context-sensitive online help +- auto-completion for available LAMMPS commands and options only +- context-sensitive online help for known LAMMPS commands - start and stop of simulations via mouse or keyboard -- monitoring of simulation progress -- interactive visualization using the :doc:`dump image ` +- monitoring of simulation progress and CPU use +- interactive visualization using the LAMMPS :doc:`dump image feature ` command with the option to copy-paste the resulting settings - automatic slide show generation from dump image output at runtime - automatic plotting of thermodynamic data at runtime - inspection of binary restart files +- integration with a set of LAMMPS tutorials .. admonition:: Download LAMMPS-GUI for your platform :class: Hint @@ -93,7 +112,7 @@ has some unique functionality that is not found elsewhere: ----- -The following text provides a detailed tour of the features and +The following text provides documentation of the features and functionality of LAMMPS-GUI. Suggestions for new features and reports of bugs are always welcome. You can use the :doc:`the same channels as for LAMMPS itself ` for that purpose. @@ -230,8 +249,8 @@ editor buffer, which may contain multiple :doc:`run ` or LAMMPS runs in a separate thread, so the GUI stays responsive and is able to interact with the running calculation and access data it -produces. It is important to note that running LAMMPS this way is -using the contents of the input buffer for the run (via the +produces.
It is important to note that running LAMMPS this way is using +the contents of the input buffer for the run (via the :cpp:func:`lammps_commands_string()` function of the LAMMPS C-library interface), and **not** the original file it was read from. Thus, if there are unsaved changes in the buffer, they *will* be used. As an @@ -240,28 +259,55 @@ of a file from the *Run LAMMPS from File* menu entry or with `Ctrl-Shift-Enter`. This option may be required in some rare cases where the input uses some functionality that is not compatible with running LAMMPS from a string buffer. For consistency, any unsaved -changes in the buffer must be either saved to the file or undone -before LAMMPS can be run from a file. +changes in the buffer must be either saved to the file or undone before +LAMMPS can be run from a file. + +The line number of the currently executed command is highlighted in +green in the line number display for the *Editor* Window. .. image:: JPG/lammps-gui-running.png :align: center :scale: 75% -While LAMMPS is running, the contents of the status bar change. On -the left side there is a text indicating that LAMMPS is running, which -also indicates the number of active threads, when thread-parallel -acceleration was selected in the *Preferences* dialog. On the right +While LAMMPS is running, the contents of the status bar change. The +text fields that normally show "Ready." and the current working +directory change into an area showing the CPU utilization in percent. +Next to it is a text indicating that LAMMPS is running, which also +indicates the number of active threads (in case thread-parallel +acceleration was selected in the *Preferences* dialog). On the right side, a progress bar is shown that displays the estimated progress for the current :doc:`run ` or :doc:`minimize ` command. -Also, the line number of the currently executed command is highlighted -in green. +..
admonition:: CPU Utilization + :class: note + + The CPU Utilization should ideally be close to 100% times the number + of threads like in the screenshot image above. Since the GUI is + running as a separate thread, the CPU utilization can be higher, for + example when the GUI needs to work hard to keep up with the + simulation. This can be caused by having frequent thermo output or + running a simulation of a small system. In the *Preferences* dialog, + the polling interval for updating the *Output* and *Charts* + windows can be set. The intervals may need to be lowered to not miss + data between *Charts* data updates or to avoid stalling when the + thermo output is not transferred to the *Output* window fast enough. + It is also possible to reduce the amount of data by increasing the + :doc:`thermo interval `. LAMMPS-GUI detects if the + associated I/O buffer is filled by a significant percentage and will print a + warning after the run with suggested adjustments. The utilization + can also be lower, e.g. when the simulation is slowed down by the + GUI or other processes also running on the host computer and + competing with LAMMPS-GUI for GPU resources. + + .. image:: JPG/lammps-gui-buffer-warn.png + :align: center + :scale: 75% If an error occurs (in the example below the command :doc:`label