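Title: LAMMPS runs abnormally slowly
Author: 懒洋洋喜洋洋 Time: 5 hours ago

I recently ran into a very strange problem. I am still working on the reproduction described in my first post. In local testing I use the same code with a 2-thread configuration (OMP_NUM_THREADS=2, MKL_NUM_THREADS=2). I found that the run only starts when these two settings have the same value; otherwise it exits with an error.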
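The thread setup described above can be sketched as a small helper that sets both variables to the same value before launching. This is only an illustration: `thread_env` is a hypothetical helper, the core count is passed in by hand, and the launch command in the final comment (binary name `lmp`, input file `in.lammps`) is an assumption, not something taken from this post.

```python
import os


def thread_env(n_threads, n_cores, base=None):
    """Return a copy of `base` with OMP_NUM_THREADS and MKL_NUM_THREADS
    both set to the same value, as the post reports is required."""
    if not 1 <= n_threads <= n_cores:
        raise ValueError(f"n_threads={n_threads} must be between 1 and n_cores={n_cores}")
    env = dict(base if base is not None else os.environ)
    env["OMP_NUM_THREADS"] = str(n_threads)
    env["MKL_NUM_THREADS"] = str(n_threads)
    return env


# 2 threads on a 4-core node, matching the configuration in this post.
env = thread_env(2, 4, base={})
print(env)

# Hypothetical launch (names are assumptions, adjust to your setup):
# subprocess.run(["lmp", "-in", "in.lammps"], env=thread_env(2, 4), check=True)
```

Keeping the two values in one place avoids the mismatch that, according to the post, makes the run abort.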
With these settings LAMMPS does start. The log follows:
LAMMPS (2 Aug 2023)
using 2 OpenMP thread(s) per MPI task
using multi-threaded neighbor list subroutines
Reading data file ...
orthogonal box = (0 0 0) to (11.898254 11.898254 12.912077)
1 by 1 by 1 MPI processor grid
reading atoms ...
135 atoms
read_data CPU = 0.016 seconds
Allegro is using input precision f and output precision d
Allegro: Loading model from BaTiO3-E0.nequip.pth
Allegro: Freezing TorchScript model...
Type mapping:
Allegro type | Allegro name | LAMMPS type | LAMMPS name
0 | Ba | 1 | Ba
1 | Ti | 3 | Ti
2 | O | 2 | O
compute allegro will evaluate the quantity polarization of length 3 (src/compute_allegro.cpp:73)
compute allegro will evaluate the quantity polarizability of length 9 (src/compute_allegro.cpp:73)
compute allegro/atom will evaluate the quantity born_charge of length 9 with newton 1 (src/compute_allegro.cpp:63)
No /omp style for force computation currently active
Neighbor list info ...
update: every = 1 steps, delay = 0 steps, check = yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 8
ghost atom cutoff = 8
binsize = 4, bins = 3 3 4
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair allegro, perpetual
attributes: full, newton on, ghost, omp
pair build: full/bin/ghost/omp
stencil: full/ghost/bin/3d
bin: standard
Setting up Verlet run ...
Unit style : metal
Current step : 0
Time step : 0.002
Per MPI rank memory allocation (min/avg/max) = 4.431 | 4.431 | 4.431 Mbytes
PotEng Fmax Fnorm S/CPU CPULeft
-100941.72 0.46333018 2.1087594 0 0
-100938.9 1.6549903 8.3095947 0.055789789 89442.891
-100938.95 1.7400758 8.8806422 0.067376291 81588.444
-100938.84 1.342811 8.6854009 0.069866765 77994.873
-100939.04 1.4284514 8.1433884 0.061889485 78414.17
-100938.63 2.2506794 9.3575513 0.067329591 77308.649
-100938.84 1.5925401 8.9359757 0.067317059 76524.404
-100939.08 1.4335111 8.2646267 0.063108283 76619.525
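-100939.13 2.1651512 8.9049131 0.063531009 76586.408

This is the run I am attempting. Judging from this log, I expected it to manage at least 20 steps per second, but when I checked partway through something was clearly off: 50 steps take 10 minutes to finish. I requested the Bohrium node type c4_m8_cpu (4 cores). Does anyone know what might be causing this?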
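As a quick sanity check on the log, the S/CPU column can be turned into a time estimate. The sketch below assumes that S/CPU is the LAMMPS `spcpu` thermo keyword (timesteps per CPU second) and that CPULeft is `cpuremain` (estimated seconds remaining). The rows are copied from the log above, skipping the step-0 row where S/CPU is 0. The helper names are made up for illustration.

```python
# Thermo rows from the log above; columns: PotEng Fmax Fnorm S/CPU CPULeft
THERMO_ROWS = """\
-100938.9 1.6549903 8.3095947 0.055789789 89442.891
-100938.95 1.7400758 8.8806422 0.067376291 81588.444
-100938.84 1.342811 8.6854009 0.069866765 77994.873
-100939.04 1.4284514 8.1433884 0.061889485 78414.17
-100938.63 2.2506794 9.3575513 0.067329591 77308.649
-100938.84 1.5925401 8.9359757 0.067317059 76524.404
-100939.08 1.4335111 8.2646267 0.063108283 76619.525
-100939.13 2.1651512 8.9049131 0.063531009 76586.408
"""


def steps_per_second(rows):
    """Return the S/CPU value (4th column) of every thermo row."""
    return [float(line.split()[3]) for line in rows.strip().splitlines()]


def seconds_for(n_steps, rate):
    """Wall time in seconds for n_steps at `rate` steps per second."""
    return n_steps / rate


rates = steps_per_second(THERMO_ROWS)
mean_rate = sum(rates) / len(rates)

print(f"mean S/CPU       : {mean_rate:.4f} steps/s")
print(f"seconds per step : {1.0 / mean_rate:.1f}")
print(f"50 steps         : {seconds_for(50, mean_rate) / 60:.1f} min")
```

Under that reading, the logged rate is about 0.06 steps per second, which puts 50 steps at roughly 13 minutes. That is the same order as the observed 10 minutes, so the log itself already reports the slow rate rather than 20 steps per second. The CPULeft values of around 77,000 s (about 21 hours) point the same way.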