计算化学公社

Title: Help: solvation free energy calculation fails with an error after switching machines

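Author: xjtu_zk Time: 2021-11-8 21:38
I am using GROMACS to compute the solvation free energy of decanoic acid in water. The calculation always ran fine on another machine, but after moving to a more powerful workstation, the same input files keep failing with an error.
The error screen is shown in the attached image:
(Attachment: screenshot of the error message)

I hope someone more experienced can help. Thank you!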

Author: sobereva Time: 2021-11-8 21:43
Was the same GROMACS version used on both machines?

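Author: xjtu_zk Time: 2021-11-8 22:49

sobereva posted on 2021-11-8 21:43
Was the same GROMACS version used on both machines?

No. The old machine runs a version installed some time ago, while the new machine has the latest version, which was just installed.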

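Author: sobereva Time: 2021-11-8 22:59

xjtu_zk posted on 2021-11-8 22:49
No. The old machine runs a version installed some time ago, while the new machine has the latest version, which was just installed.

Use the same version as before.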

Author: xjtu_zk Time: 2021-11-9 12:42

sobereva posted on 2021-11-8 22:59
Use the same version as before.

Sobereva, I installed the original version, and now it reports a new error...

GROMACS:    gmx mdrun, version 2020.1
Executable: /usr/local/gromacs/bin/gmx
Data prefix:  /usr/local/gromacs
Working dir:  /home/rlxt/gmxtrain/DINW
Command line:
  gmx mdrun -deffnm eql-0

Reading file eql-0.tpr, VERSION 2020.1 (single precision)

NOTE: Parallelization is limited by the small number of atoms,
   only starting 4 thread-MPI ranks.
   You can use the -nt and/or -ntmpi option to optimize the number of threads.

Changing nstlist from 10 to 40, rlist from 1 to 1.13

On host rlxt-Precision-7920-Tower 1 GPU selected for this run.
Mapping of GPU IDs to the 4 GPU tasks in the 4 ranks on this node:
  PP:0,PP:0,PP:0,PP:0
PP tasks will do (non-perturbed) short-ranged and most bonded interactions on the GPU
PP task will update and constrain coordinates on the CPU
Using 4 MPI threads
Using 8 OpenMP threads per tMPI thread

NOTE: DLB will not turn on during the first phase of PME tuning
starting mdrun 'DKA and water interdiffusion'
50000 steps, 100.0 ps.

Not all bonded interactions have been properly assigned to the domain decomposition cells
A list of missing interactions:
      LJC Pairs NB of 327 missing    1
Molecule type 'DKA'
the first 10 missing interactions, except for exclusions:
      LJC Pairs NB atoms 2 29          global    2 29

-------------------------------------------------------
Program:    gmx mdrun, version 2020.1
Source file: src/gromacs/domdec/domdec_topology.cpp (line 421)
MPI rank: 0 (out of 4)

Fatal error:
1 of the 552 bonded interactions could not be calculated because some atoms
involved moved further apart than the multi-body cut-off distance (0.924221
nm) or the two-body cut-off distance (1.13 nm), see option -rdd, for pairs and
tabulated bonds also see option -ddcheck

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Author: sobereva Time: 2021-11-10 02:00

xjtu_zk posted on 2021-11-9 12:42
Sobereva, I installed the original version, and now it reports a new error...

GROMACS: gmx mdrun, version 2020.1

Try adding -ntmpi 1 to the mdrun command.

Author: xjtu_zk Time: 2021-11-10 10:16

sobereva posted on 2021-11-10 02:00
Try adding -ntmpi 1 to the mdrun command.

Sobereva, after adding -ntmpi 1, I got this error instead:

GROMACS:    gmx mdrun, version 2020.1
Executable: /usr/local/gromacs/bin/gmx
Data prefix:  /usr/local/gromacs
Working dir:  /home/rlxt/gmxtrain/DINW
Command line:
  gmx mdrun -deffnm min-0 -ntmpi 1

Reading file min-0.tpr, VERSION 2020.1 (single precision)
1 GPU selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
  PP:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
Using 1 MPI thread

Non-default thread affinity set, disabling internal thread affinity

Using 72 OpenMP threads

-------------------------------------------------------
Program:    gmx mdrun, version 2020.1
Source file: src/gromacs/listed_forces/manage_threading.cpp (line 338)

Fatal error:
You are using 72 OpenMP threads, which is larger than GMX_OPENMP_MAX_THREADS
(64). Decrease the number of OpenMP threads or rebuild GROMACS with a larger
value for GMX_OPENMP_MAX_THREADS passed to CMake.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

My machine's configuration:
CPU: 2xIntel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz
GPU: NVIDIA Quadro P620
MEM: 64GB
OS: Ubuntu 20.04

Author: sobereva Time: 2021-11-11 04:02

xjtu_zk posted on 2021-11-10 10:16
Sobereva, after adding -ntmpi 1, I got this error instead:

GROMACS: gmx mdrun, version 2020.1

Also add -ntomp 64.
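
The fatal error above says this build accepts at most 64 OpenMP threads (GMX_OPENMP_MAX_THREADS), while mdrun tried to start 72. A minimal sketch of capping the thread count before building the command line follows. The helper name `clamp_ntomp` is hypothetical and not part of GROMACS, the default limit of 64 is taken from the error message, and the command is only printed, not executed.

```shell
# Hypothetical helper (not part of GROMACS): cap the requested OpenMP
# thread count at the build-time limit. The default of 64 is the
# GMX_OPENMP_MAX_THREADS value reported in the error message above.
clamp_ntomp() {
    requested=$1
    limit=${2:-64}
    if [ "$requested" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$requested"
    fi
}

# Dry run: print the mdrun command instead of executing it.
# 72 is the thread count mdrun chose on this workstation.
ntomp=$(clamp_ntomp 72)
echo "gmx mdrun -deffnm min-0 -ntmpi 1 -ntomp $ntomp"
```

The other route named in the error message, rebuilding GROMACS with a larger GMX_OPENMP_MAX_THREADS passed to CMake, is not sketched here.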

Author: xjtu_zk Time: 2021-11-11 13:37

sobereva posted on 2021-11-11 04:02
Also add -ntomp 64.

Thank you, Sobereva! It works now.

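Author: xjtu_zk Time: 2021-11-12 09:22

sobereva posted on 2021-11-11 04:02
Also add -ntomp 64.

Sobereva, with these settings the calculation runs, but I noticed a new problem: with your settings the new machine runs with 64 threads, yet it is no faster than the old machine running with 24 threads. Why is that?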

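Author: sobereva Time: 2021-11-13 03:30

xjtu_zk posted on 2021-11-12 09:22
Sobereva, with these settings the calculation runs, but I noticed a new problem: with your settings the new machine runs with 64 threads, yet compared with the old ...

When parallelization relies on OpenMP alone, efficiency at high thread counts can be markedly lower than with thread-MPI alone or with a mix of thread-MPI and OpenMP.
That applies to CPU-only runs. With GPU acceleration, once there are many CPU cores the bottleneck lies entirely on the GPU, so additional cores do not help.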

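Author: xjtu_zk Time: 2021-11-17 18:32

sobereva posted on 2021-11-13 03:30
When parallelization relies on OpenMP alone, efficiency at high thread counts can be markedly lower than with thread-MPI alone, or with thread-MPI and OpenMP ...

Sobereva, the command I am currently using is: gmx mdrun -deffnm xxx -ntmpi 1 -ntomp 64. I find it very inefficient, which is presumably the issue you described.
I tried changing the -ntomp value and found that, for my case, -ntomp 6 gives the best performance. But then the workstation's total CPU usage is only 600%, whereas full load should be 7200%, so a lot of resources seem to be sitting idle. Is there a way to make full use of the workstation? I hope you can advise. Thank you!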

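Author: 喵星大佬 Time: 2021-11-17 20:40
Last edited by 喵星大佬 on 2021-11-17 20:43

You need to change both ntomp and ntmpi. Performance is best when their product equals the number of physical cores.

For example, with 32 physical cores you can test 2 MPI × 16 OMP, 4 MPI × 8 OMP, and 8 MPI × 4 OMP to see which is fastest.

It is generally thought that for CPU-only runs, the larger the system, the more it benefits from a larger number of MPI ranks.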
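
The splits to test can be listed mechanically. The sketch below enumerates every -ntmpi/-ntomp pair whose product equals a given core count and prints the matching command for each. The function name `list_rank_thread_splits` is hypothetical, `xxx` is the placeholder run name used elsewhere in this thread, and nothing is executed. Whether a given split actually starts depends on the system, since small systems may not allow many ranks.

```shell
# Hypothetical helper: print one candidate mdrun command per
# (ntmpi, ntomp) pair whose product equals the given core count.
list_rank_thread_splits() {
    cores=$1
    ntmpi=1
    while [ "$ntmpi" -le "$cores" ]; do
        # Keep only rank counts that divide the core count evenly.
        if [ $((cores % ntmpi)) -eq 0 ]; then
            echo "gmx mdrun -deffnm xxx -ntmpi $ntmpi -ntomp $((cores / ntmpi))"
        fi
        ntmpi=$((ntmpi + 1))
    done
}

# The 32-core example from the post above.
list_rank_thread_splits 32
```

For 32 cores this prints six candidates, including the 2 × 16, 4 × 8 and 8 × 4 splits mentioned above. Each candidate would then be timed on a short run of the real system.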

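Author: sobereva Time: 2021-11-18 05:20

xjtu_zk posted on 2021-11-17 18:32
Sobereva, the command I am currently using is: gmx mdrun -deffnm xxx -ntmpi 1 -ntomp 64. I find it very inefficient, which is presumably the issue you ...

Read up on how the parallelization works.

(Attachment: image)

By default gmx uses all cores: it makes the product of the number of thread-MPI ranks and the number of OpenMP threads per rank equal to the total core count, and determines the two numbers automatically. If you combine -ntmpi 1 with -ntomp 6, you are using only 1*6=6 cores, so naturally the CPU usage is only 600%. For long simulations of large systems, you can try different combinations yourself (keeping the product equal to the number of physical cores) to find the fastest one.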
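
The arithmetic behind the 600% figure can be written out as a small sketch. It assumes one fully busy thread shows up as 100% in the CPU usage display. The helper name `cpu_load_percent` is hypothetical, and the 4 × 18 split is only an illustration of a product of 72, the thread count mdrun reported on this workstation, not a recommended setting.

```shell
# Hypothetical helper: upper bound on the CPU load (in percent) that a
# given rank/thread split can generate, counting one busy thread as 100%.
cpu_load_percent() {
    ntmpi=$1
    ntomp=$2
    echo $(( ntmpi * ntomp * 100 ))
}

cpu_load_percent 1 6     # the split from this thread: 1 rank x 6 threads
cpu_load_percent 4 18    # illustrative split whose product is 72
```

The first call gives 600 and the second 7200, matching the two usage figures quoted in the thread.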

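Author: xjtu_zk Time: 2021-11-18 09:48

sobereva posted on 2021-11-18 05:20
Read up on how the parallelization works.

Sobereva, yesterday I tried different -ntomp values and found 6 to be the fastest. I also tried changing -ntmpi, but the run only works with 1; every other value gives the error below.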
Data prefix:  /usr/local/gromacs
Working dir:  /home/rlxt/gmxtrain/DINW3
Command line:
  gmx mdrun -deffnm min-0 -ntmpi 32 -ntomp 2 -pin on

Back Off! I just backed up min-0.log to ./#min-0.log.4#
Reading file min-0.tpr, VERSION 2020.1 (single precision)

-------------------------------------------------------
Program:    gmx mdrun, version 2020.1
Source file: src/gromacs/domdec/domdec.cpp (line 2277)
MPI rank: 0 (out of 32)

Fatal error:
There is no domain decomposition for 32 ranks that is compatible with the
given box and a minimum cell size of 1.17491 nm
Change the number of ranks or mdrun option -rdd or -dds
Look in the log file for details on the domain decomposition

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

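Author: xjtu_zk Time: 2021-11-18 09:49

喵星大佬 posted on 2021-11-17 20:40
You need to change both ntomp and ntmpi. Performance is best when their product equals the number of physical cores.

For example, with 32 physical cores you can test 2 MPI × 16 OMP, 4 MPI × 8 O ...

I tried changing the ntmpi value, but the run only works with 1; every other value gives the error below: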
Data prefix:  /usr/local/gromacs
Working dir:  /home/rlxt/gmxtrain/DINW3
Command line:
  gmx mdrun -deffnm min-0 -ntmpi 32 -ntomp 2 -pin on

Back Off! I just backed up min-0.log to ./#min-0.log.4#
Reading file min-0.tpr, VERSION 2020.1 (single precision)

-------------------------------------------------------
Program:    gmx mdrun, version 2020.1
Source file: src/gromacs/domdec/domdec.cpp (line 2277)
MPI rank: 0 (out of 32)

Fatal error:
There is no domain decomposition for 32 ranks that is compatible with the
given box and a minimum cell size of 1.17491 nm
Change the number of ranks or mdrun option -rdd or -dds
Look in the log file for details on the domain decomposition

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

Could you take a look and tell me what is going on?

Author: 喵星大佬 Time: 2021-11-18 10:02
Your system is too small; this is simply as far as it can be parallelized.

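Author: sobereva Time: 2021-11-19 05:16

xjtu_zk posted on 2021-11-18 09:49
I tried changing the ntmpi value, but the run only works with 1; every other value gives the error below:
Data prefix: /usr ...

If the system is too small, domain decomposition simply is not possible.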
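
A rough way to see why: each domain must be at least the minimum cell size along every decomposed dimension, so the box length divided by that size bounds the number of cells per dimension. The sketch below applies this to a cubic box. The 3.0 nm box length is a made-up value, since the thread never states the real box size; 1.17491 nm is the minimum cell size from the error message. The helper name `max_dd_cells` is hypothetical, and the estimate ignores separate PME ranks, non-cubic boxes and any safety margin mdrun applies.

```shell
# Hypothetical helper: rough upper bound on domain decomposition cells
# for a cubic box. Prints "cells_per_dimension total_cells".
#   $1 = box edge length in nm (assumed value, not from the thread)
#   $2 = minimum cell size in nm (1.17491 in the error message above)
max_dd_cells() {
    awk -v box="$1" -v cell="$2" 'BEGIN {
        per_dim = int(box / cell)        # whole cells that fit along one edge
        if (per_dim < 1) per_dim = 1     # a single cell is always possible
        print per_dim, per_dim * per_dim * per_dim
    }'
}

max_dd_cells 3.0 1.17491
```

With these assumed numbers at most 2 cells fit per dimension, so 8 in total, far fewer than the 32 ranks requested. A 4 × 4 × 2 grid for 32 ranks would need at least 4 × 1.17491 ≈ 4.7 nm along two of the box edges.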

Author: xjtu_zk Time: 2021-11-19 09:28

sobereva posted on 2021-11-19 05:16
If the system is too small, domain decomposition simply is not possible.

Understood. Thank you, Sobereva!

Welcome to 计算化学公社 (http://bbs.keinsci.com/)