计算化学公社

Title: Help: NAMD2 umbrella sampling will not run in parallel

Author: hhhnano    Time: 2021-1-28 17:01
My machine has 2 CPUs with 28 cores each, 56 cores in total. Running Amber in parallel with 50 processes works fine:
mpirun -np 50 sander.MPI -O -i。。。。。。

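However, running umbrella sampling with NAMD2 on 16 threads fails. I downloaded NAMD2 directly from the official website, unpacked it, and ran it as is; after setup, single-threaded runs work fine. Does NAMD have to be compiled from source before it can run multi-threaded in parallel?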

Any advice would be much appreciated. Thank you.

mpirun -np 16 +auto-provision namd2 +replicas 16 job0.conf +stdout output/%d/job0.%d.log

Charm++> No provisioning arguments specified. Running with a single PE.
         Use +auto-provision to fully subscribe resources or +p1 to silence this message.
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 1 threads (PEs)
------- Partition 0 Processor 0 Exiting: Called CmiAbort ------
Reason: +partitions other than 1 is not allowed for multicore build

[0] Stack Traceback:
  [0:0] namd2 0x17ce9e7
  [0:1] namd2 0x53d375
  [0:2] namd2 0x52e7c2
  [0:3] libc.so.6 0x2b4a719cf555 __libc_start_main
  [0:4] namd2 0x4145f5
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
------- Partition 0 Processor 0 Exiting: Called CmiAbort ------
Reason: +partitions other than 1 is not allowed for multicore build

Charm++> No provisioning arguments specified. Running with a single PE.
         Use +auto-provision to fully subscribe resources or +p1 to silence this message.
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 1 threads (PEs)
[0] Stack Traceback:
  [0:0] namd2 0x17ce9e7
  [0:1] namd2 0x53d375
  [0:2] namd2 0x52e7c2
  [0:3] libc.so.6 0x2b2104bab555 __libc_start_main
  [0:4] namd2 0x4145f5
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node bogon exited on signal 11 (Segmentation fault).


Author: fhh2626    Time: 2021-1-28 23:16
Don't download the MPI build; download the multicore build. NAMD does not parallelize through MPI.
Author: hhhnano    Time: 2021-1-29 11:46
Oh, thanks for the guidance, fhh2626.
Author: hhhnano    Time: 2021-1-29 11:58
But I am using the multicore build. This is what I found on the official website:
#NAMD-2020
export PATH=$PATH:/home/XXX/software/NAMD_Git-2020-09-21_Linux-x86_64-multicore
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/XXX/software/NAMD_Git-2020-09-21_Linux-x86_64-multicore/lib


-- Multi-Copy Algorithm Support --

Multi-copy algorithms (such as replica exchange) require at least one
process per replica, plus a Charm++ build based on "LRTS" (low-level
run-time system).  Multi-copy-capable builds include netlrts, verbs,
and mpi.  The older net and ibverbs builds do not support multi-copy.
NAMD built on netlrts and verbs is launched with charmrun like the older
net and ibverbs layers, with +replicas <replicas> +stdout <format>
options added to divide the processes into <replicas> partitions that
write to separate log files with %d in <format> replaced by the replica.
For example, to run 8 replicas writing to rep-0.log through rep-7.log:

  charmrun namd2 ++local +p16 +replicas 8 +stdout rep-%d.log

http://www.ks.uiuc.edu/Research/namd/2.12b1/notes.html
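
The launch line in the notes above can be wrapped in a small pre-flight step. What follows is only a minimal sketch and not part of the NAMD distribution: the function name `prepare_replica_run` is hypothetical, and it assumes that the per-replica directories named in the `+stdout` format must exist before launch and that the processes should divide evenly among the replicas. It prints the launch command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: create per-replica output directories and print
# (not execute) a charmrun command line for a multi-copy run.
# Usage: prepare_replica_run <processes> <replicas> <config> <output-root>
prepare_replica_run() {
    procs=$1; replicas=$2; conf=$3; outroot=$4
    base=$(basename "$conf" .conf)

    # The quoted notes require at least one process per replica.
    if [ "$procs" -lt "$replicas" ]; then
        echo "error: need at least one process per replica" >&2
        return 1
    fi
    # Assumption: processes are split evenly across the replicas.
    if [ $((procs % replicas)) -ne 0 ]; then
        echo "error: $procs processes do not divide into $replicas replicas" >&2
        return 1
    fi

    # One directory per replica: <output-root>/0, <output-root>/1, ...
    i=0
    while [ "$i" -lt "$replicas" ]; do
        mkdir -p "$outroot/$i" || return 1
        i=$((i + 1))
    done

    # %d is left literal here; NAMD replaces it with the replica index.
    printf '%s\n' "charmrun namd2 ++local +p$procs +replicas $replicas $conf +stdout $outroot/%d/$base.%d.log"
}
```

For example, `prepare_replica_run 16 8 job0.conf output` creates `output/0` through `output/7` and prints a command of the same shape as the one in the notes. This only helps with a build that accepts `+replicas` in the first place.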


I used the following commands:
charmrun namd2 ++local +p32 +replicas 16 job0.conf +stdout output/%d/job0.%d.log
or
charmrun namd2 ++local +p16 +replicas 16 job0.conf +stdout output/%d/job0.%d.log

Both fail:

Running command: namd2 +p32 +replicas 16 job0.conf +stdout output/%d/job0.%d.log

Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 32 threads (PEs)
------- Partition 0 Processor 0 Exiting: Called CmiAbort ------
Reason: +partitions other than 1 is not allowed for multicore build

[0] Stack Traceback:
  [0:0] namd2 0x17ce9e7
  [0:1] namd2 0x53d375
  [0:2] namd2 0x52e7c2
  [0:3] libc.so.6 0x2abbc8294555 __libc_start_main
  [0:4] namd2 0x4145f5
Segmentation fault (core dumped)
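
The abort above is raised by the multicore binary itself, while the quoted notes list netlrts, verbs, and mpi as the multi-copy-capable builds. A minimal sketch of a check that guesses the build type from the installation directory name follows. It assumes the directory still carries the name of the downloaded archive, as in the PATH setting above, and the function name `namd_supports_replicas` is made up for illustration.

```shell
# Hypothetical check: guess from the install directory name whether a
# NAMD build can take +replicas. It inspects the name only, not the binary.
# Returns 0 = listed as multi-copy-capable, 1 = not capable, 2 = unknown.
namd_supports_replicas() {
    name=$(basename "$1")
    case "$name" in
        *multicore*)        return 1 ;;  # aborts on +replicas, as in the log above
        *ibverbs*)          return 1 ;;  # older layer, no multi-copy per the notes
        *netlrts*|*verbs*)  return 0 ;;  # LRTS-based builds named in the notes
        *mpi*)              return 0 ;;  # also named in the notes
        *)                  return 2 ;;  # cannot tell from the name
    esac
}
```

A possible use is `namd_supports_replicas "$(dirname "$(command -v namd2)")"` before adding `+replicas` to the command line. For the directory in this post the check returns 1, which matches the error message.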


Author: fhh2626    Time: 2021-1-29 17:24
hhhnano posted on 2021-1-29 11:58
But I am using the multicore build. This is what I found on the official website:
#NAMD-2020
export PATH=$PATH:/home/XXX/software/NAMD_Git-20 ...

Aren't you running US (umbrella sampling)? What does that have to do with multi-copy?

This is all you need:
namd2 +p56 xxx.conf > xxx.log &
Author: hhhnano    Time: 2021-1-29 19:58
namd2 +p56 job0.conf > job0.log
namd2 +p16 job0.conf > job0.log

FATAL ERROR: restart with wrong number of replicas
    while executing
"error "restart with wrong number of replicas""
    invoked from within
"if { $num_replicas != $nr } {
    error "restart with wrong number of replicas"
}"
    (file "/home/xzhfood/software/NAMD_Git-2020-09-21_Linux-x86_64-multicore/lib/replica/umbrella.namd" line 35)
    invoked from within
"source /home/xzhfood/software/NAMD_Git-2020-09-21_Linux-x86_64-multicore/lib/replica/umbrella.namd "
    invoked from within
"if { ! [catch numPes] } { source /home/xzhfood/software/NAMD_Git-2020-09-21_Linux-x86_64-multicore/lib/replica/umbrella.namd }"
    (file "job0.conf" line 5)


namd2 +p16 +replicas 16 job0.conf > job0.log

------- Partition 0 Processor 0 Exiting: Called CmiAbort ------
Reason: +partitions other than 1 is not allowed for multicore build

Segmentation fault (core dumped)
I am running an umbrella sampling (US) example from the NAMD manual, and the commands I used are also the ones from that example.
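
The Tcl trace earlier in this post shows umbrella.namd comparing `$num_replicas` with another count and aborting when they differ, which is presumably what happens when the job is started without `+replicas`. A minimal sketch of a pre-launch consistency check follows. It assumes the replica count is set by a line of the form `set num_replicas 16` in job0.conf or in a file it sources (the configuration is not shown in this thread), and both function names are hypothetical.

```shell
# Hypothetical helper: read "set num_replicas N" from a NAMD/Tcl config.
# Assumes a literal value on a single line; prints nothing if absent.
conf_num_replicas() {
    awk '$1 == "set" && $2 == "num_replicas" { print $3; exit }' "$1"
}

# Compare the configured count with the value intended for +replicas.
# Returns 0 on match, 1 on mismatch, 2 if the setting was not found.
check_replicas() {
    conf=$1; requested=$2
    expected=$(conf_num_replicas "$conf")
    if [ -z "$expected" ]; then
        echo "warning: no num_replicas setting found in $conf" >&2
        return 2
    fi
    if [ "$expected" != "$requested" ]; then
        echo "error: $conf expects $expected replicas, +replicas is $requested" >&2
        return 1
    fi
    return 0
}
```

Pass whichever file actually holds the setting, for example `check_replicas job0.conf 16`. A run without `+replicas` corresponds to a requested value of 1.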
Author: hhhnano    Time: 2021-4-4 22:16
Problem solved.
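Author: tanshy    Time: 2021-10-16 16:51
hhhnano posted on 2021-4-4 22:16
Problem solved.

May I ask how you solved it? I am running FEP/λREMD with NAMD and tried every command you listed above, and they all report the same error: Reason: +partitions other than 1 is not allowed for multicore build. I can't tell whether the problem is the command or the NAMD version.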




Welcome to 计算化学公社 (http://bbs.keinsci.com/) Powered by Discuz! X3.3