计算化学公社

Title: Help with running a BSUB script for ORCA on a cluster

Author: 尚艳磊    Time: 2018-1-10 16:53
I have installed ORCA and Open MPI, and calculations do run. However, the output file always contains the following:

Failed to register memory region (MR):

Hostname: n0106
Address:  e92a5000
Length:   4194304
Error:    Cannot allocate memory
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI has detected that there are UD-capable Verbs devices on your
system, but none of them were able to be setup properly.  This may
indicate a problem on this system.

You job will continue, but Open MPI will ignore the "ud" oob component
in this run.

Hostname: n0106


This step takes a long time and wastes computation time.
Does anyone know how to solve this?
The script is as follows:
#!/bin/bash
#BSUB -J NO2
#BSUB -n 44
#BSUB -q 1080Ti


INFILE="NO2.inp"
export LD_LIBRARY_PATH=/home/ceph/shangyl/ORCA/OPENMI/openmpi-2.0.2/openmpi/lib/openmpi:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/ceph/shangyl/ORCA/OPENMI/openmpi-2.0.2/openmpi/lib:$LD_LIBRARY_PATH
export PATH=/home/ceph/shangyl/ORCA/OPENMI/openmpi-2.0.2/openmpi/bin:$PATH
export PATH=/home/ceph/shangyl/ORCA/ORCA/orca_4_0_0_linux_x86-64:$PATH
export ORCA_EXEC=/home/ceph/shangyl/ORCA/ORCA/orca_4_0_0_linux_x86-64/orca



#====================   Do NOT revise any lines if you do not know their meanings    =====================================================================
#=========================================================================================================================================================
#BSUB -o %J.out
#BSUB -e %J.err
export OMP_NUM_THREADS=12



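CURDIR=$PWD
# Start from a clean node list for this job
rm -f "$CURDIR/nodelist.$LSB_JOBID"

# LSB_HOSTS holds one hostname per allocated slot; write one per line
for i in $LSB_HOSTS
do
        echo "$i" >> "$CURDIR/nodelist.$LSB_JOBID"
done

# No-op as written: replaces "n" with "n"
sed -i "s@n@n@g" "$CURDIR/nodelist.$LSB_JOBID"

# Total number of slots (not used later in this script)
NPROCS=$(wc -l < "$CURDIR/nodelist.$LSB_JOBID")

# Reduce the list to one line per host
uniq "$CURDIR/nodelist.$LSB_JOBID" > "$CURDIR/nodelist-tmp.$LSB_JOBID"
for i in $(cat "$CURDIR/nodelist-tmp.$LSB_JOBID")
do
        # Slots on this host (computed but not used below)
        CORES=$(grep -c "$i" "$CURDIR/nodelist.$LSB_JOBID")
        echo "$i" >> "$CURDIR/nodelist-tmp2.$LSB_JOBID"
done
mv "$CURDIR/nodelist-tmp2.$LSB_JOBID" "$CURDIR/nodelist.$LSB_JOBID"
rm -f "$CURDIR/nodelist-tmp.$LSB_JOBID"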

#cp nodelist.$LSB_JOBID  "$CURDIR/${INFILE:0:${#INFILE}-4}.nodes"

$ORCA_EXEC $INFILE &>$LSB_JOBNAME.out

rm -rf $CURDIR/nodelist.$LSB_JOBID
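
The node-list part of the script counts slots per host but never uses the count. A minimal sketch of doing that in one pipeline, assuming LSB_HOSTS holds one hostname per allocated slot; the function name slots_per_host and the sample host names are hypothetical, and the "slots=N" form follows the Open MPI hostfile convention (whether this ORCA job needs such a file is not established in this thread):

```shell
# Sample value for illustration only; inside an LSF job the scheduler sets LSB_HOSTS
LSB_HOSTS="${LSB_HOSTS:-n0106 n0106 n0107}"

# Print one "host slots=N" line per distinct host in a space-separated host list
slots_per_host() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2 " slots=" $1}'
}

slots_per_host "$LSB_HOSTS"
```

Sorting before uniq -c means hosts are counted correctly even if LSB_HOSTS does not list them contiguously.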


Author: cuifl    Time: 2021-4-27 22:59
I ran into the same problem. How did the original poster solve it?

Author: abin    Time: 2021-4-27 23:21
cuifl posted on 2021-4-27 22:59
I ran into the same problem. How did the original poster solve it?

https://users.open-mpi.narkive.c ... emory-openmpi-2-0-2

Put “oob=tcp” in your default MCA param file
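
A minimal sketch of that suggestion, assuming the per-user MCA parameter file at $HOME/.openmpi/mca-params.conf (Open MPI also reads a system-wide file under its install prefix). The environment-variable form is an alternative for a single job and should reach the mpirun that ORCA starts, provided it is exported in the job script before ORCA is called. Whether this removes the warning on a given cluster is not confirmed in this thread.

```shell
# Per-user MCA parameter file read by Open MPI at startup (assumed location)
MCA_FILE="$HOME/.openmpi/mca-params.conf"
mkdir -p "$(dirname "$MCA_FILE")"

# Append the setting only if the file has no oob line yet
grep -q '^[[:space:]]*oob[[:space:]]*=' "$MCA_FILE" 2>/dev/null || echo "oob = tcp" >> "$MCA_FILE"

# Alternative for one job: set the same parameter through the environment
export OMPI_MCA_oob=tcp
```

With either form in place, the "ud" out-of-band component is no longer selected, so the failed setup attempt reported in the output above should be skipped.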



