计算化学公社 (Computational Chemistry Commune)
Title:
Error when submitting an ORCA job through Slurm
Author:
量化小王子
Time:
2024-2-13 11:05
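I'm a beginner working in my own account on a supercomputing platform. It is an ordinary account with no root privileges. I installed ORCA and CP2K there, and the whole installation finished normally without any errors. Running mpiexec -V correctly prints the Open MPI version. The problem is this: when I run an ORCA job on 10 cores directly under my account (without submitting it through Slurm), it terminates normally with no errors. When I submit the same calculation to a compute node, it fails, although a single-core run on the compute node works fine. The ORCA error output is as follows: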
ORCA finished by error termination in GTOInt
Calling Command: mpirun -np 10 /public/home/nwnuliujc/Software/ORCA-5.0/orca_gtoint_mpi 1.int.tmp 1
[file orca_tools/qcmsg.cpp, line 465]:
.... aborting the run
The Slurm error output is as follows:
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------
[file orca_tools/qcmsg.cpp, line 465]:
.... aborting the run
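The ORTE message above says that mpirun could not get its helper daemon started or connected on the compute node. That fits the observation that single-core runs work, since ORCA only calls mpirun for parallel runs. One first check, sketched here, is whether the batch environment resolves the same Open MPI binaries as the login shell. The helper name report_cmd is made up for this illustration; orted is the daemon binary shipped with Open MPI 4.x and earlier. This is a diagnostic aid only and does not by itself identify the cause.

```shell
#!/bin/bash
# Diagnostic sketch: report how a command resolves in the current environment.
# report_cmd is a hypothetical helper, not part of ORCA, Open MPI, or Slurm.
report_cmd() {
    local name="$1" path
    if path=$(command -v "$name" 2>/dev/null); then
        echo "$name: $path"
    else
        echo "$name: NOT FOUND"
        return 1
    fi
}

# Print the host name so login-node and compute-node output can be told apart.
echo "host: $(uname -n)"

# mpirun is what ORCA calls; orted is the daemon that mpirun starts.
for cmd in mpirun orted; do
    report_cmd "$cmd" || true
done
```

Running this once on the login node and once from inside an sbatch script, then comparing the two outputs, shows whether the compute node sees a different Open MPI installation or none at all.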
I also tested CP2K at the time. It does not run at all, and Slurm reports the same error. The Slurm submission script is as follows:
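#!/bin/bash
# Job name, partition, and resources: 1 node, 10 tasks, 100 GB of memory
#SBATCH -J p4
#SBATCH -p high
#SBATCH -N 1
#SBATCH --ntasks=10
#SBATCH --mem=100G
# Write stdout and stderr to files named after the job ID
#SBATCH --output=%j.out
#SBATCH --error=%j.err
# Run from the directory the job was submitted from
cd "${SLURM_SUBMIT_DIR}"
echo ${SLURM_JOB_NODELIST}
echo start on $(date)
# Load the GCC 9.3.0 environment
source /public/home/nwnuliujc/Software/gcc-9.3.0/env.sh
# Call ORCA by its full path; ORCA starts mpirun itself for parallel runs
/public/home/nwnuliujc/Software/ORCA-5.0/orca 1.inp > 1.out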
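ORCA launches mpirun on its own (see the Calling Command line in the ORCA error), so the batch job has to be able to find Open MPI's executables and libraries. The script above only sources the GCC environment, and the thread does not say where Open MPI is installed, so the following is a sketch under assumptions: OMPI_HOME and its default path are placeholders, and prepend_dir is a hypothetical helper. Lines like these would go before the ORCA call in the submission script.

```shell
#!/bin/bash
# prepend_dir VAR DIR: put DIR at the front of the colon-separated variable VAR
# unless DIR is already listed there. Hypothetical helper for illustration.
prepend_dir() {
    local var="$1" dir="$2"
    local current="${!var}"
    case ":$current:" in
        *":$dir:"*) ;;   # already present, leave the variable unchanged
        *) export "$var=$dir${current:+:$current}" ;;
    esac
}

# ASSUMPTION: OMPI_HOME is a placeholder. Point it at the Open MPI installation
# that matches the ORCA build; the default path here is invented.
OMPI_HOME="${OMPI_HOME:-$HOME/Software/openmpi}"

prepend_dir PATH "$OMPI_HOME/bin"
prepend_dir LD_LIBRARY_PATH "$OMPI_HOME/lib"
```

Checking for an existing entry keeps the variables from growing each time the snippet is sourced.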
I tried adding the following to my .bashrc:
export OMPI_ALLOW_RUN_AS_ROOT=1
export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1
and also
export OMPI_MCA_btl_openib_allow_ib=1
None of these helped. How can I solve this?
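For what it is worth, OMPI_ALLOW_RUN_AS_ROOT and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM only lift Open MPI's refusal to run mpirun as the root user, so they would not be expected to change anything for an ordinary account. Whether any OMPI_* setting actually reaches the compute node can be checked by printing them from inside the job. The helper list_ompi_env is hypothetical and only illustrates the check.

```shell
#!/bin/bash
# list_ompi_env: print every environment variable whose name starts with OMPI_,
# sorted by name, or a short notice when none is set.
# Hypothetical helper for illustration.
list_ompi_env() {
    local vars
    vars=$(env | grep '^OMPI_' | sort)
    if [ -n "$vars" ]; then
        echo "$vars"
    else
        echo "no OMPI_* variables are set"
    fi
}

# Call this inside the sbatch script, before the ORCA line, and look in %j.out.
list_ompi_env
```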
Author:
Strange
Time:
2024-2-15 16:40
Judging from the error, this looks like a problem with the cluster. Why not ask the administrator directly?
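Author:
量化小王子
Time:
2024-2-15 18:31
Strange posted on 2024-2-15 16:40
Judging from the error, this looks like a problem with the cluster. Why not ask the administrator directly?
OK, thanks. I'll ask the cluster administrator once he is back at work.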
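Author:
量化小王子
Time:
2024-10-26 16:19
JunS posted on 2024-10-24 08:45
Did you get this solved? I've run into the same problem.
Yes, it's solved. The problem was with the supercomputing center's server.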
Welcome to 计算化学公社 (http://bbs.keinsci.com/)