Running a single-point (SP) calculation with VASP on a supercomputer fails with the following error:
srun: ROUTE: split_hostlist: hl=i11r2n02 tree_width 0
srun: error: i11r2n02: task 0: Out Of Memory
srun: launch/slurm: _step_signal: Terminating StepId=9105194.0
slurmstepd: error: Detected 5 oom-kill event(s) in StepId=9105194.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
[mpiexec@i11r2n02] HYDT_bscu_wait_for_completion (../../tools/bootstrap/utils/bscu_wait.c:151): one of the processes terminated badly; aborting
[mpiexec@i11r2n02] HYDT_bsci_wait_for_completion (../../tools/bootstrap/src/bsci_wait.c:36): launcher returned error waiting for completion
[mpiexec@i11r2n02] HYD_pmci_wait_for_completion (../../pm/pmiserv/pmiserv_pmci.c:521): launcher returned error waiting for completion
[mpiexec@i11r2n02] main (../../ui/mpich/mpiexec.c:1147): process manager error waiting for completion
/opt/gridview/slurm/spool/slurmd/job9105194/slurm_script: line 18: syntax error near unexpected token `done'
/opt/gridview/slurm/spool/slurmd/job9105194/slurm_script: line 18: `done'
slurmstepd: error: Detected 5 oom-kill event(s) in StepId=9105194.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
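Background: I first hit this on my research group's server. While calculating the DOS, the SP calculation that follows the structure optimization failed, and the log file says: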
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 101588 RUNNING AT work04
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Is the error on the supercomputer caused by running out of memory (RAM)? How can I fix it?
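To check whether the logs quoted above actually point to memory exhaustion, one option is to scan them for the out-of-memory markers. The following is only a minimal sketch: the function name `find_oom_evidence` and the pattern names are hypothetical, and the regular expressions are written against the exact messages shown in this post, so other Slurm or MPI versions may word them differently. Note that `Killed (signal 9)` alone is consistent with an OOM kill but does not prove one.

```python
import re

# Patterns based on the log messages quoted in this post (assumed wording).
OOM_PATTERNS = {
    # srun reports which node and task ran out of memory
    "slurm_task_oom": re.compile(r"srun: error: (\S+): task (\d+): Out Of Memory"),
    # slurmstepd reports how many processes the cgroup OOM handler killed
    "cgroup_oom_kill": re.compile(r"Detected (\d+) oom-kill event\(s\) in StepId=(\S+) cgroup"),
    # MPI launcher reports the process was killed with SIGKILL
    "killed_signal_9": re.compile(r"Killed \(signal 9\)"),
}


def find_oom_evidence(log_text):
    """Return a dict mapping each marker name to the matches found in log_text."""
    evidence = {}
    for name, pattern in OOM_PATTERNS.items():
        matches = pattern.findall(log_text)
        if matches:
            evidence[name] = matches
    return evidence


sample_log = """\
srun: error: i11r2n02: task 0: Out Of Memory
slurmstepd: error: Detected 5 oom-kill event(s) in StepId=9105194.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
"""

evidence = find_oom_evidence(sample_log)
for name, matches in evidence.items():
    print(name, matches)
```

In the supercomputer log both the `Out Of Memory` line and the `oom-kill` lines are present, so the job step exceeded the memory available to its cgroup. The `syntax error near unexpected token 'done'` message at line 18 of the job script appears to be a separate shell scripting problem and is worth checking independently.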