Update on the problem:
slurmstepd: error: Detected 28 oom-kill event(s) in StepId=8591583.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: h12r3n09: task 107: Out Of Memory
srun: launch/slurm: _step_signal: Terminating StepId=8591583.0
It is still the same out-of-memory problem. Could anyone share how you usually deal with jobs that run out of memory?