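This post was last edited by 乐平 on 2025-7-23 17:08.
I ran into something very odd over the past couple of days. I installed Rocky Linux 9.6 on my small workstation (it previously ran Ubuntu).
After installing Intel oneAPI 2025, I compiled VASP 6.4.2, and the build went smoothly. In testing, however, regular users cannot run VASP: it fails with an MPI-related error, and only root can run it normally.
The error for a regular user is as follows: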
- (base) [huan@localhost graphite]$ module list
- No Modulefiles Currently Loaded.
- (base) [huan@localhost graphite]$ module load intel/oneapi/mkl/2025.2 intel/oneapi/mpi/2021.16
- Loading intel/oneapi/mkl/2025.2
- Loading requirement: intel/oneapi/tbb/latest intel/oneapi/compiler-rt/latest
- (base) [huan@localhost graphite]$ module list
- Currently Loaded Modulefiles:
- 1) intel/oneapi/tbb/latest 2) intel/oneapi/compiler-rt/latest 3) intel/oneapi/mkl/2025.2 4) intel/oneapi/mpi/2021.16
- Key:
- auto-loaded
- (base) [huan@localhost graphite]$ mpirun -np 4 /home/huan/apps/vasp642/bin/vasp_std
- Abort(673310095) on node 0 (rank 0 in comm 0): Fatal error in internal_Init: Other MPI error, error stack:
- internal_Init(39850).........: MPI_Init(argc=(nil), argv=(nil)) failed
- MPII_Init_thread(118)........:
- MPID_Init(1719)..............:
- MPIDI_OFI_mpi_init_hook(1740):
- create_vni_context(2343).....: OFI EP enable failed (ofi_init.c:2343:create_vni_context:Cannot allocate memory)
- Abort(673310095) on node 0 (rank 0 in comm 0): Fatal error in internal_Init: Other MPI error, error stack:
- internal_Init(39850).........: MPI_Init(argc=(nil), argv=(nil)) failed
- MPII_Init_thread(118)........:
- MPID_Init(1719)..............:
- MPIDI_OFI_mpi_init_hook(1740):
- create_vni_context(2343).....: OFI EP enable failed (ofi_init.c:2343:create_vni_context:Cannot allocate memory)
- Abort(673310095) on node 0 (rank 0 in comm 0): Fatal error in internal_Init: Other MPI error, error stack:
- internal_Init(39850).........: MPI_Init(argc=(nil), argv=(nil)) failed
- MPII_Init_thread(118)........:
- MPID_Init(1719)..............:
- MPIDI_OFI_mpi_init_hook(1740):
- create_vni_context(2343).....: OFI EP enable failed (ofi_init.c:2343:create_vni_context:Cannot allocate memory)
- Abort(673310095) on node 0 (rank 0 in comm 0): Fatal error in internal_Init: Other MPI error, error stack:
- internal_Init(39850).........: MPI_Init(argc=(nil), argv=(nil)) failed
- MPII_Init_thread(118)........:
- MPID_Init(1719)..............:
- MPIDI_OFI_mpi_init_hook(1740):
- create_vni_context(2343).....: OFI EP enable failed (ofi_init.c:2343:create_vni_context:Cannot allocate memory)
- (base) [huan@localhost graphite]$
However, root can invoke MPI without problems and the calculation completes normally, as shown here:
- (base) [root@localhost graphite]# module list
- No Modulefiles Currently Loaded.
- (base) [root@localhost graphite]#
- (base) [root@localhost graphite]# module load intel/oneapi/mkl/2025.2 intel/oneapi/mpi/2021.16
- Loading intel/oneapi/mkl/2025.2
- Loading requirement: intel/oneapi/tbb/latest intel/oneapi/compiler-rt/latest
- (base) [root@localhost graphite]#
- (base) [root@localhost graphite]# module list
- Currently Loaded Modulefiles:
- 1) intel/oneapi/tbb/latest 2) intel/oneapi/compiler-rt/latest 3) intel/oneapi/mkl/2025.2 4) intel/oneapi/mpi/2021.16
- Key:
- auto-loaded
- (base) [root@localhost graphite]#
- (base) [root@localhost graphite]# mpirun -np 4 /home/huan/apps/vasp642/bin/vasp_std
- running 4 mpi-ranks, on 1 nodes
- distrk: each k-point on 4 cores, 1 groups
- distr: one band on 1 cores, 4 groups
- vasp.6.4.2 20Jul23 (build Jul 23 2025 16:27:58) complex
-
- POSCAR found type information on POSCAR I
- POSCAR found : 1 types and 2 ions
- scaLAPACK will be used
- LDA part: xc-table for Pade appr. of Perdew
- found WAVECAR, reading the header
- POSCAR, INCAR and KPOINTS ok, starting setup
- FFT: planning ... GRIDC
- FFT: planning ... GRID_SOFT
- FFT: planning ... GRID
- reading WAVECAR
- the WAVECAR file was read successfully
- initial charge from wavefunction
- entering main loop
- N E dE d eps ncg rms rms(c)
- DAV: 1 -0.274277379318E+01 -0.27428E+01 -0.38481E-13 1152 0.599E-06 0.522E-08
- DAV: 2 -0.274277379347E+01 -0.28814E-09 0.22757E-14 1136 0.127E-06
- 1 F= -.27427738E+01 E0= -.27418118E+01 d E =-.192399E-02
- writing wavefunctions
- (base) [root@localhost graphite]#
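I also tried installing Rocky Linux 9.6 in a virtual machine on my laptop, again compiling vasp.6.4.2 with the Intel oneAPI 2025 Base Kit and HPC Kit compilers. Strangely, a regular user can run it normally in the virtual machine.
I don't know why the workstation reports this error.
Any advice would be appreciated. Thanks!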
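For anyone trying to narrow this down: since the failure depends only on which user launches `mpirun`, one possible first step is to compare per-user resource limits between root and the regular account. The sketch below is only a diagnostic aid, not a confirmed fix. The function name `report_limits` is made up for illustration, and the idea that the locked-memory limit (`ulimit -l`) is involved is an assumption that has not been verified on this workstation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of VASP or oneAPI): print a few per-user
# resource limits in key=value form so the output from two accounts can be diffed.
report_limits() {
  echo "user=$(id -un)"
  # Max locked memory; assumed (not confirmed) to be relevant to the OFI error.
  echo "max_locked_memory_kb=$(ulimit -l)"
  echo "stack_size_kb=$(ulimit -s)"
  echo "open_files=$(ulimit -n)"
}

report_limits
```

Running this once as `huan` and once as `root`, then comparing the two outputs line by line, would show whether the limits actually differ between the accounts.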