计算化学公社 (Computational Chemistry Commune)


[General Discussion] ORCA cannot run in parallel on a Windows/Ubuntu dual-boot machine; error "Hwloc 2.0.2rc1-git has detected..."

I copied a Singularity container with ORCA 6 installed to the Ubuntu 22.04 system on a new computer, but running it produced an error:
wlc@wlc:~/wan$ singularity exec ~/software/orca6 orca orca.inp > orca.out &
[1] 11360
* hwloc 2.0.2rc1-git has detected buggy sysfs package information: Two packages have
* the same physical package id 0 but different core_siblings 0x000000ff and 0x00000100
* hwloc is merging these packages into a single one assuming your Linux kernel
* does not support this processor correctly.
* You may hide this warning by setting HWLOC_HIDE_ERRORS=1 in the environment.
*
* If hwloc does not report the right number of packages,
* please report this error message to the hwloc user's mailing list,
* along with the files generated by the hwloc-gather-topology script.
****************************************************************************
****************************************************************************
* hwloc 2.0.2rc1-git has encountered what looks like an error from the operating system.
*
* L1d (cpuset 0x00001003) intersects with L2 (cpuset 0x0000f000) without inclusion!
* Error occurred in topology.c line 1384
*
* The following FAQ entry in the hwloc documentation may help:
*   What should I do when hwloc reports "operating system" warnings?
* Otherwise please report this error message to the hwloc user's mailing list,
* along with the files generated by the hwloc-gather-topology script.
****************************************************************************
[wlc:11400] *** Process received signal ***
[wlc:11400] Signal: Segmentation fault (11)
[wlc:11400] Signal code: Address not mapped (1)
[wlc:11400] Failing at address: (nil)
[wlc:11400] [ 0] /lib64/libpthread.so.0(+0xf630)[0x15383a72b630]
[wlc:11400] [ 1] /centos/openmpi416/lib/libopen-pal.so.40(opal_hwloc201_hwloc_bitmap_copy+0x16)[0x15383b2dd7e6]
[wlc:11400] [ 2] /centos/openmpi416/lib/libopen-pal.so.40(+0xbd192)[0x15383b306192]
[wlc:11400] [ 3] /centos/openmpi416/lib/libopen-pal.so.40(+0xbd2a8)[0x15383b3062a8]
[wlc:11400] [ 4] /centos/openmpi416/lib/libopen-pal.so.40(+0xbd2a8)[0x15383b3062a8]
[wlc:11400] [ 5] /centos/openmpi416/lib/libopen-pal.so.40(+0xbd2a8)[0x15383b3062a8]
[wlc:11400] [ 6] /centos/openmpi416/lib/libopen-pal.so.40(opal_hwloc201_hwloc_topology_load+0x1f3)[0x15383b30d743]
[wlc:11400] [ 7] /centos/openmpi416/lib/libopen-pal.so.40(opal_hwloc_base_get_topology+0xbc9)[0x15383b2da8e9]
[wlc:11400] [ 8] /centos/openmpi416/lib/openmpi/mca_ess_hnp.so(+0x535c)[0x15383912b35c]
[wlc:11400] [ 9] /centos/openmpi416/lib/libopen-rte.so.40(orte_init+0x295)[0x15383b5ebe75]
[wlc:11400] [10] /centos/openmpi416/lib/libopen-rte.so.40(orte_submit_init+0x56c)[0x15383b59caac]
[wlc:11400] [11] mpirun[0x400e2f]
[wlc:11400] [12] /lib64/libc.so.6(__libc_start_main+0xf5)[0x15383a370555]
[wlc:11400] [13] mpirun[0x400cde]
[wlc:11400] *** End of error message ***
[file orca_tools/qcmsg.cpp, line 394]:
  .... aborting the run
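The cpuset values in the hwloc messages above are hexadecimal bitmasks in which bit n stands for logical CPU n. As a reading aid, here is a minimal Python sketch that decodes the four masks quoted in the log and checks the "intersects without inclusion" condition. The helper name cpus_in_mask is made up for this illustration and is not part of hwloc or ORCA.

```python
def cpus_in_mask(mask: int) -> list[int]:
    """Return the logical CPU indices whose bits are set in a cpuset bitmask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Masks copied verbatim from the hwloc messages in the terminal output.
core_siblings_a = 0x000000FF
core_siblings_b = 0x00000100
l1d_cpuset = 0x00001003
l2_cpuset = 0x0000F000

print("core_siblings 0x000000ff ->", cpus_in_mask(core_siblings_a))
print("core_siblings 0x00000100 ->", cpus_in_mask(core_siblings_b))
print("L1d cpuset 0x00001003   ->", cpus_in_mask(l1d_cpuset))
print("L2 cpuset 0x0000f000    ->", cpus_in_mask(l2_cpuset))

# "Intersects without inclusion": the two sets share at least one CPU,
# yet the L1d set is not fully contained in the L2 set.
overlap = l1d_cpuset & l2_cpuset
print("CPUs shared by L1d and L2:", cpus_in_mask(overlap))
print("L1d fully inside L2:", overlap == l1d_cpuset)
```

Decoded this way, the first message concerns CPUs 0 to 7 versus CPU 8 under the same package id, and the second concerns an L1d set {0, 1, 12} that shares only CPU 12 with an L2 set {12, 13, 14, 15}. Whether these groupings come from the kernel itself or from how this hwloc version parses them cannot be determined from the log alone.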
The error message at the end of the generated out file is:
……  ……
ORCA finished by error termination in Startup
Calling Command: mpirun -np 4 /orca600/orca_startup_mpi orca.int.tmp orca
[file orca_tools/qcmsg.cpp, line 394]:
  .... aborting the run
……  ……
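What I am sure of:
1) The container is fine: it has been used on other computers for a long time without problems, and on this new computer it also runs normally with a single core. The error only appears when 2 or more cores are used.
2) The input file is fine: this is a job that has already been computed successfully.
3) The new computer's operating system is fine. The machine dual-boots Win10 and Ubuntu 22.04. In a virtual machine under the host's Win10, ORCA runs on multiple cores. In the host's Ubuntu 22.04, Gaussian also runs on multiple cores.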
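What I have tried:
1) I Googled it; very few people have reported this problem. One reply suggested that Open MPI might not be installed on the host. Does calling the Open MPI inside the container really require one on the host as well? I installed it anyway, using the same version as in the container (openmpi416), and the same error still appears.
2) Judging from the error message, hwloc has detected some hardware/software compatibility problem (roughly? I do not fully understand it). I updated hwloc to hwloc (2.7.0-2ubuntu1), but the same error appears.
3) I tried replacing Ubuntu 20.04 with Ubuntu 22.04 and with CentOS 8.9; the same error occurs.
4) I tried installing ORCA 6 directly on the host Ubuntu system without the container; the same error occurs.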
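One way to narrow this down is to look at what the kernel itself exposes under sysfs, since the first hwloc message refers to "sysfs package information". The Python sketch below reads the per-CPU files physical_package_id, core_siblings and thread_siblings from /sys/devices/system/cpu. The function name read_cpu_topology and its sysfs_root parameter are hypothetical names chosen for this illustration. Note also that the backtrace shows symbols prefixed opal_hwloc201_ inside libopen-pal.so.40, which suggests (an inference from the log, not something confirmed here) that the hwloc being run is the copy bundled with Open MPI rather than the system hwloc package.

```python
from pathlib import Path


def read_cpu_topology(sysfs_root: str = "/sys/devices/system/cpu") -> dict[int, dict[str, str]]:
    """Collect the raw topology strings the kernel exposes for each logical CPU.

    Returns a mapping {cpu_index: {file_name: contents}}; files that do not
    exist on a given system are simply omitted.
    """
    wanted = ("physical_package_id", "core_siblings", "thread_siblings")
    topology: dict[int, dict[str, str]] = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    for cpu_dir in Path(sysfs_root).glob("cpu[0-9]*"):
        topo_dir = cpu_dir / "topology"
        if not topo_dir.is_dir():
            continue
        entry: dict[str, str] = {}
        for name in wanted:
            topo_file = topo_dir / name
            if topo_file.is_file():
                entry[name] = topo_file.read_text().strip()
        topology[int(cpu_dir.name[3:])] = entry
    return topology


if __name__ == "__main__":
    for cpu, entry in sorted(read_cpu_topology().items()):
        print(f"cpu{cpu}: {entry}")
```

If every CPU with physical_package_id 0 reports the same core_siblings mask here, the kernel's view is self-consistent and the complaint more likely comes from how the older hwloc interprets it. If the masks really differ, the inconsistency originates below hwloc.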
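Could the dual-boot setup be the cause? I have not yet tried uninstalling Win10 (a genuine licensed copy, so I am reluctant) and installing only Ubuntu. Apart from going single-system, is there any other way?
Any advice from the experts would be appreciated.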
