计算化学公社

[CP2K] Sharing CP2K-2025.2 Apptainer (Singularity) images

Last edited by s8ga on 2026-5-6 02:00

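1. Preface
  The CP2K Singularity images available on this forum are all fairly old, and I have recently had to switch compute nodes frequently, so I built a CP2K container image with all dependencies included, which makes it easy to use on different servers.
  CP2K was built with podman and then packaged into a SIF file with apptainer (the open-source version of singularity).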
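1.1 Environment requirements
  • Single machine:
    No root required; a sufficiently recent kernel is needed (3.8+)
    Cannot run nested inside docker (running inside virtualization such as wsl / kvm / vmware works)
  • Cluster:
    Not tested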
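
A quick pre-flight check for the single-machine requirements above could look like the sketch below. The helper name `kernel_at_least` is made up for this example. The look at `/proc/sys/user/max_user_namespaces` rests on the assumption that the rootless setup depends on unprivileged user namespaces, which the post does not state explicitly.

```shell
#!/bin/sh
# Return 0 if kernel release $1 (e.g. "5.15.0-91-generic") is at least
# major version $2, minor version $3. Hypothetical helper, not part of apptainer.
kernel_at_least() {
    release=$1; want_major=$2; want_minor=$3
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] && return 0
    [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

if kernel_at_least "$(uname -r)" 3 8; then
    echo "kernel $(uname -r): OK (>= 3.8)"
else
    echo "kernel $(uname -r): older than 3.8"
fi

# Assumption: rootless containers need user namespaces to be enabled.
# The file below only exists on newer kernels, so its absence proves nothing.
if [ -r /proc/sys/user/max_user_namespaces ]; then
    echo "max_user_namespaces = $(cat /proc/sys/user/max_user_namespaces)"
fi
```

A value of 0 in the second line of output would suggest that user namespaces are disabled by the administrator.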

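1.2 Image list
  Download link: https://pan.baidu.com/s/1AdlUGY40yEmDj-M32Q37Xw?pwd=s8ga
  • cp2k-opensource-2025.2 (added 260501) (regtest passed)
    Standard 2025.2 build
  • cp2k-opensource-2025.2-force-avx512 (added 260501) (regtest passed)
    Fixes the "Setting real_kernel for ELPA failed" error that appears when a cp2k built on an avx2 processor
    is run on an avx512 processor (forcing the ELPA AVX512 kernels to be compiled is enough)
  • cp2k-rocm-2026.1-gfx942 (added 260501) (experimental)
    Accelerated version for AMD Mi300X (not thoroughly tested)
  • cp2k-mkl-2025.2-experimental (added 260501) (regtest passed) (experimental)
    Built with Openmpi + mkl
    Contains empirical patches; not recommended for production use
    About the same speed as the Opensource build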
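
To decide between the two opensource images, one option is to look at the CPU flags of the target machine. The sketch below is a guess at a sensible rule, not something the images enforce: `pick_image` is a made-up helper, the mapping "avx512f present, so take the force-avx512 image" is an assumption drawn from the description above, and the file name `cp2k-opensource_2025.2.sif` is assumed by analogy with the force-avx512 file name used in the usage guide.

```shell
#!/bin/sh
# Print the image file to try for a given space-separated CPU flag list.
# Assumption: avx512f in the flags means the force-avx512 build is the better fit.
pick_image() {
    case " $1 " in
        *" avx512f "*) echo "cp2k-opensource_2025.2-force-avx512.sif" ;;
        *)             echo "cp2k-opensource_2025.2.sif" ;;   # assumed file name
    esac
}

# On Linux, take the flags of the first CPU listed in /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    flags=$(grep '^flags' /proc/cpuinfo | head -n 1 | cut -d: -f2)
    echo "Suggested image: $(pick_image "$flags")"
fi
```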

1.3 Next release
  • cp2k master branch (when there is a major update)
  • cp2k 2026.2
2. Usage guide
  • First download the image, then place the sif file you chose and apptainer-1.4.5-3.el8-x86_64.run in a location of your choice on the server
    # 1. Change to the directory where you placed the files
    cd "Your preferred location"

    # 2. Make the .run file executable
    chmod +x apptainer-1.4.5-3.el8-x86_64.run

    # 3. Unpack apptainer
    ./apptainer-1.4.5-3.el8-x86_64.run

    # 4. Source apptainer into PATH
    # Follow the on-screen prompt and source the file it names
    # (for reference only) source /tmp/apptainer-test/apptainer-bundle/activate-apptainer.sh

    # 5. Start the image with apptainer run (the example below only prints the cp2k version / build information)
    apptainer run cp2k-opensource_2025.2-force-avx512.sif mpirun cp2k.psmp --version | head -n10
    # Or start it with screen + apptainer shell
    screen apptainer shell cp2k-opensource_2025.2-force-avx512.sif

    # Actual usage
    # apptainer exec cp2k-opensource_2025.2-force-avx512.sif mpirun -np <Input thread count> cp2k.psmp -i [Your Input].inp -o [Your Output].out
    # Or start it with screen + apptainer shell
    apptainer shell cp2k-opensource_2025.2-force-avx512.sif
    # <inside the new shell>
    mpirun -np <Input thread count> cp2k.psmp -i [Your Input].inp -o [Your Output].out

    ######################################

    # 6. If you want to run the cp2k regtest
    apptainer shell cp2k-opensource_2025.2-force-avx512.sif
    # Then run this inside the container
    python /opt/cp2k/tests/do_regtest.py  --mpiranks 2 --ompthreads 2 --maxtasks $(nproc) --keepalive --workbasedir /tmp --mpiexec "mpirun $MPI_RUNVAR -np {N}" $(dirname $(which cp2k.psmp)) psmp
  • After logging in again, just locate the sh file named in step 4 and source it, then use everything as usual
    For a more visual usage demo, see this Demo
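
For repeated jobs, the `apptainer exec` line from the usage guide can be wrapped in a small helper. The sketch below only prints the command so it can be inspected first; `cp2k_cmd` is a made-up name, and deriving the output name by swapping `.inp` for `.out` is just a convention assumed here.

```shell
#!/bin/sh
# Print (do not run) the command line for one CP2K job.
# $1 = SIF image, $2 = number of MPI ranks, $3 = input file ending in .inp
cp2k_cmd() {
    sif=$1; ranks=$2; input=$3
    output="${input%.inp}.out"      # OPT.inp -> OPT.out
    echo "apptainer exec $sif mpirun -np $ranks cp2k.psmp -i $input -o $output"
}

# Example: show the command for a 6-rank run of OPT.inp
cp2k_cmd cp2k-opensource_2025.2-force-avx512.sif 6 OPT.inp
```

Once the printed line looks right, it can be pasted into the shell, provided apptainer has been sourced as in step 4.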
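3. Building the image
(For more details, see the build program link)
(All files used in the build process are open source)

Build machine requirements:
podman and uv must be installed, and a good network connection is needed

Debian / WSL2 or VMWare is recommended for building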
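
Before starting, it may be worth confirming that the tools the build steps call are actually on PATH. This is a minimal sketch: `missing_tools` is a made-up helper, and the tool list (podman, uv, curl, tar) is simply read off the requirements and the commands used in the steps.

```shell
#!/bin/sh
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools podman uv curl tar)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing tools:" $missing
else
    echo "All build tools found"
fi
```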

  # Prerequisites
  # 0. cd into the cloned directory
  # cd HPC-Container-Factory

  # 1. Python dependencies
  uv venv venv
  uv pip install -r requirements.txt --python ./venv/bin/python

  # 2. Podman is required
  podman info

  # 3. Spack is required
  # Download the Spack v1.1.0 release tarball
  mkdir -p assets
  curl -fSL -o assets/spack-v1.1.0.tar.gz \
    https://github.com/spack/spack/releases/download/v1.1.0/spack-1.1.0.tar.gz

  # Unpack it into spack-src/ (needed in the bootstrap stage)
  tar -xzf assets/spack-v1.1.0.tar.gz -C assets/
  mv assets/spack-1.1.0 assets/spack-src

  # 4. Activate the environment
  source ./activate.sh

  # 5. Prepare the offline resources (spack source packages)
  python generate.py assets --env cp2k-opensource-2025.2-force-avx512
  # Expected output
  # [OK]    All packages available in mirror

  # 6. Build
  python generate.py build --app-version cp2k-opensource-2025.2-force-avx512 --network-host
  # The build is slow and takes roughly 45 min or more
  # Expected output
  # Successfully tagged localhost/cp2k-opensource:2025.2-force-avx512

  # 7. Convert to a SIF file
  python generate.py build-sif --app-version  cp2k-opensource-2025.2-force-avx512
  # apptainer has to be installed on first use (this takes some time)
  # The output file is in artifacts

  # 8. Run
  source ./activate.sh
  apptainer shell artifacts/cp2k-opensource_2025.2-force-avx512.sif

  # 9. [Optional] Pack apptainer for distribution to other machines
  python generate.py pack-apptainer
  # The result is in artifacts/apptainer-<version>-x86_64.run


If you need to, feel free to share the build artifacts (MIT license)

Happy Computing!



2#
Posted 3 hours ago
Last edited by 牧生 on 2026-5-6 16:52

I ran a job and it reported an MPI problem that I cannot solve.
Also, for some reason, the architecture of my 9950X3D is automatically detected as Intel Haswell.


First
apptainer exec cp2k-mkl_2025.2-experimental.sif mpirun -np 1 cp2k.psmp -i OPT.inp -o OPT.out
SIRIUS 7.9.0, git hash: https://api.github.com/repos/ele ... git/ref/tags/v7.9.0
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
  Proc: [[5169,1],0]
  Errorcode: 1

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 0 on node jing calling
"abort". This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).
--------------------------------------------------------------------------




Second

apptainer exec cp2k-rocm_2026.1-gfx942.sif mpirun -np 6 cp2k.psmp -i OPT.inp -o OPT.out
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Inconsistency in warp sizes: Cuda/Hip indicates warp size = 64, while the gpu_properties files indicates warp_size = 32.
Inconsistency in warp sizes: Cuda/Hip indicates warp size = 64, while the gpu_properties files indicates warp_size = 32.
Inconsistency in warp sizes: Cuda/Hip indicates warp size = 64, while the gpu_properties files indicates warp_size = 32.
Inconsistency in warp sizes: Cuda/Hip indicates warp size = 64, while the gpu_properties files indicates warp_size = 32.
Inconsistency in warp sizes: Cuda/Hip indicates warp size = 64, while the gpu_properties files indicates warp_size = 32.
Inconsistency in warp sizes: Cuda/Hip indicates warp size = 64, while the gpu_properties files indicates warp_size = 32.
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
  Proc: [[46349,1],0]
  Errorcode: 1

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 0 on node jing calling
"abort". This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).
--------------------------------------------------------------------------



Third

apptainer exec cp2k-opensource_2025.2-force-avx512.sif mpirun -np 6 cp2k.psmp -i OPT.inp -o OPT.out
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
Core not found: ZEN4
Core: Haswell
SIRIUS 7.9.0, git hash: https://api.github.com/repos/ele ... git/ref/tags/v7.9.0
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
  Proc: [[33067,1],0]
  Errorcode: 1

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 0 on node jing calling
"abort". This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).
--------------------------------------------------------------------------







