GROMACS 2019.1 is installed on a cluster (Slurm), and I am running MD simulations on an A100 node.
Node details: 8 A100 GPUs and 96 CPU cores (Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz).
GROMACS build information:
GROMACS version: 2019.1
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX2_256
FFT library: fftw-3.3.3-sse2
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /bin/cc GNU 4.8.5
C compiler flags: -mavx2 -mfma -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /bin/c++ GNU 4.8.5
C++ compiler flags: -mavx2 -mfma -std=c++11 -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /usr/local/cuda-10.2/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Wed_Oct_23_19:24:38_PDT_2019;Cuda compilation tools, release 10.2, V10.2.89
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-mavx2;-mfma;-std=c++11;-O2;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 11.60
CUDA runtime: 10.20
Job submission script. The test case is an NPT equilibration of a protein-ligand system containing 7167 atoms:
#!/bin/bash
#SBATCH --job-name=dutp_ligand
#SBATCH --partition=A100
#SBATCH --nodes=1
#SBATCH --gpus-per-task=1
#SBATCH --output=test.out.%j
#SBATCH --error=test.err.%j
gmx mdrun -deffnm npt -nb gpu   # performance: 27.773 ns/day
gmx mdrun -deffnm npt -ntmpi 1 -ntomp 12 -pme gpu   # performance: 39.526 ns/day
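I also tried running in parallel across multiple GPUs, but that turned out to be even slower. Any advice would be much appreciated.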
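
To compare settings more systematically, here is a minimal sketch of a single-GPU sweep over OpenMP thread counts. It is only an illustration under assumptions: it expects npt.tpr in the working directory, the run names (bench_ntomp*) and the step count are placeholders I made up, and the helper names get_ns_per_day and sweep_ntomp are hypothetical. It reads the ns/day figure from the "Performance:" line that mdrun writes at the end of its log.

```bash
#!/bin/bash
# Sketch only: benchmark several -ntomp values on one GPU and report ns/day.
# Assumptions: npt.tpr exists in the current directory; bench_ntomp* names
# and the 20000-step length are placeholders, not values from the post.

# Print the ns/day value from the "Performance:" line of a GROMACS log file.
get_ns_per_day() {
    awk '/^Performance:/ {print $2}' "$1"
}

# Run one short benchmark per OpenMP thread count passed as an argument.
sweep_ntomp() {
    local n
    for n in "$@"; do
        # -resethway resets the timers halfway so startup cost is excluded;
        # -noconfout skips writing the final configuration.
        gmx mdrun -s npt.tpr -deffnm "bench_ntomp${n}" \
            -ntmpi 1 -ntomp "${n}" -nb gpu -pme gpu -pin on \
            -nsteps 20000 -resethway -noconfout
        echo "ntomp=${n}: $(get_ns_per_day "bench_ntomp${n}.log") ns/day"
    done
}
```

Inside the batch script this could be called as, for example, `sweep_ntomp 4 8 12 16`. Which thread count works best likely depends on how many cores Slurm actually grants the job, so the numbers are worth checking against the allocation.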