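System size: about 60,000 atoms.
I compiled gmx 2019.6 myself on the university supercomputing platform (following sobereva's instructions). GPU node: 2 × Intel Xeon Gold 6248, 376 GB RAM, 4 × NVIDIA GV100GL [Tesla V100 PCIe 32GB].
I requested one GPU card.
Command: gmx mdrun -ntmpi 1 -ntomp # -s md.tpr -nsteps 15000 -nb gpu -bonded gpu -pme gpu
Values tried for #: 1, 2, 4, 8, 12
Performance, respectively: 61, 52, 50, 47, 42 ns/day
However, the same system on my personal computer (i5, 2060), using sobereva's Windows build of gmx 2019.6 with 1 MPI rank and 12 OpenMP threads, reaches 110 ns/day.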
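For anyone who wants to repeat the thread-count scan, here is a minimal sketch of how it could be scripted. It assumes `gmx` is on the PATH and `md.tpr` is in the working directory; the helper names `build_command` and `parse_ns_per_day` are made up for this example, and the only addition to the command from the post is `-g`, which gives each run its own log file. The script only prints the commands; it does not launch them.

```python
import re

# Thread counts from the post. The helper names below are hypothetical.
NTOMP_VALUES = [1, 2, 4, 8, 12]


def build_command(ntomp, tpr="md.tpr", nsteps=15000):
    """Return the mdrun command from the post for one -ntomp value."""
    return [
        "gmx", "mdrun", "-ntmpi", "1", "-ntomp", str(ntomp),
        "-s", tpr, "-nsteps", str(nsteps),
        "-nb", "gpu", "-bonded", "gpu", "-pme", "gpu",
        "-g", f"ntomp_{ntomp}.log",  # separate log file per run
    ]


def parse_ns_per_day(log_text):
    """Extract ns/day from the 'Performance:' line near the end of an mdrun log."""
    match = re.search(r"^Performance:\s+([0-9.]+)", log_text, flags=re.MULTILINE)
    return float(match.group(1)) if match else None


# Print the commands so they can be pasted into a job script.
for n in NTOMP_VALUES:
    print(" ".join(build_command(n)))
```

After the runs finish, `parse_ns_per_day(open("ntomp_4.log").read())` would return the ns/day figure for that run, or `None` if the log has no `Performance:` line (for example, if the run did not complete).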
Build information for the gmx 2019.6 compiled on the cluster:
GROMACS version: 2019.6
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX2_256
FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/cc GNU 4.8.5
C compiler flags: -mavx2 -mfma -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/bin/c++ GNU 4.8.5
C++ compiler flags: -mavx2 -mfma -std=c++11 -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /opt/pkgs/cuda/cuda-toolkit/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Sun_Jul_28_19:07:16_PDT_2019;Cuda compilation tools, release 10.1, V10.1.243
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-mavx2;-mfma;-std=c++11;-O2;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 11.30
CUDA runtime: 10.10
Build information for the Windows gmx 2019.6:
GROMACS version: 2019.6
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX_256
FFT library: fftw3
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe MSVC 19.16.27025.1
C compiler flags: /arch:AVX /DWIN32 /D_WINDOWS /W3 /MD /O2 /Ob2 /DNDEBUG
C++ compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe MSVC 19.16.27025.1
C++ compiler flags: /arch:AVX /DWIN32 /D_WINDOWS /W3 /GR /EHsc /std:c++14 /Zc:__cplusplus /wd4800 /wd4355 /wd4996 /wd4305 /wd4244 /wd4101 /wd4267 /wd4090 /wd4068 /MD /O2 /Ob2 /DNDEBUG
CUDA compiler: D:/CUDA_toolkit/bin/nvcc.exe nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019;Cuda compilation tools, release 10.1, V10.1.105
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;/arch:AVX;/DWIN32;/D_WINDOWS;/W3;/GR;/EHsc;/std:c++14;/Zc:__cplusplus;/wd4800;/wd4355;/wd4996;/wd4305;/wd4244;/wd4101;/wd4267;/wd4090;/wd4068;/MD;/O2;/Ob2;/DNDEBUG;
CUDA driver: 11.10
CUDA runtime: 10.10
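I also compared the parameter sections of the two log files. The only difference is one extra line at the very top of the cluster log: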
Non-default thread affinity set, disabling internal thread affinity
Everything else is identical.
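As far as I understand, that message means a CPU affinity mask was already set from outside mdrun (typically by the job scheduler or an OpenMP environment variable), so mdrun leaves thread pinning alone. A quick way to see how many cores the job actually received is sketched below. It assumes a Linux node and should be run inside the same job allocation; `allowed_cores` is a name invented for this example, and this is only a diagnostic, not a confirmed cause of the slowdown.

```python
import os


def allowed_cores():
    """Return the set of CPU indices this process is allowed to run on."""
    getter = getattr(os, "sched_getaffinity", None)
    if getter is None:
        # Not available on this platform: fall back to all logical CPUs.
        return set(range(os.cpu_count() or 1))
    return getter(0)  # 0 means "the current process"


cores = allowed_cores()
print(f"usable cores: {len(cores)} of {os.cpu_count()}")
print(sorted(cores))
```

If the number of usable cores is smaller than the `-ntomp` value, the OpenMP threads would be sharing cores, which may be worth ruling out before looking at the build itself.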
Any pointers would be much appreciated.