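Hello everyone,
My group recently added a Platinum 8173M + RTX 2080 Ti server. I ran a quick speed comparison, shared here for reference.
For a system of 17,000 atoms:
E5-2686 v4 + GTX 1080 Ti machine: 156 ns/day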
- GROMACS version: 2019.3
- Precision: single
- Memory model: 64 bit
- MPI library: thread_mpi
- OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
- GPU support: CUDA
- SIMD instructions: AVX2_256
- FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
- RDTSCP usage: enabled
- TNG support: enabled
- Hwloc support: disabled
- Tracing support: disabled
- C compiler: /usr/bin/cc GNU 4.8.5
- C compiler flags: -mavx2 -mfma -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
- C++ compiler: /usr/bin/c++ GNU 4.8.5
- C++ compiler flags: -mavx2 -mfma -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
- CUDA compiler: /usr/local/cuda-9.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2017 NVIDIA Corporation;Built on Fri_Nov__3_21:07:56_CDT_2017;Cuda compilation tools, release 9.1, V9.1.85
- CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_70,code=compute_70;-use_fast_math;;; ;-mavx2;-mfma;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
- CUDA driver: 9.10
- CUDA runtime: 9.10
Platinum 8173M + RTX 2080 Ti machine, built with GCC 5.5.0 and the AVX_512 instruction set: 154 ns/day
- GROMACS version: 2019.3
- Precision: single
- Memory model: 64 bit
- MPI library: thread_mpi
- OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
- GPU support: CUDA
- SIMD instructions: AVX_512
- FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128-avx512
- RDTSCP usage: enabled
- TNG support: enabled
- Hwloc support: disabled
- Tracing support: disabled
- C compiler: /usr/local/bin/gcc GNU 5.5.0
- C compiler flags: -mavx512f -mfma -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
- C++ compiler: /usr/local/bin/g++ GNU 5.5.0
- C++ compiler flags: -mavx512f -mfma -std=c++11 -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
- CUDA compiler: /usr/local/cuda-10.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Wed_Apr_24_19:10:27_PDT_2019;Cuda compilation tools, release 10.1, V10.1.168
- CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-mavx512f;-mfma;-std=c++11;-O2;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
- CUDA driver: 10.10
- CUDA runtime: 10.10
Same machine, built with GCC 5.5.0 and the AVX2_256 instruction set: 152 ns/day
- GROMACS version: 2019.3
- Precision: single
- Memory model: 64 bit
- MPI library: thread_mpi
- OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
- GPU support: CUDA
- SIMD instructions: AVX2_256
- FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128-avx512
- RDTSCP usage: enabled
- TNG support: enabled
- Hwloc support: disabled
- Tracing support: disabled
- C compiler: /usr/local/bin/gcc GNU 5.5.0
- C compiler flags: -mavx2 -mfma -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
- C++ compiler: /usr/local/bin/g++ GNU 5.5.0
- C++ compiler flags: -mavx2 -mfma -std=c++11 -O2 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
- CUDA compiler: /usr/local/cuda-10.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Wed_Apr_24_19:10:27_PDT_2019;Cuda compilation tools, release 10.1, V10.1.168
- CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-mavx2;-mfma;-std=c++11;-O2;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
- CUDA driver: 10.10
- CUDA runtime: 10.10
It feels like the new machine is not even as fast as the old one...
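To put the three figures in proportion, here is a minimal sketch that computes each build's throughput relative to the old machine. The ns/day values are the ones quoted above; the dictionary labels and the `relative_change` helper are names made up for this illustration. These are single numbers per configuration, so the sketch says nothing about run-to-run variation.

```python
# Throughput quoted in this post, in ns/day, for the 17,000-atom system.
# The labels are illustrative; only the numbers come from the post.
results = {
    "E5-2686 v4 + GTX 1080 Ti (AVX2_256)": 156.0,
    "Platinum 8173M + RTX 2080 Ti (AVX_512)": 154.0,
    "Platinum 8173M + RTX 2080 Ti (AVX2_256)": 152.0,
}


def relative_change(value, baseline):
    """Return the percent change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Use the old machine as the reference point.
baseline_name = "E5-2686 v4 + GTX 1080 Ti (AVX2_256)"
baseline = results[baseline_name]

changes = {name: relative_change(v, baseline) for name, v in results.items()}

for name, pct in changes.items():
    print(f"{name}: {results[name]:.0f} ns/day ({pct:+.1f}% vs. old machine)")
```

By this arithmetic the new machine is about 1.3% slower with the AVX_512 build and about 2.6% slower with the AVX2_256 build, so all three configurations land within roughly 3% of each other.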