Last edited by 大懒猫王浩 on 2021-4-25 15:25
Here are my test results. The CPU is a 6132 and the GPU is a V100.
First, the melt example that ships with LAMMPS: 4000 atoms, lj/cut/gpu.
==> debug_10mpi_1GPU.txt <==
Total wall time: 0:18:06
==> debug_10mpi_2GPU.txt <==
Total wall time: 0:09:10
==> debug_10mpi_noGPU.txt <==
Total wall time: 0:04:17
==> debug_1mpi_1GPU.txt <==
Total wall time: 0:01:59
==> debug_2mpi_1GPU.txt <==
Total wall time: 0:05:08
==> debug_2mpi_2GPU.txt <==
Total wall time: 0:02:06
==> debug_3mpi_1GPU.txt <==
Total wall time: 0:06:36
==> debug_4mpi_2GPU.txt <==
Total wall time: 0:04:48
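A single core with a single GPU actually turned out to be the fastest.
Later, some people in the group chat asked for a test on a larger system, so I switched to eam and built a Cu system with a bit over 100,000 atoms. The results are below.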
==> gpu2l_10mpi_1GPU/slurm-9003096.out <==
Total wall time: 0:04:55
==> gpu2l_10mpi_2GPU/slurm-9003097.out <==
Total wall time: 0:02:41
==> gpu2l_1mpi_1GPU/slurm-9003098.out <==
Total wall time: 0:05:22
==> gpu2l_20mpi_0GPU/slurm-9003074.out <==
Total wall time: 0:09:18
==> gpu2l_20mpi_1GPU/slurm-9003099.out <==
Total wall time: 0:08:08
==> gpu2l_20mpi_2GPU/slurm-9003100.out <==
Total wall time: 0:04:22
==> gpu2l_2mpi_1GPU/slurm-9003101.out <==
Total wall time: 0:03:54
==> gpu2l_2mpi_2GPU/slurm-9003102.out <==
Total wall time: 0:02:35
==> gpu2l_3mpi_1GPU/slurm-9003129.out <==
Total wall time: 0:03:42
==> gpu2l_4mpi_1GPU/slurm-9003128.out <==
Total wall time: 0:03:36
==> gpu2l_4mpi_2GPU/slurm-9003103.out <==
Total wall time: 0:02:19
==> gpu2l_5mpi_1GPU/slurm-9003126.out <==
Total wall time: 0:04:16
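The speedup seems to be only a few-fold, which did not really amaze me. GPU acceleration looks best suited to jobs that run on few CPU cores.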
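To put a number on "a few-fold", here is a rough sketch that parses the eam log excerpts above and computes each run's speedup. It assumes the `NNmpi` part of each directory name is the MPI rank count, and it takes the 20 MPI CPU-only run as the baseline because that is the only CPU-only run in this set. The names `parse_runs` and `speedups` are made up for this example.

```python
import re

# Wall times copied verbatim from the eam (Cu, ~100k atoms) runs above.
HEAD_OUTPUT = """\
==> gpu2l_10mpi_1GPU/slurm-9003096.out <==
Total wall time: 0:04:55
==> gpu2l_10mpi_2GPU/slurm-9003097.out <==
Total wall time: 0:02:41
==> gpu2l_1mpi_1GPU/slurm-9003098.out <==
Total wall time: 0:05:22
==> gpu2l_20mpi_0GPU/slurm-9003074.out <==
Total wall time: 0:09:18
==> gpu2l_20mpi_1GPU/slurm-9003099.out <==
Total wall time: 0:08:08
==> gpu2l_20mpi_2GPU/slurm-9003100.out <==
Total wall time: 0:04:22
==> gpu2l_2mpi_1GPU/slurm-9003101.out <==
Total wall time: 0:03:54
==> gpu2l_2mpi_2GPU/slurm-9003102.out <==
Total wall time: 0:02:35
==> gpu2l_3mpi_1GPU/slurm-9003129.out <==
Total wall time: 0:03:42
==> gpu2l_4mpi_1GPU/slurm-9003128.out <==
Total wall time: 0:03:36
==> gpu2l_4mpi_2GPU/slurm-9003103.out <==
Total wall time: 0:02:19
==> gpu2l_5mpi_1GPU/slurm-9003126.out <==
Total wall time: 0:04:16
"""

# Assumption: directory names follow gpu2l_<ranks>mpi_<gpus>GPU.
PATTERN = re.compile(
    r"==> gpu2l_(\d+)mpi_(\d+)GPU/\S+ <==\s*\n"
    r"\s*Total wall time: (\d+):(\d+):(\d+)"
)


def parse_runs(text):
    """Return {(mpi_ranks, gpus): wall_seconds} from `head`-style output."""
    runs = {}
    for mpi, gpus, h, m, s in PATTERN.findall(text):
        runs[(int(mpi), int(gpus))] = int(h) * 3600 + int(m) * 60 + int(s)
    return runs


def speedups(runs, baseline_key):
    """Return {key: baseline_seconds / run_seconds} for every run."""
    base = runs[baseline_key]
    return {key: base / secs for key, secs in runs.items()}


runs = parse_runs(HEAD_OUTPUT)
table = speedups(runs, baseline_key=(20, 0))  # baseline: 20 MPI, CPU only

# Print the runs from largest to smallest speedup.
for (mpi, gpus), factor in sorted(table.items(), key=lambda kv: -kv[1]):
    secs = runs[(mpi, gpus)]
    print(f"{mpi:>2} MPI, {gpus} GPU: {secs:>4d} s, {factor:.2f}x")
```

On these numbers the best run (4 MPI, 2 GPU, 0:02:19) comes out at roughly 4x the CPU-only run (0:09:18), which matches the "few-fold" impression.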