计算化学公社


[Gaussian/gview] Gaussian16 Benchmark

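An efficiency comparison of the two G16 builds. Nothing earth-shattering here; I am just sharing it so that everyone has a rough idea of the differences between versions and of the parallel efficiency.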


Original article: http://computational-chemistry.c ... ussian16-benchmark/

Gaussian 16 was released in early 2017. A binary built for the AVX2 instruction set extension is newly available. In addition, through the cooperation of Gaussian, Nvidia and PGI, GPGPU acceleration is now available for DFT and HF calculations.
To get a feel for the basic performance of Gaussian 16, we ran a benchmark on a workstation with the Haswell microarchitecture. The benchmark consists of two parts: a geometry optimization (Opt) and frequency (Freq) calculation with DFT under realistic conditions, and a comparison with Gaussian 09 using the test397 input.
In the benchmark, at the B3LYP/6-31G(d,p) level on 44 cores, Opt finished in 2.7 hours and Freq in 3.5 hours. For test397, Gaussian 16 was slower than Gaussian 09 because its default calculation accuracy is higher, but we confirmed that the AVX2 build contributes greatly to the speedup.

Environment
CPU: Intel Xeon E5-2699 v4 x 2 (44 cores in total)
Memory: DDR4 128 GB, 2400 MHz
HDD: 1 TB, SATA 6 Gbps, 10000 rpm
OS: Fedora 25
The Gaussian 16 used for the benchmark is the standard Gaussian binary package optimized for AVX2. The Gaussian 09 it is compared against is the standard Gaussian binary package optimized for AVX.
Result
The results are as follows.
The optimization took 34 iterations.

For the Opt calculation, we confirmed that the speed kept improving respectably all the way up to 44 cores (threads), which is the maximum parallelism of the CPUs used here.
Although its parallel efficiency (strong scaling) is lower than that of the Opt calculation, the Freq calculation also kept getting faster up to 44-way parallelism.

For Opt and Freq calculations with DFT, we recommend using up to 32 cores.
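The speedup and parallel efficiency (strong scaling) discussed above can be computed from wall-clock times roughly as follows. This is only a sketch: the function name `strong_scaling` and the timings in `example_times` are made up for illustration and are not the benchmark's data.

```python
def strong_scaling(wall_times):
    """Return {cores: (speedup, efficiency)} relative to the smallest core count.

    wall_times maps a core count to the wall-clock time of the same job.
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    table = {}
    for cores in sorted(wall_times):
        # Speedup: how many times faster than the baseline run.
        speedup = base_time / wall_times[cores]
        # Efficiency: speedup divided by the increase in core count.
        efficiency = speedup * base_cores / cores
        table[cores] = (speedup, efficiency)
    return table


# Placeholder timings in arbitrary units (NOT taken from the benchmark).
example_times = {1: 100.0, 8: 14.0, 16: 7.8, 32: 4.6, 44: 3.9}

for cores, (speedup, efficiency) in strong_scaling(example_times).items():
    print(f"{cores:>3} cores: speedup {speedup:5.2f}x, efficiency {efficiency:6.1%}")
```

If the efficiency drops noticeably between 32 and 44 cores, as the article reports for Freq, the extra cores are probably better spent on a second job.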
Next, we compared g09_AVX, g16_AVX and g16_AVX2.
g09_AVX vs g16_AVX vs g16_AVX2
The AVX build of Gaussian 16 is slower than the AVX build of Gaussian 09. This is because, to guarantee the accuracy of several new calculation types in Gaussian 16 (e.g., TD-DFT frequencies and anharmonic ROA), the default two-electron integral accuracy was tightened from 10^-10 to 10^-12, and the default DFT grid was changed from FineGrid to UltraFine. On the other hand, the newly supported AVX2 build of Gaussian 16 is 1.24 to 1.35 times faster than the AVX build of Gaussian 16. By comparison, going from the SSE4 build to the AVX build of Gaussian 09 gave a speedup of only about 1.12 to 1.14 times (see our earlier benchmark article), so the AVX2 build of Gaussian 16 is very effective.
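For a like-for-like timing between the two programs, one option is to set the integral accuracy and grid explicitly in the route section instead of relying on each version's defaults. The sketch below writes such an input file as plain text. It assumes that `Int=(Grid=FineGrid,Acc2E=10)` reproduces the Gaussian 09 defaults and `Int=(Grid=UltraFine,Acc2E=12)` the Gaussian 16 defaults. The helper names, the memory setting and the water geometry are placeholders, so please check the keywords against the manual of your own revision.

```python
# Assumed mapping from "default style" to explicit Integral options.
INTEGRAL_SETTINGS = {
    "g09": "Int=(Grid=FineGrid,Acc2E=10)",
    "g16": "Int=(Grid=UltraFine,Acc2E=12)",
}

# Placeholder molecule (water); not the benchmark system.
WATER = """O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200"""


def route_line(style, method="B3LYP", basis="6-31G(d,p)", jobs=("Opt",)):
    """Build a route line with the integral settings written out explicitly."""
    return " ".join(["#P", f"{method}/{basis}", *jobs, INTEGRAL_SETTINGS[style]])


def make_input(style, nproc=44, mem="100GB", title="benchmark", geometry=WATER):
    """Return the text of a Gaussian input file (charge 0, singlet)."""
    lines = [
        f"%NProcShared={nproc}",
        f"%Mem={mem}",
        route_line(style),
        "",
        title,
        "",
        "0 1",
        geometry,
        "",  # an input file must end with a blank line
    ]
    return "\n".join(lines) + "\n"


print(make_input("g09"))
```

Running the same explicit settings in both Gaussian 09 and Gaussian 16 separates the effect of the code and instruction set from the effect of the changed defaults.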


跳跳猪

2#
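Posted on 2018-4-16 22:12:30
I feel that a comparison in which the grid and the convergence thresholds are not set consistently can easily mislead people, even though the article does point this out...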