[Gaussian/gview] Gaussian16 Benchmark

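An efficiency comparison of two G16 builds. Nothing earth-shattering; I'm just sharing it so everyone has a feel for the differences between versions and for the parallel efficiency.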


Original: http://computational-chemistry.c ... ussian16-benchmark/

Gaussian 16 was released in early 2017. A binary built for the AVX2 extended instruction set is newly available. In addition, through the cooperation of Gaussian, Nvidia and PGI, GPGPU acceleration is now available for DFT and HF calculations.
To get a grasp of the basic performance of Gaussian 16, we ran a benchmark on a workstation with Broadwell-generation Xeon E5 v4 CPUs, which support AVX2. The benchmark covers a DFT geometry optimization and frequency calculation under realistic conditions, and a comparison with Gaussian 09 using the test397 input.
In the benchmark, the B3LYP/6-31G(d,p) calculations on 44 cores finished in 2.7 hours for Opt and 3.5 hours for Freq. On test397, Gaussian 16 was slower than Gaussian 09 because its default calculation accuracy is higher, but we confirmed that the AVX2 build contributes greatly to the speed improvement.

Environment
CPU: Intel Xeon E5-2699 v4 x 2 (44 cores in total)
Memory: DDR4 128 GB, 2400 MHz
HDD: 1 TB, SATA 6 Gbps, 10000 rpm
OS: Fedora 25
The Gaussian 16 used for the benchmark is the standard Gaussian binary package optimized for AVX2. The Gaussian 09 it is compared against is the standard Gaussian binary package optimized for AVX.
Result
The results are as follows.
In the optimization, 34 iterations were executed.

In the Opt calculation, we confirmed that the speed kept improving, without degradation, all the way up to 44 cores (threads), which is the maximum parallelism of the CPUs used here.
Although its parallel efficiency (strong scaling) is lower than that of the Opt calculation, the Freq calculation also kept getting faster up to 44 parallel threads.

For Opt and Freq calculations with DFT, we recommend using up to 32 cores.
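
To make "parallel efficiency (strong scaling)" concrete, here is a minimal Python sketch of how speedup and efficiency are usually derived from wall-clock times at different core counts. The function name `strong_scaling` and the timings in `example` are hypothetical placeholders for illustration, not the measurements from this benchmark.

```python
def strong_scaling(wall_times):
    """Return {cores: (speedup, efficiency)} relative to the smallest core count.

    wall_times maps a core count to the wall-clock time of the same job.
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    table = {}
    for cores in sorted(wall_times):
        # Speedup: how many times faster than the baseline run.
        speedup = base_time / wall_times[cores]
        # Efficiency: speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (cores / base_cores)
        table[cores] = (speedup, efficiency)
    return table


# Placeholder timings in hours (made up for illustration, NOT benchmark data).
example = {1: 40.0, 2: 20.0, 4: 12.5, 8: 8.0}

for cores, (speedup, efficiency) in strong_scaling(example).items():
    print(f"{cores:>3} cores: speedup {speedup:5.2f}x, efficiency {efficiency:6.1%}")
```

When efficiency drops well below 100% at high core counts, extra cores still shorten the run but give diminishing returns. That is the usual reasoning behind recommending a core count below the hardware maximum.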
Next, we compared g09_AVX, g16_AVX and g16_AVX2.
g09_AVX vs g16_AVX vs g16_AVX2
Compared with the AVX build of Gaussian 09, the AVX build of Gaussian 16 is slower. This is because, to guarantee the accuracy of several new calculation types in Gaussian 16 (e.g., TD-DFT frequencies and anharmonic ROA), the default integral accuracy was tightened from 10^-10 to 10^-12, and the default DFT grid was changed from FineGrid to UltraFine. On the other hand, the newly supported AVX2 build of Gaussian 16 is 1.24 to 1.35 times faster than the AVX build of Gaussian 16. By comparison, going from the SSE4 build to the AVX build of Gaussian 09 (our benchmark article) gave a speedup of only about 1.12 to 1.14 times, so the AVX2 build of Gaussian 16 is very effective.
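
Because the two program versions differ in default integral accuracy and DFT grid, a like-for-like timing needs both settings written out explicitly in the route section. The Python sketch below builds such route lines. The helper `route_line` and the two settings dictionaries are hypothetical names, and the `Acc2E` values reflect my understanding of the defaults (10 for Gaussian 09, 12 for Gaussian 16), so check them against the Gaussian manual for your revision.

```python
# Assumed defaults; verify against the Gaussian documentation before relying on them.
G09_STYLE = {"grid": "FineGrid", "acc2e": 10}   # Gaussian 09-like settings
G16_STYLE = {"grid": "UltraFine", "acc2e": 12}  # Gaussian 16-like settings


def route_line(method, basis, job, settings):
    """Build a route section that states the grid and integral accuracy explicitly."""
    return (
        f"#P {method}/{basis} {job} "
        f"Int=({settings['grid']},Acc2E={settings['acc2e']})"
    )


# Print one route line per settings profile for the same method, basis and job.
for label, settings in (("G09-style", G09_STYLE), ("G16-style", G16_STYLE)):
    print(label, "->", route_line("B3LYP", "6-31G(d,p)", "Opt", settings))
```

Running both program versions with the same route line takes the accuracy difference out of the timing comparison, so what remains is closer to the effect of the code and instruction set alone.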


跳跳猪

2#
Posted on 2018-4-16 22:12:30
I feel that a comparison where the grid and convergence thresholds are not set the same can easily mislead people, even though the article does point this out...