Journal: Journal of Chemical Theory and Computation (JCTC)

Integral-Direct and Parallel Implementation of the CCSD(T) Method: Algorithmic Developments and Large-Scale Applications

Abstract

A completely integral-direct, disk I/O, and network traffic economic coupled-cluster singles, doubles, and perturbative triples [CCSD(T)] implementation has been developed relying on the density-fitting approximation. By fully exploiting the permutational symmetry, the presented algorithm is highly operation count and memory-efficient. Our measurements demonstrate excellent strong scaling achieved via hybrid MPI/OpenMP parallelization and a highly competitive, 60-70% utilization of the theoretical peak performance on up to hundreds of cores. The terms whose evaluation time becomes significant only for small- to medium-sized examples have also been extensively optimized. Consequently, high performance is also expected for systems appearing in extensive data sets used, e.g., for density functional or machine learning parametrizations, and in calculations required for certain reduced-cost or local approximations of CCSD(T), such as in our local natural orbital scheme [LNO-CCSD(T)]. The efficiency of this implementation allowed us to perform some of the largest CCSD(T) calculations ever presented for systems of 31-43 atoms and 1037-1569 orbitals using only four to eight many-core CPUs and 1-3 days of wall time. The resulting 13 correlation energies and the 12 corresponding reaction energies and barrier heights are added to our previous benchmark set collecting reference CCSD(T) results of molecules at the applicability limit of current implementations.
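The density-fitting idea the abstract relies on can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `df_integral_block`, the array layout `B[P, i, a]`, and the random test data are assumptions made purely for illustration. The sketch shows how a block of four-index integrals (ia|jb) can be assembled on demand from three-index factors instead of being stored in full, and checks the (ia|jb) = (jb|ia) permutational symmetry.

```python
import numpy as np

def df_integral_block(B, i_slice, j_slice):
    """Assemble the (ia|jb) block for occupied-index slices i_slice and j_slice.

    B is a hypothetical three-index density-fitting factor with shape
    (n_aux, n_occ, n_virt), so that (ia|jb) ~= sum_P B[P, i, a] * B[P, j, b].
    """
    B_i = B[:, i_slice, :]  # (n_aux, n_i, n_virt)
    B_j = B[:, j_slice, :]  # (n_aux, n_j, n_virt)
    return np.einsum("Pia,Pjb->iajb", B_i, B_j)

# Random stand-in data; a real calculation would obtain B from fitted integrals.
rng = np.random.default_rng(0)
n_aux, n_occ, n_virt = 12, 4, 6
B = rng.standard_normal((n_aux, n_occ, n_virt))

# Reference: the full four-index tensor, which a direct scheme avoids storing.
full = np.einsum("Pia,Pjb->iajb", B, B)

# On-demand block for i in [0, 2) and j in [2, 4).
block = df_integral_block(B, slice(0, 2), slice(2, 4))
```

In this toy setting the block equals the matching slice of the full tensor, while only the three-index array `B` has to be kept in memory. Block sizes would in practice be chosen to fit the available memory per process.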
