Journal: International Journal for Numerical Methods in Fluids

Highly scalable computational algorithms on emerging parallel machine multicore architectures: development and implementation in CFD context

Abstract

In this paper, the first in a series, the authors develop and implement new computational algorithms to improve the scalability of CFD simulations on emerging architectures such as multicore high performance computing (HPC) platforms. The developments fall into three categories: (i) improved partitioning for multicore platforms, (ii) improved and optimized communication for HPC and (iii) enhanced scalability using computer-science-based methods. In the first category, the multilevel partitioning strategy was modified to reduce the number of out-of-core communications, which resulted in noticeable speedup even for small cases. In the second category, the authors devised a next-generation communication procedure optimized for the architecture and the partitioning, which also resulted in noticeable speedups. In the third category, memory-management improvements were implemented, which again resulted in a speedup of nearly 10%. Together, the three algorithmic implementations yielded ideal and at times superlinear scalability up to 3000 processors. In general, the scalability results are very promising and indicate that the approach has great potential for more complicated multidisciplinary problems such as fluid-structure interaction and aeroelastic simulations.
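
The terms "ideal" and "superlinear" scalability in the abstract can be made concrete with the standard definitions of speedup and parallel efficiency. The following is a minimal sketch under stated assumptions: the function names, the tolerance, and the timings in the usage note are hypothetical and are not taken from the paper.

```python
def speedup(t_ref: float, t_p: float) -> float:
    """Speedup of a run taking t_p seconds relative to a reference run taking t_ref seconds."""
    return t_ref / t_p


def parallel_efficiency(t_ref: float, p_ref: int, t_p: float, p: int) -> float:
    """Efficiency of a run on p processors relative to a reference run on p_ref processors.

    A value of 1.0 means ideal (linear) scaling; a value above 1.0 means superlinear scaling.
    """
    return (t_ref * p_ref) / (t_p * p)


def classify(efficiency: float, tol: float = 0.02) -> str:
    """Label a measured efficiency; tol is an arbitrary (assumed) measurement tolerance."""
    if efficiency > 1.0 + tol:
        return "superlinear"
    if efficiency >= 1.0 - tol:
        return "ideal"
    return "sublinear"
```

For example, with made-up wall-clock times, a job that takes 100 s on 1 processor and 8 s on 10 processors has `parallel_efficiency(100.0, 1, 8.0, 10)` above 1 and would be classified as superlinear. Superlinear behavior is commonly attributed to cache effects: each processor's share of the data shrinks as the processor count grows and may start to fit in faster memory.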
