Venue: IEEE International Parallel and Distributed Processing Symposium Workshops

Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach


Abstract

Computationally intensive loops are the primary source of parallelism in scientific applications. Such loops are often irregular, and a balanced execution of their iterations is critical for achieving high performance. However, several factors may lead to load imbalance, such as problem characteristics and algorithmic and systemic variations. Dynamic loop self-scheduling (DLS) techniques are devised to mitigate these factors and, consequently, improve application performance. On distributed-memory systems, DLS techniques can be implemented using a hierarchical master-worker execution model and are, therefore, called hierarchical DLS techniques. These techniques self-schedule loop iterations at two levels of hardware parallelism: across and within compute nodes. Hybrid programming approaches that combine the Message Passing Interface (MPI) with Open Multi-Processing (OpenMP) dominate the implementation of hierarchical DLS techniques. The MPI-3 standard includes a feature for sharing memory regions among MPI processes. This feature gave rise to the MPI+MPI approach, which simplifies the implementation of parallel scientific applications. The present work designs and implements hierarchical DLS techniques by exploiting the MPI+MPI approach. Four well-known DLS techniques are considered in the evaluation presented herein. The results indicate certain performance advantages of the proposed approach compared to the hybrid MPI+OpenMP approach.
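To make the two-level idea concrete, the following is a minimal sketch of hierarchical self-scheduling, simulated sequentially in plain Python without MPI. The abstract does not name the four DLS techniques it evaluates, so the chunk rule here is guided self-scheduling (chunk = ceil(remaining / workers)), chosen only as a familiar illustration. The function names `gss_chunk` and `hierarchical_schedule`, and the round-robin order in which nodes and workers request work, are assumptions of this sketch and not the paper's implementation.

```python
import math


def gss_chunk(remaining, workers):
    """Guided self-scheduling rule (illustrative choice): ceil(remaining / workers)."""
    return max(1, math.ceil(remaining / workers))


def hierarchical_schedule(total_iters, nodes, workers_per_node):
    """Sequentially simulate two-level self-scheduling.

    Returns a list of (node, worker, start, end) tuples, with `end` exclusive.
    Requests are served round-robin here; in a real run, whichever node or
    worker becomes idle first would request the next chunk.
    """
    assignments = []
    next_global = 0  # first iteration not yet handed to any node
    node = 0
    while next_global < total_iters:
        # Level 1 (across nodes): a node obtains a chunk from the global queue.
        node_size = gss_chunk(total_iters - next_global, nodes)
        node_start, node_end = next_global, next_global + node_size
        next_global = node_end

        # Level 2 (within the node): workers self-schedule sub-chunks
        # from the node's local queue until the node chunk is exhausted.
        next_local = node_start
        worker = 0
        while next_local < node_end:
            size = gss_chunk(node_end - next_local, workers_per_node)
            assignments.append((node, worker, next_local, next_local + size))
            next_local += size
            worker = (worker + 1) % workers_per_node

        node = (node + 1) % nodes
    return assignments
```

Calling `hierarchical_schedule(100, 2, 4)` yields a list of sub-chunks that together cover iterations 0 to 99 exactly once. In an MPI+MPI setting, the node-local queue would presumably live in an MPI-3 shared-memory window rather than in a Python variable; that mapping is not shown here.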
