Data-parallel languages like Fortran 90 express parallelism in the form of operations on data aggregates such as arrays. Misalignment of the operands of an array operation can reduce program performance on a distributed-memory parallel machine by requiring nonlocal data accesses. Determining array alignments that reduce communication is therefore a key issue in compiling such languages.
We present a framework for the automatic determination of array alignments in data-parallel languages such as Fortran 90. Our language model handles array sectioning, reductions, spreads, transpositions, and masked operations. We decompose alignment functions into three constituents: axis, stride, and offset. For each of these subproblems, we show how to solve the alignment problem for a basic block of code, possibly containing common subexpressions. Alignments are generated for all array objects in the code, both named program variables and intermediate results. The alignments obtained by our algorithms are more general than those provided by the "owner-computes" rule. Finally, we present some ideas for dealing with control flow, replication, and dynamic alignments that depend on loop induction variables.
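The axis/stride/offset decomposition can be illustrated with a small sketch. It assumes each array axis maps to one template axis through an affine function `stride * index + offset`, and it treats "two operands land in different template cells" as a stand-in for a nonlocal access. The names `AxisAlignment`, `template_cell`, and `nonlocal_pairs` are hypothetical and are not taken from the paper; replication and control flow are left out.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AxisAlignment:
    """Alignment of one array axis (illustrative, hypothetical names)."""
    template_axis: int  # axis: which template axis this array axis maps to
    stride: int         # stride: template spacing between consecutive elements
    offset: int         # offset: template position of array index 0


def template_cell(index: Sequence[int],
                  alignment: Sequence[AxisAlignment],
                  template_rank: int) -> Tuple[int, ...]:
    """Map an array index to a template cell, one affine map per axis."""
    cell = [0] * template_rank  # assumption: unmapped template axes sit at 0
    for i, a in zip(index, alignment):
        cell[a.template_axis] = a.stride * i + a.offset
    return tuple(cell)


def nonlocal_pairs(n: int,
                   a_align: Sequence[AxisAlignment],
                   b_align: Sequence[AxisAlignment]) -> int:
    """For A(1:n-1) + B(2:n), count i where A(i) and B(i+1) differ in cell."""
    return sum(
        template_cell((i,), a_align, 1) != template_cell((i + 1,), b_align, 1)
        for i in range(1, n)
    )


identity = [AxisAlignment(0, 1, 0)]
shifted = [AxisAlignment(0, 1, -1)]   # offset alignment: B(i+1) meets A(i)

print(nonlocal_pairs(8, identity, identity))  # 7: every operand pair is misaligned
print(nonlocal_pairs(8, identity, shifted))   # 0: the offset removes the mismatch

# Axis alignment: swapping template axes places B(j, i) in the cell of A(i, j).
a2d = [AxisAlignment(0, 1, 0), AxisAlignment(1, 1, 0)]
b_transposed = [AxisAlignment(1, 1, 0), AxisAlignment(0, 1, 0)]
print(template_cell((2, 5), a2d, 2) == template_cell((5, 2), b_transposed, 2))  # True
```

In this toy model, an offset of -1 on `B` plays the role of offset alignment for the sectioned sum, and swapping `template_axis` values plays the role of axis alignment for a transposition. A stride of 2 would similarly place `B(i)` in the cell of `A(2*i)`.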