Source: International Conference on Supercomputing

A General Algorithm for Tiling the Register Level


Abstract

Tiling is a well-known loop transformation that can be used to exploit data reuse at the register level and to improve a program's ILP. Previous work on tiling and also commercial compilers are able to perform tiling for the register level in more than one dimension when the iteration space is rectangular. However, they either cannot handle or can only handle limited cases of non-rectangular iteration spaces. Non-rectangular iteration spaces are commonly found in linear algebra algorithms or can arise as a result of applying previous transformations such as loop skewing. In this paper we present a new general algorithm to perform tiling for the register level in more than one dimension in both rectangular and non-rectangular iteration spaces. Our method uses index set splitting to distinguish loop nests that traverse boundary tiles of the tiled iteration space from loop nests that traverse non-boundary tiles. We evaluate our method using as benchmarks typical linear algebra algorithms having non-rectangular iteration spaces. Results measured on both ALPHA 21064 and MIPS R10000 machines show that our method achieves speedups in the range of 1.11 to 5.96 over commercial compilers and preprocessors able to perform optimizing code transformations.
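The abstract does not give the algorithm itself, so the following is only a hand-written sketch of the general idea, not the authors' method. It assumes a lower-triangular matrix-vector product as the non-rectangular loop nest, a fixed 2x2 tile, and hypothetical function names (`trmv_reference`, `trmv_tiled_2x2`). Index set splitting appears as two separate loop nests: one for full, non-boundary tiles, whose body is fully unrolled so that operands can stay in scalars (standing in for registers), and one for boundary tiles that cross the diagonal or the matrix edge, which keep the original loop bounds. Python is used here only to show the loop structure; actual register reuse and ILP gains would require a compiled language.

```python
def trmv_reference(A, x):
    """Untiled triangular loop nest: y[i] = sum of A[i][j] * x[j] for j <= i."""
    n = len(x)
    y = [0] * n
    for i in range(n):
        for j in range(i + 1):  # inner bound depends on i: non-rectangular
            y[i] += A[i][j] * x[j]
    return y


def trmv_tiled_2x2(A, x):
    """Same computation, tiled 2x2 in both i and j (illustrative sketch only)."""
    T = 2  # tile size in each dimension (assumed for this example)
    n = len(x)
    y = [0] * n
    for ii in range(0, n, T):
        jj = 0
        # Non-boundary tiles: two full rows exist and both columns of the
        # tile lie on or below the diagonal for every row of the tile.
        if ii + T <= n:
            y0, y1 = y[ii], y[ii + 1]  # accumulators held in scalars
            while jj + T - 1 <= ii:
                x0, x1 = x[jj], x[jj + 1]  # each x value reused by both rows
                y0 += A[ii][jj] * x0
                y0 += A[ii][jj + 1] * x1
                y1 += A[ii + 1][jj] * x0
                y1 += A[ii + 1][jj + 1] * x1
                jj += T
            y[ii], y[ii + 1] = y0, y1
        # Boundary tiles: whatever remains (the diagonal tile, or a partial
        # last row when n is odd) runs with the original loop bounds.
        for i in range(ii, min(ii + T, n)):
            for j in range(jj, i + 1):
                y[i] += A[i][j] * x[j]
    return y
```

Because the split keeps the per-row summation order unchanged, both functions return identical results; only the unrolled nest has the fixed, branch-free body that a compiler could map onto registers.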
