
Window Memory Layout Scheme for Alternate Row-Wise/Column-Wise Matrix Access



Abstract

The effective bandwidth of dynamic random-access memory (DRAM) is very low in the alternate row-wise/column-wise matrix access (AR/CMA) mode, which is a basic characteristic of scientific and engineering applications. We therefore propose the window memory layout scheme (WMLS), a matrix layout scheme for AR/CMA applications that does not require transposition. The scheme maps one row of a logical matrix into a rectangular memory window of the DRAM to balance the bandwidth of row-wise and column-wise matrix access and to increase the DRAM I/O bandwidth. The optimal window configuration is analyzed theoretically to minimize the total number of no-data-visit operations of the DRAM. Different WMLS implementations are presented according to the memory structures of field-programmable gate array (FPGA), CPU, and GPU platforms. Experimental results show that the proposed WMLS significantly improves DRAM bandwidth for AR/CMA applications: speedup factors of 1.6× and 2.0× are achieved on the general-purpose CPU and GPU platforms, respectively. For the FPGA platform, the WMLS DRAM controller is custom-designed. The maximum bandwidth for the AR/CMA mode reaches 5.94 GB/s, a 73.6% improvement over the traditional row-wise access mode. Finally, we apply WMLS to a Chirp Scaling SAR application; compared with the traditional access approach, maximum speedup factors of 4.73×, 1.33×, and 1.56× are achieved on the FPGA, CPU, and GPU platforms, respectively.
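
The idea of mapping one logical row into a rectangular DRAM window can be illustrated with a small address-mapping sketch. The abstract does not give the actual mapping, so everything below is an assumption made for illustration: the function names, the parameters `win_w` (window width in elements) and `page_elems` (elements per DRAM row), and the choice to place windows side by side within a band of DRAM rows are all hypothetical. The number of DRAM row changes along an access sequence is used as a rough proxy for row-activation overhead.

```python
def wmls_address(r, c, n_cols, win_w, page_elems):
    """Map logical element (r, c) to (dram_row, dram_col).

    Assumed layout (not taken from the paper): each logical row of n_cols
    elements fills a window of win_h x win_w elements, and windows are
    placed side by side inside a band of win_h DRAM rows.
    """
    if n_cols % win_w or page_elems % win_w:
        raise ValueError("win_w must divide both n_cols and page_elems")
    win_h = n_cols // win_w               # DRAM rows spanned by one window
    wins_per_band = page_elems // win_w   # windows sharing one DRAM row
    band, slot = divmod(r, wins_per_band)  # which band, which window in it
    wr, wc = divmod(c, win_w)              # position inside the window
    return band * win_h + wr, slot * win_w + wc


def row_major_address(r, c, n_cols, page_elems):
    """Conventional row-major layout, for comparison."""
    return divmod(r * n_cols + c, page_elems)


def count_row_switches(addresses):
    """Count how often the DRAM row changes along an access sequence."""
    switches = 0
    prev = None
    for dram_row, _ in addresses:
        if dram_row != prev:
            switches += 1
            prev = dram_row
    return switches


if __name__ == "__main__":
    n_rows, n_cols, page_elems, win_w = 64, 64, 64, 8
    row_pass = [wmls_address(0, c, n_cols, win_w, page_elems) for c in range(n_cols)]
    col_pass = [wmls_address(r, 0, n_cols, win_w, page_elems) for r in range(n_rows)]
    print("window layout, row pass:   ", count_row_switches(row_pass))
    print("window layout, column pass:", count_row_switches(col_pass))
    rm_row = [row_major_address(0, c, n_cols, page_elems) for c in range(n_cols)]
    rm_col = [row_major_address(r, 0, n_cols, page_elems) for r in range(n_rows)]
    print("row-major, row pass:       ", count_row_switches(rm_row))
    print("row-major, column pass:    ", count_row_switches(rm_col))
```

For this toy 64×64 configuration, the window layout touches 8 DRAM rows for both a row pass and a column pass, whereas the row-major layout touches 1 DRAM row for a row pass and 64 for a column pass. This is the kind of balance between the two access directions that the abstract describes, although the paper's actual window configuration and controller behavior may differ.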

Bibliographic information

  • Source
    IEICE Transactions on Information and Systems, 2013, No. 12, pp. 2765-2775 (11 pages)
  • Author affiliations

    National Laboratory for Parallel and Distribution Processing, National University of Defense Technology, Changsha, 410073, P. R. China;


  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    window memory layout scheme (WMLS); alternate row-wise/column-wise matrix access; SDRAM; GPU; FPGA;
