Conference: ACM/ESDA/IEEE Design Automation Conference

Locality Aware Memory Assignment and Tiling

Abstract

With the trend toward specialization, an efficient memory-path design is vital to capitalize on customization in the data path. A monolithic memory hierarchy is often highly inefficient for irregular applications, which have traditionally targeted CPUs. New approaches and tools are required to offer application-specific memory customization that combines the benefits of cache and scratchpad memory. This paper introduces a novel approach for automated application-specific on-chip memory assignment and tiling. The approach offers two major tools: (1) static memory access analysis and (2) variable-level memory assignment. (1) Static memory access analysis operates at the LLVM abstraction level. It extracts target-independent pointer behaviors, measures access strides, and analyzes the prefetchability of variables. (2) Variable-level memory assignment creates a memory allocation graph for memory assignment (cache vs. scratchpad) based on each variable's size and estimated locality. It also explores opportunities for tiling memory accesses. For exploration and results, this paper uses the MachSuite benchmarks (with both regular and irregular memory access behaviors) and the gem5-Aladdin tool for performance and power evaluation. The proposed approach optimizes the memory hierarchy by automatically combining the benefits of cache and (tiled) scratchpad at variable-level granularity for each application. The results demonstrate an average improvement of more than 45% in our power-stall product over a monolithic cache or scratchpad design.
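The abstract does not spell out the analysis or the assignment rules, so the following is only a minimal sketch of the general idea under stated assumptions. It stands in for the static LLVM-level analysis with a plain list of byte addresses per variable, estimates regularity from the dominant access stride, and then applies one plausible placement policy: irregular variables go to the cache, regular ones go to the scratchpad, and regular variables too large to fit are marked for tiling. The names `Variable`, `dominant_stride`, and `assign`, as well as the capacity and threshold values, are hypothetical and do not come from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Variable:
    name: str
    size_bytes: int
    addresses: list  # byte addresses in access order (hypothetical input)


def dominant_stride(addresses):
    """Return (stride, fraction of consecutive accesses that use that stride)."""
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    if not deltas:
        # Zero or one access: treat as trivially regular.
        return 0, 1.0
    stride, count = Counter(deltas).most_common(1)[0]
    return stride, count / len(deltas)


def assign(var, spm_capacity=4096, regular_threshold=0.9):
    """Assumed policy, not the paper's: pick a memory type for one variable."""
    _, regularity = dominant_stride(var.addresses)
    if regularity < regular_threshold:
        return "cache"             # irregular accesses: rely on the cache
    if var.size_bytes <= spm_capacity:
        return "scratchpad"        # regular and small enough to fit
    return "tiled-scratchpad"      # regular but too large: bring in tile by tile


if __name__ == "__main__":
    variables = [
        Variable("stream", 1024, list(range(0, 1024, 4))),
        Variable("big_stream", 65536, list(range(0, 65536, 8))),
        Variable("pointer_chase", 1024, [0, 400, 12, 880, 64, 256]),
    ]
    for v in variables:
        print(v.name, dominant_stride(v.addresses), assign(v))
```

A constant stride is used here as a simple proxy for prefetchability; an analysis working on LLVM IR would derive strides from pointer arithmetic in loops rather than from observed addresses.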