Conference: ACM/ESDA/IEEE Design Automation Conference

Locality Aware Memory Assignment and Tiling


Abstract

With the trend toward specialization, an efficient memory-path design is vital to capitalize on customization in the data path. A monolithic memory hierarchy is often highly inefficient for irregular applications, which have traditionally been targeted at CPUs. New approaches and tools are required to offer application-specific memory customization that combines the benefits of cache and scratchpad memory. This paper introduces a novel approach for automated application-specific on-chip memory assignment and tiling. The approach offers two major tools: (1) static memory access analysis and (2) variable-level memory assignment. Static memory access analysis operates at the LLVM abstraction level: it extracts target-independent pointer behaviors, measures access strides, and analyzes the prefetchability of variables. Variable-level memory assignment creates a memory allocation graph for memory assignment (cache vs. scratchpad) based on each variable's size and estimated locality. It also explores opportunities for tiling memory accesses. For exploration and results, this paper uses the MachSuite benchmarks (with both regular and irregular memory access behaviors) and the gem5-Aladdin tool for performance and power evaluation. The proposed approach optimizes the memory hierarchy by automatically combining the benefits of cache and (tiled) scratchpad at variable-level granularity for each individual application. The results demonstrate more than 45% improvement in the power-stall product, on average, over a monolithic cache or scratchpad design.
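The abstract does not spell out the analysis or the assignment rule, so the following is only a minimal sketch of one plausible reading: estimate how regular a variable's access strides are, then send irregular variables to the cache, small regular ones to the scratchpad, and large regular ones to a tiled scratchpad. The names `VariableProfile`, `dominant_stride`, and `assign_memory`, the address-trace input, and both thresholds are hypothetical and do not come from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VariableProfile:
    """Hypothetical per-variable summary; not the paper's data structure."""
    name: str
    size_bytes: int
    addresses: List[int]  # byte offsets touched, in program order


def dominant_stride(addresses: List[int]) -> Tuple[int, float]:
    """Return the most common stride and the fraction of accesses that follow it."""
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    if not deltas:
        return 0, 1.0
    stride, count = Counter(deltas).most_common(1)[0]
    return stride, count / len(deltas)


def assign_memory(var: VariableProfile,
                  spm_capacity: int = 4096,        # assumed scratchpad size in bytes
                  regular_threshold: float = 0.9   # assumed regularity cutoff
                  ) -> str:
    """Pick a memory type from stride regularity and variable size."""
    _, regularity = dominant_stride(var.addresses)
    if regularity < regular_threshold:
        return "cache"             # irregular accesses: hard to prefetch
    if var.size_bytes <= spm_capacity:
        return "scratchpad"        # regular and small enough to fit whole
    return "tiled-scratchpad"      # regular but too large: stage it tile by tile


small = VariableProfile("a", 1024, list(range(0, 1024, 4)))
large = VariableProfile("b", 65536, list(range(0, 65536, 4)))
irregular = VariableProfile("c", 2048, [0, 400, 8, 1200, 64, 900])

for v in (small, large, irregular):
    print(v.name, assign_memory(v))
```

A real implementation would derive strides statically from the LLVM representation rather than from a recorded address trace; the trace is used here only to keep the example self-contained.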
