ACM/IEEE Annual International Symposium on Computer Architecture

Hyper-AP: Enhancing Associative Processing Through a Full-Stack Optimization


Abstract

Associative processing (AP) is a promising processing-in-memory (PIM) paradigm that overcomes the von Neumann bottleneck (memory wall) by virtue of a radically different execution model. By decomposing arbitrary computations into a sequence of primitive memory operations (i.e., search and write), AP's execution model supports concurrent SIMD computations in situ in the memory array, eliminating the need for data movement. This execution model also provides native support for flexible data types and requires only a minimal modification to the existing memory design (low hardware complexity). Despite these advantages, the execution model of AP has two limitations that substantially increase the execution time: 1) it can search for only a single pattern in one search operation, and 2) it needs to perform a write operation after each search operation. In this paper, we propose the Highly Performant Associative Processor (Hyper-AP) to fully address these limitations. The core of Hyper-AP is an enhanced execution model that reduces the number of search and write operations needed for computations, thereby reducing the execution time. This execution model is generic and improves the performance of both CMOS-based and RRAM-based AP, but it is more beneficial for RRAM-based AP because of the substantially reduced number of write operations. We then provide a complete architecture and micro-architecture with several optimizations to implement Hyper-AP efficiently. To reduce the programming complexity, we also develop a compilation framework so that users can write C-like programs, subject to several constraints, to run applications on Hyper-AP. Several optimizations are applied in the compilation process to exploit the unique properties of Hyper-AP. Our experimental results show that, compared with the recent work IMP, Hyper-AP achieves up to 54×/4.4× better power-/area-efficiency for various representative arithmetic operations. For the evaluated benchmarks, Hyper-AP achieves a 3.3× speedup and a 23.8× energy reduction on average compared with IMP. Our evaluation also confirms that the proposed execution model is more beneficial for RRAM-based AP than for its CMOS-based counterpart.
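
To make the baseline execution model concrete, the following is a minimal software sketch of how a bit-serial associative processor might add two vectors using only search and write. It illustrates the conventional model whose two limitations the abstract describes (one pattern per search, one write after each search); it is not the enhanced Hyper-AP model, whose details the abstract does not give. All names (`search`, `write`, `PASSES`, `ap_add`) and the row layout are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]  # one memory row: column name -> stored bit


def search(rows: List[Row], pattern: Dict[str, int]) -> List[bool]:
    """Compare all rows against ONE pattern; return one tag bit per row."""
    return [all(row[col] == bit for col, bit in pattern.items()) for row in rows]


def write(rows: List[Row], tags: List[bool], values: Dict[str, int]) -> None:
    """Write the given column values into every tagged row."""
    for row, tagged in zip(rows, tags):
        if tagged:
            row.update(values)


# Full-adder truth-table entries that change the stored state, as
# (carry, a, b) -> (new carry, new b). The order is chosen for this sketch so
# that a row rewritten by one pass cannot be matched again by a later pass.
PASSES = [
    ((0, 1, 1), (1, 0)),
    ((0, 1, 0), (0, 1)),
    ((1, 0, 0), (0, 1)),
    ((1, 0, 1), (1, 0)),
]


def ap_add(a_vals: List[int], b_vals: List[int], width: int) -> Tuple[List[int], int, int]:
    """Compute b = (a + b) mod 2**width for every row; return results and op counts."""
    rows: List[Row] = []
    for a, b in zip(a_vals, b_vals):
        row: Row = {"c": 0}  # carry column, cleared before the addition
        for i in range(width):
            row[f"a{i}"] = (a >> i) & 1
            row[f"b{i}"] = (b >> i) & 1
        rows.append(row)

    n_search = n_write = 0
    for i in range(width):  # bit-serial: least significant bit first
        for (c, a, b), (c_out, b_out) in PASSES:
            tags = search(rows, {"c": c, f"a{i}": a, f"b{i}": b})
            n_search += 1
            write(rows, tags, {"c": c_out, f"b{i}": b_out})  # write follows every search
            n_write += 1

    results = [sum(row[f"b{i}"] << i for i in range(width)) for row in rows]
    return results, n_search, n_write


sums, n_search, n_write = ap_add([3, 5, 7, 15], [1, 2, 9, 1], width=4)
print(sums, n_search, n_write)
```

In this sketch the number of search/write pairs depends only on the bit width (four pairs per bit), not on the number of rows, which is where the SIMD parallelism comes from. It also shows why fewer searches and writes per computation would translate directly into shorter execution time.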
