Hyper-AP: Enhancing Associative Processing Through A Full-Stack Optimization



Abstract

Associative processing (AP) is a promising PIM paradigm that overcomes the von Neumann bottleneck (memory wall) by virtue of a radically different execution model. By decomposing arbitrary computations into a sequence of primitive memory operations (i.e., search and write), AP's execution model supports concurrent SIMD computations in-situ in the memory array to eliminate the need for data movement. This execution model also provides native support for flexible data types and requires only minimal modifications to existing memory designs (low hardware complexity). Despite these advantages, the execution model of AP has two limitations that substantially increase the execution time: 1) it can only search a single pattern in one search operation, and 2) it needs to perform a write operation after each search operation. In this paper, we propose the Highly Performant Associative Processor (Hyper-AP) to fully address the aforementioned limitations. The core of Hyper-AP is an enhanced execution model that reduces the number of search and write operations needed for computations, thereby reducing the execution time. This execution model is generic and improves the performance of both CMOS-based and RRAM-based AP, but it is more beneficial for RRAM-based AP due to the substantially reduced write operations. We then provide a complete architecture and micro-architecture with several optimizations to efficiently implement Hyper-AP. In order to reduce the programming complexity, we also develop a compilation framework so that users can write C-like programs with several constraints to run applications on Hyper-AP. Several optimizations have been applied in the compilation process to exploit the unique properties of Hyper-AP. Our experimental results show that, compared with the recent work IMP, Hyper-AP achieves up to 54×/4.4× better power-/area-efficiency for various representative arithmetic operations. For the evaluated benchmarks, Hyper-AP achieves 3.3× speedup and 23.8× energy reduction on average compared with IMP. Our evaluation also confirms that the proposed execution model is more beneficial for the RRAM-based AP than its CMOS-based counterpart.
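The baseline execution model that the abstract describes (one pattern per search, one write after each search) can be illustrated with a small software sketch. The code below is not taken from the paper and does not model Hyper-AP's enhanced execution model. It assumes the commonly described truth-table scheme for in-place addition on a conventional associative processor, and every name in it (`search`, `write`, `PASSES`, `ap_add`) is hypothetical.

```python
# Sketch of a conventional associative processor (AP), assuming the
# textbook truth-table scheme. Each memory row holds one operand pair,
# so every search/write acts on all rows at once (SIMD in the array).

def search(rows, pattern):
    """Search primitive: compare ONE pattern against every row; return tag bits."""
    return [all(row[col] == bit for col, bit in pattern.items()) for row in rows]

def write(rows, tags, values):
    """Write primitive: store `values` into every tagged row."""
    for row, tagged in zip(rows, tags):
        if tagged:
            row.update(values)

# Truth-table entries of a full adder that change state, for in-place
# b := a + b. Format: (carry, a, b) -> (new carry, new b). The order is
# chosen so that a row rewritten by one pass never matches a later pass.
PASSES = [
    ((0, 1, 1), (1, 0)),
    ((0, 1, 0), (0, 1)),
    ((1, 0, 0), (0, 1)),
    ((1, 0, 1), (1, 0)),
]

def ap_add(a_values, b_values, width):
    """Add two vectors bit-serially; return (sums, number of primitive ops)."""
    rows = []
    for a, b in zip(a_values, b_values):
        row = {"c": 0}  # carry column, cleared before the addition
        for i in range(width):
            row[f"a{i}"] = (a >> i) & 1
            row[f"b{i}"] = (b >> i) & 1
        rows.append(row)

    ops = 0
    for i in range(width):  # least significant bit first
        for (c, a, b), (c_new, b_new) in PASSES:
            tags = search(rows, {"c": c, f"a{i}": a, f"b{i}": b})
            write(rows, tags, {"c": c_new, f"b{i}": b_new})
            ops += 2  # one search followed by one write

    sums = [
        sum(row[f"b{i}"] << i for i in range(width)) + (row["c"] << width)
        for row in rows
    ]
    return sums, ops

if __name__ == "__main__":
    print(ap_add([3, 5, 15], [1, 6, 1], width=4))
```

In this sketch the operation count depends only on the bit width, not on the number of rows: a 4-bit addition costs 4 bits × 4 passes × 2 primitives = 32 operations whether the array holds three operand pairs or thousands. The four search/write pairs per bit are the overhead that the two limitations named in the abstract refer to.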


Bibliographic record

Source
ACM/IEEE Annual International Symposium on Computer Architecture | 2020 | pp. 846-859 | 14 pages
Authors
Yue Zha; Jing Li
Original format: PDF

