Conference: Design, Automation & Test in Europe Conference & Exhibition

Prometheus: Processing-in-memory heterogeneous architecture design from a multi-layer network theoretic strategy

Abstract

With increasing demand for distributed intelligent physical systems performing big data analytics in the field and in real time, processing-in-memory (PIM) architectures integrating 3D-stacked memory and logic layers could provide higher performance and energy efficiency. To this end, PIM design requires principled and rigorous optimization strategies to identify interactions and manage data movement across different vaults. In this paper, we introduce Prometheus, a novel PIM-based framework that constructs a comprehensive model of computation and communication (MoCC) based on a static and dynamic compilation of an application. First, by adopting the low level virtual machine (LLVM) intermediate representation (IR), an input application is modeled as a two-layered graph consisting of (i) a computation layer in which nodes denote computation IR instructions and edges denote data dependencies among instructions, and (ii) a communication layer in which nodes denote memory operations (e.g., load/store) and edges represent memory dependencies detected by alias analysis. Second, we develop an optimization framework that partitions the multi-layer network into processing communities within which the computational workload is maximized while balancing the load among computational clusters. Third, we propose a community-to-vault mapping algorithm for designing a scalable hybrid memory cube (HMC)-based system where vaults are interconnected through a network-on-chip (NoC) rather than a crossbar architecture. This ensures scalability to hundreds of vaults in each cube. Experimental results demonstrate that Prometheus, consisting of 64 HMC-based vaults, improves system performance by 9.8x and achieves a 2.3x energy reduction compared to conventional systems.
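The two-layered graph described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy instruction tuples, the `build_two_layer_graph` function, and the `cross_layer` edge list are all hypothetical, and real alias analysis is replaced by the simplifying assumption that two memory operations depend on each other when they touch the same symbolic address and at least one of them is a store.

```python
# Hypothetical toy IR: (name, opcode, operand names, symbolic address or None).
TOY_IR = [
    ("a", "load", [], "x"),
    ("b", "load", [], "y"),
    ("c", "add", ["a", "b"], None),
    ("d", "store", ["c"], "x"),
    ("e", "mul", ["c", "c"], None),
]

MEMORY_OPS = {"load", "store"}


def build_two_layer_graph(ir):
    """Split a toy IR listing into computation and communication layers."""
    opcode_of = {name: opcode for name, opcode, _, _ in ir}
    layers = {
        "computation": {"nodes": [], "edges": []},
        "communication": {"nodes": [], "edges": []},
        # Assumption: data dependencies that link the two layers are kept apart.
        "cross_layer": [],
    }
    earlier_accesses = {}  # address -> [(name, opcode)] seen so far

    for name, opcode, operands, address in ir:
        is_mem = opcode in MEMORY_OPS
        layer = "communication" if is_mem else "computation"
        layers[layer]["nodes"].append(name)

        # Data dependencies run from producer to consumer (operands deduplicated).
        for src in dict.fromkeys(operands):
            if not is_mem and opcode_of[src] not in MEMORY_OPS:
                layers["computation"]["edges"].append((src, name))
            else:
                layers["cross_layer"].append((src, name))

        # Stand-in for alias analysis: same address, at least one store.
        if is_mem:
            for prev_name, prev_opcode in earlier_accesses.get(address, []):
                if "store" in (prev_opcode, opcode):
                    layers["communication"]["edges"].append((prev_name, name))
            earlier_accesses.setdefault(address, []).append((name, opcode))

    return layers


if __name__ == "__main__":
    graph = build_two_layer_graph(TOY_IR)
    for layer_name, content in graph.items():
        print(layer_name, content)
```

In this sketch the load `a` and the store `d` both touch address `x`, so they are joined by an edge in the communication layer, while `c` and `e` form the computation layer. A partitioning step such as the one the abstract describes would then operate on these layers together.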
