IEEE/ACM International Conference on Computer-Aided Design

Tetris: Re-architecting Convolutional Neural Network Computation for Machine Learning Accelerators



Abstract

Inference efficiency is the predominant consideration in designing deep learning accelerators. Previous work mainly focuses on skipping zero values to eliminate the resulting ineffectual computation, while zero bits in non-zero values, another major source of ineffectual computation, are often ignored. The reason lies in the difficulty of extracting the essential bits during the multiply-and-accumulate (MAC) operation in each processing element. Based on the fact that zero bits account for as much as 68.9% of the overall weights in modern deep convolutional neural network models, this paper first proposes a weight kneading technique that simultaneously eliminates the ineffectual computation caused by both zero-value weights and zero bits in non-zero weights. In addition, a split-and-accumulate (SAC) computing pattern that replaces the conventional MAC, together with a corresponding hardware accelerator design called Tetris, is proposed to support weight kneading at the hardware level. Experimental results show that Tetris speeds up inference by up to 1.50x and improves power efficiency by up to 5.33x compared with state-of-the-art baselines.
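The split-and-accumulate idea behind the abstract can be illustrated with a minimal sketch (our own illustration, not the paper's hardware design; unsigned integer weights are assumed): each weight contributes one shifted addition per essential (set) bit, so zero bits and zero-valued weights incur no work, while the result still matches a conventional MAC.

```python
def essential_bits(w):
    """Return the positions of the set bits in an unsigned weight w."""
    bits = []
    pos = 0
    while w:
        if w & 1:
            bits.append(pos)
        w >>= 1
        pos += 1
    return bits

def sac_dot(activations, weights):
    """Split-and-accumulate dot product: one shift-add per
    essential bit of each weight. Zero bits and zero-valued
    weights contribute no operations at all."""
    acc = 0
    for a, w in zip(activations, weights):
        for b in essential_bits(w):
            acc += a << b
    return acc

# Equivalent to the conventional MAC result:
assert sac_dot([3, 5, 0], [6, 0, 9]) == 3 * 6 + 5 * 0 + 0 * 9  # 18
```

In this sketch the number of additions is proportional to the count of essential bits rather than to the fixed bit width, which is why the 68.9% fraction of zero weight bits cited above translates directly into avoidable work.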
