ACM/ESDA/IEEE Design Automation Conference

SpWA: An Efficient Sparse Winograd Convolutional Neural Networks Accelerator on FPGAs

Abstract

FPGAs have been efficient accelerators for CNN inference due to their high performance, flexibility, and energy efficiency. To improve the performance of CNNs on FPGAs, fast algorithms and sparse methods have emerged as the most attractive alternatives, both of which can effectively reduce the complexity of CNNs. With fast algorithms, the feature maps are transformed into a special domain to reduce the arithmetic complexity. On the other hand, compressing CNN models by pruning unimportant connections reduces both storage and arithmetic complexity. In this paper, we introduce the sparse Winograd convolution accelerator (SpWA), which combines these two orthogonal approaches on FPGAs. First, we employ a novel dataflow that rearranges the filter layout in Winograd convolution. Then we design an efficient architecture to implement SpWA using a line-buffer design and a Compressed Sparse Column (CSC) format-based processing element. Finally, we propose an efficient algorithm based on dynamic programming to balance the computation among different processing elements. Experimental results on the VGG16 and YOLO networks show a 2.9x-3.1x speedup compared with the state-of-the-art technique.
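As a rough illustration of the two ideas the abstract combines (transforming each input tile into the Winograd domain, then multiplying it only against the nonzeros of a pruned filter stored in a CSC-like layout), below is a minimal NumPy sketch. It assumes the standard Winograd F(2x2, 3x3) transform matrices; the pruning threshold and helper names (winograd_tile, to_csc, sparse_eltwise) are hypothetical, and the sketch does not reproduce the paper's dataflow, line-buffer architecture, or dynamic-programming load balancing.

```python
# Minimal sketch: Winograd F(2x2, 3x3) convolution on one tile, plus a
# CSC-style sparse element-wise multiply for a pruned Winograd-domain filter.
# Illustrative only; not the SpWA hardware design.
import numpy as np

# Standard F(2x2, 3x3) transform matrices.
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=np.float32)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float32)
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=np.float32)

def winograd_tile(d, g):
    """2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    U = G @ g @ G.T          # filter in the Winograd domain (4x4)
    V = B_T @ d @ B_T.T      # input tile in the Winograd domain (4x4)
    M = U * V                # 16 element-wise multiplies instead of 36 MACs
    return A_T @ M @ A_T.T   # inverse transform to the 2x2 output tile

def to_csc(U, threshold=0.05):
    """Prune small Winograd-domain weights, store nonzeros column by column."""
    U = np.where(np.abs(U) < threshold, 0.0, U)
    values, row_idx, col_ptr = [], [], [0]
    for c in range(U.shape[1]):
        nz = np.nonzero(U[:, c])[0]
        values.extend(U[nz, c])
        row_idx.extend(nz)
        col_ptr.append(len(values))
    return np.array(values), np.array(row_idx), np.array(col_ptr)

def sparse_eltwise(values, row_idx, col_ptr, V):
    """Element-wise multiply that only visits the stored nonzeros."""
    M = np.zeros_like(V)
    for c in range(V.shape[1]):
        for k in range(col_ptr[c], col_ptr[c + 1]):
            M[row_idx[k], c] = values[k] * V[row_idx[k], c]
    return M

# Sanity check against direct correlation on one tile.
d = np.random.randn(4, 4).astype(np.float32)
g = np.random.randn(3, 3).astype(np.float32)
ref = np.array([[np.sum(d[i:i+3, j:j+3] * g) for j in range(2)]
                for i in range(2)])
assert np.allclose(winograd_tile(d, g), ref, atol=1e-4)

# With a zero threshold the CSC path keeps every weight and matches exactly.
U, V = G @ g @ G.T, B_T @ d @ B_T.T
vals, rows, ptrs = to_csc(U, threshold=0.0)
assert np.allclose(A_T @ sparse_eltwise(vals, rows, ptrs, V) @ A_T.T, ref, atol=1e-4)
```

In the accelerator described by the abstract, the element-wise multiply is the step that benefits from sparsity: once the transformed filters are pruned, only their nonzero entries need to be stored and multiplied, which is what the CSC-based processing elements exploit.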
