Journal: IEEE Transactions on Circuits and Systems I: Regular Papers

A Low-Power Deep Neural Network Online Learning Processor for Real-Time Object Tracking Application

Abstract

A deep neural network (DNN) online learning processor with high throughput and low power consumption is proposed to achieve real-time object tracking on mobile devices. Four key features enable low-power DNN online learning. First, the processor is designed with a unified core architecture and achieves 1.33× higher throughput than the previous state-of-the-art DNN learning processor. Second, two new algorithms, binary feedback alignment (BFA) and dynamic fixed-point-based run-length compression (RLC), reduce power consumption by reducing external memory accesses (EMA): BFA and dynamic fixed-point-based RLC reduce the EMA by 11.4% and 32.5%, respectively. Third, new data feeding units, including an integral RLC (iRLC) decoder and a transpose RLC (tRLC) decoder, are co-designed with the proposed algorithms to maximize throughput. Finally, a dropout controller uses the proposed dynamic clock-gating scheme to reduce redundant power consumption in the unified core and the data feeding architecture. This enables the processor to perform DNN online learning with 38.1% lower power consumption. Implemented in 65 nm CMOS technology, the 3.52 mm² processor consumes 126 mW and achieves a throughput of 30.4 frames per second in the object tracking application.
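The abstract names binary feedback alignment (BFA) without describing it. As a rough illustration only, the sketch below assumes that BFA follows ordinary feedback alignment, in which the backward pass uses a fixed random feedback matrix instead of the transposed forward weights, and that "binary" means each feedback weight is restricted to +1 or -1. Under that assumption each feedback weight needs one bit, which is one plausible way such a scheme could cut external memory traffic. The function names `make_binary_feedback` and `propagate_error` are hypothetical and do not come from the paper.

```python
import random


def make_binary_feedback(n_hidden, n_out, seed=0):
    """Build a fixed random feedback matrix whose entries are +1 or -1.

    Assumption: this stands in for the transposed forward weights during
    the backward pass, as in feedback alignment.
    """
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n_out)]
            for _ in range(n_hidden)]


def propagate_error(feedback_signs, output_error):
    """Send the output-layer error back to the hidden layer.

    Every feedback weight is +1 or -1, so each term is an add or a
    subtract and no multiplication is needed.
    """
    hidden_error = []
    for row in feedback_signs:
        acc = 0.0
        for sign, err in zip(row, output_error):
            acc += err if sign > 0 else -err
        hidden_error.append(acc)
    return hidden_error
```

For example, with feedback signs `[[1, -1], [-1, -1]]` and an output error of `[0.5, 0.25]`, `propagate_error` returns `[0.25, -0.75]`.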
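The abstract also credits dynamic fixed-point-based run-length compression (RLC) with a 32.5% reduction in EMA but gives no encoding details. The sketch below shows one generic way the two ideas can be combined, under stated assumptions: values are quantized to integers with a single shared fractional length chosen from the largest magnitude, and runs of zeros are then stored as (zero run, value) pairs. The 8-bit word width, the pair layout, and all function names are illustrative assumptions, not the paper's format.

```python
import math


def quantize_dynamic_fixed_point(values, word_bits=8):
    """Quantize floats to signed integers that share one fractional length.

    The fractional length is chosen so that the largest magnitude still
    fits in the signed word (assumed 8 bits here).
    """
    limit = 2 ** (word_bits - 1) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    frac_bits = 0
    if max_abs > 0:
        frac_bits = math.floor(math.log2(limit / max_abs))
    scale = 2.0 ** frac_bits
    quantized = [max(-limit, min(limit, round(v * scale))) for v in values]
    return quantized, frac_bits


def rlc_encode(quantized):
    """Encode as (zero_run, value) pairs plus the original length.

    Trailing zeros are not stored; the decoder restores them from the length.
    """
    pairs = []
    run = 0
    for q in quantized:
        if q == 0:
            run += 1
        else:
            pairs.append((run, q))
            run = 0
    return pairs, len(quantized)


def rlc_decode(pairs, length):
    """Rebuild the quantized sequence from (zero_run, value) pairs."""
    out = []
    for run, value in pairs:
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * (length - len(out)))
    return out
```

With the input `[0.0, 0.0, 0.5, 0.0, 0.0, 0.0, -0.25, 0.0]`, quantization gives `[0, 0, 64, 0, 0, 0, -32, 0]` with 7 fractional bits, and the encoder stores only the pairs `(2, 64)` and `(3, -32)`. The sparser the data, the fewer pairs have to be stored.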
