Computational Intelligence and Neuroscience

A Novel Memory-Scheduling Strategy for Large Convolutional Neural Network on Memory-Limited Devices


Abstract

Recently, machine learning, and deep learning in particular, has become a core technique widely used in fields such as natural language processing, speech recognition, and object recognition. At the same time, more and more applications are moving to wearable and mobile devices. However, traditional deep learning models such as the convolutional neural network (CNN) and its variants consume large amounts of memory, which makes these powerful methods difficult to apply on memory-limited mobile platforms. To solve this problem, we present a novel memory-management strategy called mmCNN. With its help, a trained large CNN can be deployed on a platform of any memory size, such as a GPU, an FPGA, or a memory-limited mobile device. In our experiments, we run a feed-forward CNN pass under extremely small memory budgets (as low as 5 MB) on a GPU platform. The results show that our method saves more than 98% of memory compared to a traditional CNN implementation, and more than 90% compared to the state-of-the-art related work vDNN (virtualized deep neural networks). This work improves the computing scalability of lightweight applications and breaks the memory bottleneck of applying deep learning on memory-limited devices.
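The abstract does not detail how mmCNN schedules memory, so the following is only a minimal sketch of the general idea behind such schedulers (shared with approaches like vDNN): keep just the currently executing layer's weights and activations resident on the device, and leave everything else in host storage. All names (`forward_layerwise`, `layer_bytes`) and the tiny dense-layer model are illustrative assumptions, not the paper's implementation.

```python
# Sketch of layer-wise memory scheduling for a feed-forward pass.
# Only one layer's parameters are treated as "device-resident" at a time;
# the peak resident size is therefore bounded by the largest single layer,
# not by the whole network. This is an assumed simplification, not mmCNN itself.

FLOAT_BYTES = 4  # assume fp32 parameters


def matvec_relu(W, b, x):
    """One dense layer with ReLU, on plain Python lists."""
    return [max(sum(w * xj for w, xj in zip(row, x)) + bi, 0.0)
            for row, bi in zip(W, b)]


def layer_bytes(W, b):
    """Bytes this layer's parameters would occupy on the device."""
    return (len(W) * len(W[0]) + len(b)) * FLOAT_BYTES


def forward_layerwise(layers, x):
    """Run the network layer by layer, tracking peak device-resident bytes."""
    peak = 0
    for W, b in layers:
        # "Fetch" only this layer plus the current activation, then release.
        resident = layer_bytes(W, b) + len(x) * FLOAT_BYTES
        peak = max(peak, resident)
        x = matvec_relu(W, b, x)
    return x, peak


# Five identical 8x8 layers: total weights are 5x one layer,
# but the scheduler only ever holds one layer at a time.
layers = [([[0.1] * 8 for _ in range(8)], [0.0] * 8) for _ in range(5)]
out, peak = forward_layerwise(layers, [1.0] * 8)
total = sum(layer_bytes(W, b) for W, b in layers)
print(peak, total)  # peak stays well below the total parameter footprint
```

In a real deployment the "fetch" would be an asynchronous host-to-device copy overlapped with computation; the point of the sketch is only that peak residency scales with the largest layer rather than the full model, which is what enables running a large CNN in a budget as small as a few megabytes.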
