Systems and Information Engineering Design Symposium

Exploring the use of adaptive gradient methods in effective deep learning systems



Abstract

Successful applications of Deep Learning have brought about breakthroughs in natural language understanding, speech recognition, and computer vision. One of the major challenges of designing powerful Deep Learning solutions for tasks such as image classification and text parsing, however, is the difficulty of training Deep Neural Networks (DNNs) properly. Recent research has raised serious doubts about the use of adaptive gradient methods, which have been popularized for running faster and requiring less parameter tuning than nonadaptive gradient methods. A recent study shows that adaptive gradient methods are worse than nonadaptive gradient methods in terms of training loss and test error. In this paper, we aim to revisit this problem, evaluating several nonadaptive and adaptive gradient methods including a recently-proposed adaptive gradient algorithm, AMSGrad, which seeks to solve some of the problems present in previous adaptive gradient methods. We focus on the benchmark MNIST optical character recognition task, one of the most widely-used in machine learning research, to investigate the differences in using adaptive gradient methods and nonadaptive gradient methods to train DNNs.
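For context on the algorithm the abstract highlights: AMSGrad (Reddi et al.) modifies Adam by keeping a running maximum of the second-moment estimate, which guarantees a non-increasing effective step size per coordinate and fixes a convergence flaw in Adam. The sketch below is a minimal NumPy illustration of that update rule on a toy quadratic, not code from the paper; the function name and hyperparameter defaults are illustrative.

```python
import numpy as np

def amsgrad_step(params, grads, state, lr=0.01,
                 beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update. The only change relative to Adam is that
    the denominator uses v_hat = max(v_hat, v) instead of v itself,
    so the per-coordinate learning rate can never grow back up."""
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grads        # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grads ** 2   # second-moment estimate
    v_hat = np.maximum(v_hat, v)               # AMSGrad: running maximum of v
    params = params - lr * m / (np.sqrt(v_hat) + eps)
    return params, (m, v, v_hat)

# Toy demonstration: minimize f(x) = x^2, whose gradient is 2x.
x = np.array([5.0])
state = (np.zeros_like(x), np.zeros_like(x), np.zeros_like(x))
for _ in range(5000):
    x, state = amsgrad_step(x, 2 * x, state)
print(x[0])  # converges toward the minimum at 0
```

An Adam implementation would differ only by dropping the `np.maximum` line and dividing by `np.sqrt(v)` directly, which is why AMSGrad is often described as a one-line fix.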
