JMLR: Workshop and Conference Proceedings

Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks


Abstract

Modern convolutional networks, incorporating rectifiers and max-pooling, are neither smooth nor convex; standard guarantees therefore do not apply. Nevertheless, methods from convex optimization such as gradient descent and Adam are widely used as building blocks for deep learning algorithms. This paper provides the first convergence guarantee applicable to modern convnets, which furthermore matches a lower bound for convex nonsmooth functions. The key technical tool is the neural Taylor approximation – a straightforward application of Taylor expansions to neural networks – and the associated Taylor loss. Experiments on a range of optimizers, layers, and tasks provide evidence that the analysis accurately captures the dynamics of neural optimization. The second half of the paper applies the Taylor approximation to isolate the main difficulty in training rectifier nets – that gradients are shattered – and investigates the hypothesis that, by exploring the space of activation configurations more thoroughly, adaptive optimizers such as RMSProp and Adam are able to converge to better solutions.
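
For orientation, a minimal sketch of the underlying idea (the notation here, an objective ℓ and current parameters θ_t, is ours rather than quoted from the paper): the first-order Taylor expansion of ℓ around θ_t is

    \ell_{\mathrm{Taylor}}(\theta) \;=\; \ell(\theta_t) \;+\; \nabla \ell(\theta_t)^{\top} (\theta - \theta_t),

which is linear, hence convex, in θ. The Taylor loss named in the abstract is built from this kind of expansion applied to the network, which is presumably what allows lower bounds for convex nonsmooth functions to enter the analysis of rectifier nets.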
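The "space of activation configurations" mentioned above refers to the binary pattern of which rectifier units are active on a given input. As a purely illustrative NumPy sketch (the two-layer weights and the random perturbation standing in for an optimizer step are made up, not taken from the paper), one way to quantify how thoroughly a training run explores this space is to count the distinct patterns it visits:

    import numpy as np

    def relu_activation_pattern(x, weights, biases):
        """Concatenated 0/1 pattern of which ReLU units fire for one input."""
        pattern = []
        h = x
        for W, b in zip(weights, biases):
            pre = W @ h + b                      # pre-activation of this layer
            pattern.append((pre > 0).astype(np.int8))
            h = np.maximum(pre, 0.0)             # ReLU
        return np.concatenate(pattern)

    rng = np.random.default_rng(0)
    x = rng.normal(size=4)                                   # one fixed input
    weights = [rng.normal(size=(8, 4)), rng.normal(size=(3, 8))]
    biases = [np.zeros(8), np.zeros(3)]

    visited = set()
    for step in range(100):
        visited.add(relu_activation_pattern(x, weights, biases).tobytes())
        # Stand-in for an optimizer update: a small random weight perturbation.
        weights = [W + 0.05 * rng.normal(size=W.shape) for W in weights]

    print(f"distinct activation configurations visited: {len(visited)}")

Under the hypothesis investigated in the paper, an adaptive optimizer such as RMSProp or Adam would visit more of these configurations than plain gradient descent before settling on one.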
