Annual Conference on Information Sciences and Systems

MSE-Optimal Neural Network Initialization via Layer Fusion



Abstract

Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. However, the use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problems renders parameter learning susceptible to initialization. To address this issue, a variety of methods that rely on random parameter initialization or knowledge distillation have been proposed in the past. In this paper, we propose FuseInit, a novel method to initialize shallower networks by fusing neighboring layers of deeper networks that are trained with random initialization. We develop theoretical results and efficient algorithms for mean-square error (MSE)-optimal fusion of neighboring dense-dense, convolutional-dense, and convolutional-convolutional layers. We show experiments for a range of classification and regression datasets, which suggest that deeper neural networks are less sensitive to initialization and that shallower networks can perform better (sometimes as well as their deeper counterparts) if initialized with FuseInit.
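To illustrate the idea of MSE-optimal layer fusion for the dense-dense case, here is a minimal sketch (not the paper's actual algorithm): two trained dense layers are collapsed into one by least-squares regression of the pair's output onto the input, which is MSE-optimal among single linear layers for the sampled input distribution. All layer sizes, weights, and sample data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "deep" pair of dense layers: x -> relu(x W1 + b1) -> (.) W2 + b2.
d_in, d_hid, d_out = 8, 16, 4
W1 = rng.normal(size=(d_in, d_hid)) * 0.5
b1 = rng.normal(size=d_hid) * 0.1
W2 = rng.normal(size=(d_hid, d_out)) * 0.5
b2 = rng.normal(size=d_out) * 0.1

def two_layer(x):
    """Output of the neighboring dense-dense pair to be fused."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

# Sample inputs, then fit a single dense layer y ≈ x Wf + bf by least
# squares: the MSE-optimal linear fusion for this empirical distribution.
X = rng.normal(size=(2048, d_in))
Y = two_layer(X)
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias column
coef, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)
Wf, bf = coef[:-1], coef[-1]

# The fused weights (Wf, bf) would initialize one layer of a shallower net.
Y_hat = X @ Wf + bf
mse = np.mean((Y_hat - Y) ** 2)
baseline = np.mean((Y - Y.mean(axis=0)) ** 2)      # predict-the-mean baseline
print(f"fusion MSE {mse:.4f} vs mean-predictor {baseline:.4f}")
```

Because the bias column makes the constant predictor a special case of the regression, the fused layer's MSE can never exceed the mean-predictor baseline; the residual gap reflects how much of the ReLU nonlinearity a single linear layer cannot capture.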


