Annual conference on Neural Information Processing Systems

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Abstract

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
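
The reparameterization described in the abstract can be written down in a few lines. The following is a minimal sketch, assuming a single dense neuron; the variable names `v`, `g`, and `x` are illustrative and not taken from the paper.

```python
import numpy as np

# Weight normalization expresses a weight vector as w = g * v / ||v||,
# so its length is controlled by the scalar g alone while v only
# determines its direction.
rng = np.random.default_rng(0)

v = rng.normal(size=5)           # direction parameter (unnormalized)
g = 1.0                          # scalar length parameter
x = rng.normal(size=5)           # a single input example

w = g * v / np.linalg.norm(v)    # reparameterized weight vector
y = w @ x                        # pre-activation of one neuron

print(np.linalg.norm(w))         # equals |g|, independent of v
```

In training, gradients are taken with respect to `g` and `v` rather than `w`, which is what decouples the length of the weight vector from its direction during optimization.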
