Neurocomputing

Theoretical analysis of batch and on-line training for gradient descent learning in neural networks



Abstract

In this study, we theoretically analyze two essential training schemes for gradient descent learning in neural networks: batch and on-line training. The convergence properties of the two schemes applied to quadratic loss functions are analytically investigated. We quantify the convergence of each training scheme to the optimal weight using the absolute value of the expected difference (Measure 1) and the expected squared difference (Measure 2) between the optimal weight and the weight computed by the scheme. Although on-line training has several advantages over batch training with respect to the first measure, it does not converge to the optimal weight with respect to the second measure if the variance of the per-instance gradient remains constant. However, if the variance decays exponentially, then on-line training converges to the optimal weight with respect to Measure 2. Our analysis reveals the exact degrees to which the training set size, the variance of the per-instance gradient, and the learning rate affect the rate of convergence for each scheme.
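The contrast between the two measures can be illustrated numerically. The following is a minimal Monte Carlo sketch (not the paper's derivation) for a scalar quadratic loss l_i(w) = (w - x_i)^2 / 2, where the per-instance gradient has constant variance around the batch gradient; all names and parameter values (w_star, sigma, eta, N, epochs, n_runs) are illustrative assumptions chosen for the demo.

import numpy as np

rng = np.random.default_rng(0)
w_star, sigma, eta = 1.0, 0.5, 0.1   # instance mean, per-instance gradient noise, learning rate
N, epochs, n_runs = 50, 40, 500      # training-set size, passes over the data, Monte Carlo runs

batch_err = np.zeros(n_runs)
online_err = np.zeros(n_runs)
for r in range(n_runs):
    x = rng.normal(w_star, sigma, size=N)    # training instances
    w_opt = x.mean()                         # minimiser of the training-set quadratic loss
    wb = wo = 0.0
    for _ in range(epochs):
        wb -= eta * (wb - w_opt)             # batch: one step per pass, using the full-batch gradient
        for xi in rng.permutation(x):        # on-line: one step per instance, using a noisy gradient
            wo -= eta * (wo - xi)
    batch_err[r], online_err[r] = wb - w_opt, wo - w_opt

for name, e in [("batch", batch_err), ("on-line", online_err)]:
    # Measure 1: |E[w - w_opt]|   Measure 2: E[(w - w_opt)^2]
    print(f"{name:8s}  Measure 1 = {abs(e.mean()):.5f}   Measure 2 = {np.mean(e**2):.5f}")

In this toy setting, on-line training drives Measure 1 down faster (it takes N updates per pass), while its Measure 2 plateaus at a level governed by the learning rate and the constant gradient variance; shrinking sigma (or eta) geometrically across passes makes Measure 2 decay as well, mirroring the exponentially decaying-variance case discussed in the abstract.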

