Theoretical Analysis of Function of Derivative Term in On-Line Gradient Descent Learning

Abstract

In on-line gradient descent learning, the local property of the derivative term of the output can slow convergence. Improving the derivative term, for example by using the natural gradient, has been proposed to speed up convergence. Besides this sophisticated method, a "simple method" that replaces the derivative term with a constant has been proposed and shown to greatly increase convergence speed. Although this phenomenon has been analyzed empirically, a theoretical analysis is required to show its generality. In this paper, we theoretically analyze the effect of using the simple method. Our results show that, with the simple method, the generalization error decreases faster than with the true gradient descent method when the learning step size is smaller than the optimum value η_opt. When the step size is larger than η_opt, the generalization error decreases more slowly with the simple method, and the residual error is larger than with the true gradient descent method. Moreover, when there is output noise, η_opt is no longer optimal; thus, the simple method is not robust in noisy circumstances.
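The difference between the two update rules described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' setup: the teacher-student arrangement, the tanh output function, the input scaling, the choice of 1 as the replacement constant, and all names (`train`, `use_derivative`, `eta`, `n_dim`) are assumptions made for illustration only. The true gradient rule multiplies the error by the derivative of the output function, while the "simple method" replaces that derivative with a constant.

```python
import math
import random


def train(use_derivative, eta, n_dim=50, n_steps=2000, seed=0):
    """Run on-line learning and return Monte Carlo estimates of the
    generalization error, recorded at step 0 and then every 500 steps.

    Assumed model (hypothetical, for illustration): a student tanh(J.x)
    learns a fixed teacher tanh(B.x) from one fresh example per step.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_dim)  # input components have variance 1/N
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
    student = [0.0] * n_dim
    test_set = [[rng.gauss(0.0, scale) for _ in range(n_dim)]
                for _ in range(200)]

    def output(w, x):
        return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

    def gen_error(w):
        # Mean of half the squared teacher-student output difference.
        total = sum(0.5 * (output(teacher, x) - output(w, x)) ** 2
                    for x in test_set)
        return total / len(test_set)

    history = [gen_error(student)]
    for step in range(1, n_steps + 1):
        x = [rng.gauss(0.0, scale) for _ in range(n_dim)]
        s = output(student, x)
        delta = output(teacher, x) - s
        # True gradient: derivative of tanh is 1 - s^2.
        # Simple method: the derivative is replaced by the constant 1.
        deriv = (1.0 - s * s) if use_derivative else 1.0
        student = [wi + eta * delta * deriv * xi
                   for wi, xi in zip(student, x)]
        if step % 500 == 0:
            history.append(gen_error(student))
    return history


if __name__ == "__main__":
    for label, flag in (("true gradient", True), ("simple method", False)):
        print(label, [round(e, 4) for e in train(flag, eta=0.1)])
```

Because both runs share a seed, they see the same teacher, test set, and example stream, so any difference in the error curves comes from the derivative term alone. Varying `eta` gives a rough feel for the step-size dependence the abstract describes, although this toy run is no substitute for the paper's analysis.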

Bibliographic record

Source: International Conference on Artificial Neural Networks, 2012, 8 pages
Authors: Kazuyuki Hara; Kentaro Katahira; Kazuo Okanoya; Masato Okada
Original format: PDF
Chinese Library Classification: TP3-53

Similar literature

1. Theoretical analysis of batch and on-line training for gradient descent learning in neural networks [J]. Takehiko Nakama. Neurocomputing, 2009, issue 1a3.
2. Periodic step-size adaptation in second-order gradient descent for single-pass on-line structured learning [J]. Chun-Nan Hsu, Han-Shen Huang, Yu-Ming Chang. Machine Learning, 2009, issue 2a3.
3. Evaluation of adaptive gradient descent in on-line learning [J]. Masato Inoue, Hyeyoung Park, Masato Okada. 電子情報通信学会技術研究報告. ニュ-ロコンピュ-ティング. Neurocomputing, 2002, issue 730.
4. Theoretical Analysis of Function of Derivative Term in On-Line Gradient Descent Learning [C]. Kazuyuki Hara, Kentaro Katahira, Kazuo Okanoya. International Conference on Artificial Neural Networks, 2012.
5. Gradients and non-adiabatic derivative coupling terms for spin-orbit wavefunctions [D]. Belcher, Lachlan T. 2011.
6. Graph Theoretical Analysis of Functional Brain Networks: Test-Retest Evaluation on Short- and Long-Term Resting-State Functional MRI Data [O]. Jin-Hui Wang, Xi-Nian Zuo, Suril Gohel. 2011.
7. On-Line Learning Theory of Soft Committee Machines with Correlated Hidden Units - Steepest Gradient Descent and Natural Gradient Descent - [O]. Inoue, Masato; Park, Hyeyoung; Okada, Masato. 2002.
8. Gradients and Non-Adiabatic Derivative Coupling Terms for Spin-Orbit Wavefunctions [R]. Belcher, L. T. 2011.
