Journal of Robotics and Mechatronics

TD learning with neural networks


Abstract

Temporal difference (TD) learning, proposed by Sutton in the late 1980s, is an interesting approach to prediction in which predictions already obtained are used to make future predictions. Applying this learning method to neural networks can improve their prediction performance, once certain problems are solved. The major problems are as follows: 1) In Sutton's original paper, the prediction P_t at time t is assumed to be a scalar, which raises the question of what the rule for updating the weight vector of the neural network should be when the network has multiple outputs. 2) How are the individual components of the gradient vector ∇_w P_t with respect to the weight vector w derived? This paper proposes how to handle these problems when TD learning is used in a neural network, focusing on the TD(0) algorithm, which is often used in TD learning. It proposes a rule for updating the weight vector of a two-output neural network under problem 1) above and explains the rule's validity. It then proposes a method for computing every component of ∇_w P_t.
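For orientation, the sketch below (not taken from the paper, whose proposed rule is not given in the abstract) applies Sutton's standard scalar TD(0) update, w <- w + α(P_{t+1} - P_t)∇_w P_t, independently to each output of a small two-output network; the network architecture, parameter names, and learning rate are illustrative assumptions.

```python
import numpy as np

# Minimal TD(0) sketch for a two-output network. This is NOT the paper's
# proposed rule (the abstract does not state it); it simply applies the
# standard scalar update to each output separately, with gradients of each
# output P_t[k] computed by plain backpropagation.

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 4, 8, 2          # two-output network, as in problem 1)
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
alpha = 0.05                              # learning rate (illustrative)

def forward(x):
    """Return the prediction vector P_t and the hidden activation."""
    h = np.tanh(W1 @ x)                   # hidden layer
    P = W2 @ h                            # linear outputs P_t (length 2)
    return P, h

def grads(x, h, k):
    """Gradients of the k-th output P_t[k] w.r.t. W1 and W2."""
    gW2 = np.zeros_like(W2)
    gW2[k] = h                            # dP_k/dW2 is nonzero only in row k
    gW1 = np.outer(W2[k] * (1.0 - h**2), x)   # chain rule through tanh
    return gW1, gW2

def td0_step(x_t, x_t1):
    """One TD(0) step: move each output's prediction toward the next one."""
    global W1, W2
    P_t, h_t = forward(x_t)
    P_t1, _ = forward(x_t1)
    for k in range(n_out):
        delta = P_t1[k] - P_t[k]          # TD error for output k
        gW1, gW2 = grads(x_t, h_t, k)
        W1 += alpha * delta * gW1
        W2 += alpha * delta * gW2

# Usage: one update on two successive observations.
x_t = rng.normal(size=n_in)
x_t1 = rng.normal(size=n_in)
td0_step(x_t, x_t1)
```

Treating each output as an independent scalar prediction is only one naive resolution of problem 1); the update rule actually proposed in the paper, and its derivation of the components of ∇_w P_t, may differ.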
