20th International Conference on Machine Learning

The Significance of Temporal-Difference Learning in Self-Play Training: TD-rummy versus EVO-rummy



Abstract

Reinforcement learning has been used to train game-playing agents. The value function for a complex game must be approximated with a continuous function, because the number of states is too large to enumerate. Temporal-difference learning with self-play is one method that has been used successfully to derive the value-approximation function. Co-evolution of the value function has also been claimed to yield good results. This paper reports a direct comparison between an agent trained to play gin rummy using temporal-difference learning and the same agent trained with co-evolution. Co-evolution produced superior results.
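
To make the first training method concrete, here is a minimal sketch of a TD(0) update with a linear value approximator, the general technique the abstract describes. The function name, feature encoding, and step-size values are illustrative assumptions, not the paper's actual TD-rummy implementation.

```python
import numpy as np

def td0_update(w, phi_s, phi_s_next, reward, alpha=0.01, gamma=1.0,
               terminal=False):
    """One TD(0) step on a linear value estimate V(s) = w . phi(s).

    phi_s, phi_s_next: hypothetical feature vectors for the current
    and next game state (e.g., an encoding of the agent's hand).
    """
    v_s = w @ phi_s
    v_next = 0.0 if terminal else w @ phi_s_next
    delta = reward + gamma * v_next - v_s   # temporal-difference error
    return w + alpha * delta * phi_s        # move V(s) toward r + gamma*V(s')
```

In self-play training, both sides of the game would share the same weight vector, this update would run after every transition, and the reward would be zero until the terminal state, where it reflects the outcome of the hand.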
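
For contrast, a minimal sketch of the co-evolutionary alternative: a population of weight vectors is scored by round-robin play within the population itself, and the fittest half survives alongside mutated copies. The `play_match` placeholder and all parameters are assumptions for illustration; the paper's EVO-rummy setup may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def play_match(w_a, w_b):
    # Placeholder: a real run would play full games of gin rummy,
    # with each weight vector scoring candidate moves for its side.
    return 1 if rng.random() < 0.5 else -1

def coevolve(pop_size=20, dim=64, generations=100, sigma=0.05):
    pop = rng.normal(size=(pop_size, dim))
    best = pop[0]
    for _ in range(generations):
        # Fitness comes from play inside the population, so the
        # evaluation standard evolves along with the players.
        fitness = np.zeros(pop_size)
        for i in range(pop_size):
            for j in range(i + 1, pop_size):
                result = play_match(pop[i], pop[j])
                fitness[i] += result
                fitness[j] -= result
        order = np.argsort(fitness)
        elite = pop[order[pop_size // 2:]]  # top half by fitness
        best = elite[-1]                    # best individual this generation
        # Refill the population with mutated copies of the elite.
        children = elite + sigma * rng.normal(size=elite.shape)
        pop = np.vstack([elite, children])
    return best
```

Note the key design difference between the two methods: TD learning updates the value function from the error on each transition, while co-evolution never computes a value error at all and selects only on game outcomes.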
