Journal: 電子情報通信学会技術研究報告. ニューロコンピューティング (IEICE Technical Report, Neurocomputing)
A multi-agent reinforcement learning method with learning of other agents for competitive game

Abstract

This report proposes a reinforcement learning (RL) method based on the actor-critic architecture that can be applied to partially observable multi-agent competitive games. As an example, we consider the card game Hearts, in which the RL problem becomes a partially observable Markov decision process (POMDP). In our method, a single game of Hearts is divided into three stages, and three actors are prepared, one per stage, so that each actor plays and learns only within its own stage. In particular, the actor for the middle stage selects actions so as to increase the expected temporal-difference (TD) error, which is calculated from the evaluation function approximated by the critic and from the estimated state transition. Computer experiments with heuristic players show that our RL method works well.
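The action-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stage boundaries, the discount factor, the softmax rule, and every name below (`stage_of`, `expected_td_error`, `action_probabilities`, `TOY_MODEL`, `toy_value`) are assumptions introduced for illustration. The sketch assumes the critic is available as a function from state to value and that the estimated state transition is given, for each action, as a list of `(probability, reward, next_state)` triples.

```python
import math

GAMMA = 1.0  # assumed discount factor; the abstract does not give one


def stage_of(trick_index, boundaries=(4, 9)):
    """Map a trick index (0-12) to stage 0, 1, or 2.

    The boundaries are hypothetical; the abstract only says that a game
    is divided into three stages, each with its own actor.
    """
    early_end, middle_end = boundaries
    if trick_index < early_end:
        return 0
    if trick_index < middle_end:
        return 1
    return 2


def expected_td_error(value, state, outcomes, gamma=GAMMA):
    """Expected TD error of one action under the estimated transition.

    outcomes: list of (probability, reward, next_state) triples.
    value:    the critic's evaluation function, state -> float.
    """
    expected_target = sum(
        p * (r + gamma * value(s_next)) for p, r, s_next in outcomes
    )
    return expected_target - value(state)


def action_probabilities(value, state, model, temperature=1.0):
    """Softmax over expected TD errors (an assumed selection rule).

    model: dict mapping each legal action to its list of outcomes.
    """
    errors = {a: expected_td_error(value, state, outs) for a, outs in model.items()}
    peak = max(errors.values())  # subtract the maximum for numerical stability
    weights = {a: math.exp((e - peak) / temperature) for a, e in errors.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Toy data, purely illustrative: two actions and two possible successor states.
TOY_VALUES = {"s0": 0.0, "good": 1.0, "bad": -1.0}
TOY_MODEL = {
    "play_low": [(0.8, 0.0, "good"), (0.2, 0.0, "bad")],
    "play_high": [(0.3, 0.0, "good"), (0.7, 0.0, "bad")],
}


def toy_value(state):
    """Stand-in for the critic's approximated evaluation function."""
    return TOY_VALUES[state]
```

With the toy data, `action_probabilities(toy_value, "s0", TOY_MODEL)` assigns the higher probability to `play_low`, because its expected TD error is larger. In a full agent, `stage_of` would pick which of the three actors acts and learns on the current trick.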
