Journal: Advances in Complex Systems

An empirical study of potential-based reward shaping and advice in complex, multi-agent systems


Abstract

This paper investigates the impact of reward shaping in multi-agent reinforcement learning as a way to incorporate domain knowledge about good strategies. In theory, potential-based reward shaping does not alter the Nash Equilibria of a stochastic game, only the exploration of the shaped agent. We demonstrate empirically the performance of reward shaping in two problem domains within the context of RoboCup KeepAway by designing three reward shaping schemes, encouraging specific behaviour such as keeping a minimum distance from other players on the same team and taking on specific roles. The results illustrate that reward shaping with multiple, simultaneous learning agents can reduce the time needed to learn a suitable policy and can alter the final group performance.
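
As an illustration of the potential-based shaping the abstract refers to, the sketch below uses the standard form F(s, s') = γΦ(s') − Φ(s). The potential function, the scalar state (distance to the nearest teammate), the 5.0 threshold, and the discount value are hypothetical choices for illustration only; they are not the shaping schemes evaluated in the paper. The example checks that the discounted shaping rewards along a trajectory telescope to a term depending only on the first and last states, which is the intuition behind shaping leaving the underlying equilibria unchanged.

```python
from typing import Sequence

GAMMA = 0.9  # discount factor (assumed value, not taken from the paper)


def potential(distance_to_teammate: float, min_distance: float = 5.0) -> float:
    """Hypothetical potential: rises linearly until the agent is at least
    min_distance away from its nearest teammate, then stays at 1.0."""
    return min(distance_to_teammate, min_distance) / min_distance


def shaping_reward(s: float, s_next: float, gamma: float = GAMMA) -> float:
    """Potential-based shaping term F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * potential(s_next) - potential(s)


def discounted_shaping_return(states: Sequence[float], gamma: float = GAMMA) -> float:
    """Sum of discounted shaping rewards along a trajectory of states."""
    return sum(
        gamma**t * shaping_reward(s, s_next, gamma)
        for t, (s, s_next) in enumerate(zip(states, states[1:]))
    )


# Hypothetical trajectory: the agent moves away from its nearest teammate.
trajectory = [1.0, 2.0, 4.0, 6.0]
total = discounted_shaping_return(trajectory)

# The sum telescopes to gamma^T * phi(s_T) - phi(s_0), independent of the path.
steps = len(trajectory) - 1
telescoped = GAMMA**steps * potential(trajectory[-1]) - potential(trajectory[0])
```

In a learning agent, `shaping_reward` would be added to the environment reward at each step; only the exploration is affected, since the extra return depends on the endpoints rather than on the actions taken in between.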
