IEEE/WIC/ACM International Conference on Intelligent Agent Technology

Combining Dynamic Reward Shaping and Action Shaping for Coordinating Multi-agent Learning



Abstract

Coordinating multi-agent reinforcement learning provides a promising approach to scaling learning in large cooperative multi-agent systems. It allows agents to learn local decision policies based on their local observations and rewards while coordinating the agents' learning processes to ensure global learning performance. One key question is how coordination mechanisms should impact learning algorithms so that agents' learning processes are guided and coordinated. This paper presents a new shaping approach that effectively integrates coordination mechanisms into local learning processes. This shaping approach uses a two-level agent organization structure and combines reward shaping and action shaping. The higher-level agents dynamically and periodically produce shaping heuristic knowledge based on the learning status of the lower-level agents. The lower-level agents then use this knowledge to coordinate their local learning processes with other agents. Experimental results show that our approach effectively speeds up the convergence of multi-agent learning in large systems.
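To make the abstract's idea concrete, the following is a minimal sketch (not the paper's actual algorithm) of how a two-level organization could combine the two shaping channels: a supervisor periodically inspects its subordinates' Q-values and pushes down a reward bonus (reward shaping) and a pruned action set (action shaping), which each lower-level Q-learner then folds into its local update. All class names, the toy task, and the advice rule are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1]  # hypothetical toy task: 0 = act locally, 1 = cooperate

class Worker:
    """Lower-level agent: epsilon-greedy Q-learner that accepts shaping advice."""
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.q = defaultdict(float)        # Q[(state, action)]
        self.alpha, self.gamma, self.eps = alpha, gamma, epsilon
        self.allowed = list(ACTIONS)       # action shaping: supervisor may prune this
        self.bonus = 0.0                   # reward shaping: supervisor-set heuristic bonus

    def act(self, state):
        if random.random() < self.eps:
            return random.choice(self.allowed)
        return max(self.allowed, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2):
        # Shaped reward: environment reward plus the supervisor's bonus
        # for the cooperative action.
        shaped = r + (self.bonus if a == 1 else 0.0)
        best_next = max(self.q[(s2, a2)] for a2 in self.allowed)
        self.q[(s, a)] += self.alpha * (shaped + self.gamma * best_next - self.q[(s, a)])

class Supervisor:
    """Higher-level agent: periodically derives shaping knowledge from
    the workers' current learning status and pushes it down."""
    def advise(self, workers):
        # Toy advice rule: if most workers already lean toward cooperating,
        # reinforce that tendency with a bonus; if all do, prune the other action.
        coop = sum(1 for w in workers if w.q[("s", 1)] >= w.q[("s", 0)])
        for w in workers:
            w.bonus = 0.5 if coop >= len(workers) / 2 else 0.0
            w.allowed = [1] if coop == len(workers) else list(ACTIONS)

def run(episodes=300, period=20, seed=0):
    random.seed(seed)
    workers = [Worker() for _ in range(4)]
    sup = Supervisor()
    for ep in range(episodes):
        for w in workers:
            a = w.act("s")
            r = 1.0 if a == 1 else 0.2     # cooperation pays off in this toy task
            w.update("s", a, r, "s")
        if ep % period == 0:               # dynamic, periodic shaping from above
            sup.advise(workers)
    return workers
```

The key design point mirrored from the abstract is that the shaping knowledge is not fixed in advance: `Supervisor.advise` recomputes it periodically from the workers' evolving Q-values, so the guidance adapts as learning progresses.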
