...
首页> 外文期刊>The Journal of Artificial Intelligence Research >Communication-Based Decomposition Mechanisms for Decentralized MDPs
【24h】

Communication-Based Decomposition Mechanisms for Decentralized MDPs

机译:分散MDP的基于通信的分解机制

获取原文
获取原文并翻译 | 示例
           

摘要

Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed algorithms. This paper provides a practical approach for solving decentralized control problems when communication among the decision makers is possible, but costly. We develop the notion of communication-based mechanism that allows us to decompose a decentralized MDP into multiple single-agent problems. In this framework, referred to as decentralized semi-Markov decision process with direct communication (Dec-SMDP-Com), agents operate separately between communications. We show that finding an optimal mechanism is equivalent to solving optimally a Dec-SMDP-Com. We also provide a heuristic search algorithm that converges on the optimal decomposition. Restricting the decomposition to some specific types of local behaviors reduces significantly the complexity of planning. In particular, we present a polynomial- time algorithm for the case in which individual agents perform goal-oriented behaviors between communications. The paper concludes with an additional tractable algorithm that enables the introduction of human knowledge, thereby reducing the overall problem to finding the best time to communicate. Empirical results show that these approaches provide good approximate solutions.
机译:可以将随机环境中的多主体规划正式定义为分散的马尔可夫决策问题。使用此框架可以将制造,多机器人协调和信息收集场景中出现的许多现实生活中的分布式问题正式化。但是,在一般情况下很难找到最佳解决方案,这限制了最近开发的算法的适用性。当决策者之间可以进行通讯但成本很高时,本文提供了一种解决分散控制问题的实用方法。我们开发了基于通信的机制的概念,该概念使我们可以将分散的MDP分解为多个单主体问题。在此框架中,称为直接通信的分散式半马尔可夫决策过程(Dec-SMDP-Com),代理在通信之间分别进行操作。我们表明,找到最优机制等同于最优地解决Dec-SMDP-Com。我们还提供了一种收敛于最优分解的启发式搜索算法。将分解限制在某些特定类型的本地行为中,可以大大降低计划的复杂性。特别是,对于单个代理在通信之间执行面向目标的行为的情况,我们提出了多项式时间算法。本文以附加的易处理算法作为结束语,该算法能够引入人类知识,从而减少总体问题,从而找到最佳的交流时间。实证结果表明,这些方法提供了良好的近似解决方案。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号