Conference: International Conference on Automated Planning and Scheduling

History-Based Controller Design and Optimization for Partially Observable MDPs

Abstract

Partially observable MDPs provide an elegant framework for sequential decision making. Finite-state controllers (FSCs) are often used to represent policies for infinite-horizon problems, as they offer a compact representation, simple-to-execute plans, and an adjustable tradeoff between computational complexity and policy size. We develop novel connections between optimizing FSCs for POMDPs and the dual linear program for MDPs. Building on that, we present a dual mixed integer linear program (MIP) for optimizing FSCs. To assign a well-defined meaning to FSC nodes as well as to aid policy search, we show how to associate history-based features with each FSC node. Using this representation, we address another challenging problem: iteratively deciding which nodes to add to the FSC to obtain a better policy. Using an efficient off-the-shelf MIP solver, we show that this new approach can find compact near-optimal FSCs for several large benchmark domains, and that it is competitive with the previous best approaches.
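
To illustrate what "simple-to-execute plans" means for an FSC, here is a minimal sketch of a deterministic controller being run on a stream of observations. It is not the paper's MIP formulation or its history-based features. The class name, field names, and the two-node "listen"/"act" example are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FiniteStateController:
    """Deterministic FSC (illustrative): each node fixes one action,
    and each observation moves the controller to a successor node."""
    start: int
    action_of: Dict[int, str]              # node -> action
    next_node: Dict[Tuple[int, str], int]  # (node, observation) -> successor node

    def run(self, observations: List[str]) -> List[str]:
        """Return the first action, then one action after each observation."""
        node = self.start
        actions = [self.action_of[node]]
        for obs in observations:
            node = self.next_node[(node, obs)]
            actions.append(self.action_of[node])
        return actions


# Hypothetical two-node controller: keep listening until "alarm" is seen, then act.
fsc = FiniteStateController(
    start=0,
    action_of={0: "listen", 1: "act"},
    next_node={
        (0, "quiet"): 0, (0, "alarm"): 1,
        (1, "quiet"): 0, (1, "alarm"): 1,
    },
)

print(fsc.run(["quiet", "alarm", "quiet"]))
```

Executing the policy needs only two table lookups per step, with no belief-state update. Optimizing a controller amounts to choosing the entries of `action_of` and `next_node`, and the number of nodes bounds the policy size.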
