APPARATUS AND METHOD OF POLICY MODELING BASED ON PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES
Abstract
PURPOSE: An apparatus and a method for modeling a policy based on partially observable Markov decision processes (POMDPs) are provided. An algebraic decision diagram (ADD) is applied to calculate the upper and lower bounds used in heuristic search value iteration (HSVI), thereby reducing policy training time. CONSTITUTION: A problem defining unit (310) defines a problem in terms of mathematical parameters. A bound calculating unit (320) calculates the upper and lower bounds by applying an ADD. A policy training unit (330) determines actions according to the partially observable Markov decision process in order to train a policy. A policy outputting unit (340) provides the trained policy.
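The abstract does not spell out how the two bounds are computed. As a rough illustration of what an upper and a lower bound on a POMDP value function look like, the sketch below initializes both on a toy two-state problem. It uses flat Python lists rather than ADDs, so it does not reproduce the patent's ADD-based calculation. The toy model (the classic "tiger" problem), all function names, and the choice of initializations (fully observable MDP values for the upper bound, fixed-action "blind" policies for the lower bound, a common choice in the HSVI literature) are assumptions made for illustration only.

```python
# Toy two-state POMDP (the classic "tiger" problem); assumed for illustration,
# not taken from the patent. Bounds are stored as flat lists, not ADDs.
GAMMA = 0.95
STATES = [0, 1]        # 0: tiger behind left door, 1: tiger behind right door
ACTIONS = [0, 1, 2]    # 0: listen, 1: open left door, 2: open right door

# T[a][s][s2]: transition probability, R[a][s]: immediate reward.
T = [
    [[1.0, 0.0], [0.0, 1.0]],   # listening leaves the state unchanged
    [[0.5, 0.5], [0.5, 0.5]],   # opening a door resets the problem
    [[0.5, 0.5], [0.5, 0.5]],
]
R = [
    [-1.0, -1.0],
    [-100.0, 10.0],
    [10.0, -100.0],
]


def backup(values, a, s):
    """One-step lookahead: R(s, a) + gamma * sum over s2 of T(s2 | s, a) * values[s2]."""
    return R[a][s] + GAMMA * sum(T[a][s][s2] * values[s2] for s2 in STATES)


def mdp_values(iterations=500):
    """Value iteration on the fully observable MDP (upper bound at the belief corners)."""
    v = [0.0 for _ in STATES]
    for _ in range(iterations):
        v = [max(backup(v, a, s) for a in ACTIONS) for s in STATES]
    return v


def blind_policy_alphas(iterations=500):
    """One alpha vector per action: the value of repeating that action forever."""
    alphas = []
    for a in ACTIONS:
        alpha = [0.0 for _ in STATES]
        for _ in range(iterations):
            alpha = [backup(alpha, a, s) for s in STATES]
        alphas.append(alpha)
    return alphas


def upper_bound(belief, v_mdp):
    """Linear interpolation of the corner values; never below the optimal POMDP value."""
    return sum(b * v for b, v in zip(belief, v_mdp))


def lower_bound(belief, alphas):
    """Best blind policy at this belief; never above the optimal POMDP value."""
    return max(sum(b * x for b, x in zip(belief, alpha)) for alpha in alphas)


V_MDP = mdp_values()
ALPHAS = blind_policy_alphas()
belief = [0.5, 0.5]
gap = upper_bound(belief, V_MDP) - lower_bound(belief, ALPHAS)
print("upper:", upper_bound(belief, V_MDP))
print("lower:", lower_bound(belief, ALPHAS))
print("gap:", gap)
```

In HSVI, policy training then repeatedly tightens both bounds along heuristically chosen belief trajectories until the gap at the initial belief falls below a threshold. With many state variables the flat lists above grow with the number of states, which is the cost a factored representation such as an ADD is meant to reduce.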