UPDATING POLICY PARAMETERS UNDER MARKOV DECISION PROCESS SYSTEM ENVIRONMENT
Abstract
Embodiments relate to updating a parameter that defines a policy under a Markov decision process system environment. One aspect includes updating the policy parameter, stored in a storage section of a controller, according to an update equation. The update equation includes a term for decreasing a weighted sum, taken over a first state (s) and a second state (s′), of the expected hitting time, which is a statistic on the number of steps required to make a first state transition from the first state (s) to the second state (s′).
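The abstract does not give the update equation itself, so the following is only a minimal sketch of the hitting-time term it describes, not the patented method. It computes expected hitting times for a small Markov chain, forms their weighted sum, and takes one descent step on a policy parameter so that the sum decreases. The two-state chain, the sigmoid parameterization, the uniform weights, the step size, and all function names are assumptions made for illustration. Any reward-related term of the real update equation is omitted.

```python
import math


def expected_hitting_times(P, target, tol=1e-12, max_sweeps=100_000):
    """Expected number of steps to first reach `target` from each state.

    Solves h(s) = 1 + sum_{k != target} P[s][k] * h(k), with h(target) = 0,
    by fixed-point iteration. Assumes `target` is reachable from every state.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(max_sweeps):
        new_h = [
            0.0 if s == target
            else 1.0 + sum(P[s][k] * h[k] for k in range(n) if k != target)
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return new_h
        h = new_h
    return h


def weighted_hitting_time_sum(P, weights):
    """Sum of weights[s][t] * h(s, t) over all ordered pairs with s != t."""
    n = len(P)
    total = 0.0
    for t in range(n):
        h_to_t = expected_hitting_times(P, t)
        total += sum(weights[s][t] * h_to_t[s] for s in range(n) if s != t)
    return total


def transition_matrix(theta):
    """Hypothetical two-state chain: switch state with probability sigmoid(theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return [[1.0 - p, p], [p, 1.0 - p]]


def regularizer(theta, weights):
    """Weighted sum of expected hitting times under the policy parameter theta."""
    return weighted_hitting_time_sum(transition_matrix(theta), weights)


def update_theta(theta, weights, step_size=0.1, eps=1e-5):
    """One descent step on the hitting-time term only (central finite difference)."""
    grad = (regularizer(theta + eps, weights)
            - regularizer(theta - eps, weights)) / (2 * eps)
    return theta - step_size * grad


weights = [[0.0, 1.0], [1.0, 0.0]]  # assumed uniform weights over state pairs
theta = 0.0
new_theta = update_theta(theta, weights)
print(regularizer(theta, weights), regularizer(new_theta, weights))
```

In this toy chain, decreasing the weighted sum pushes the switching probability upward, so each state is reached from the other in fewer expected steps. A finite-difference gradient is used here only to keep the sketch short.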