CONTROLLING AN AGENT TO EXPLORE AN ENVIRONMENT USING OBSERVATION LIKELIHOODS

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes, while training a neural network used to control the agent, generating a reward value for the training as a measure of the divergence between the likelihoods of a further observation under first and second statistical models of the environment. The first and second statistical models are based on respective first and second histories of past observations and actions, and the most recent observation in the first history is more recent than the most recent observation in the second history.
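The reward described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the statistical model is a Laplace-smoothed frequency estimate over discrete observations, the second history is the first history with its last `lag` steps removed, and the divergence measure is the log-likelihood ratio. The names `observation_likelihood`, `divergence_reward`, `lag`, and `alpha` are hypothetical.

```python
import math
from collections import Counter


def observation_likelihood(history, action, observation, num_observations, alpha=1.0):
    """Laplace-smoothed estimate of P(observation | action).

    `history` is a list of (action, observation) pairs. This toy count-based
    model is an assumption standing in for the "statistical model" of the
    environment; the abstract does not specify its form.
    """
    counts = Counter(obs for act, obs in history if act == action)
    total = sum(counts.values())
    return (counts[observation] + alpha) / (total + alpha * num_observations)


def divergence_reward(history, action, further_observation, lag, num_observations):
    """Reward as the log-likelihood ratio of the further observation.

    The first model sees the full history; the second sees the same history
    truncated by `lag` steps, so its most recent observation is older.
    The log-ratio is one possible divergence measure (an assumption).
    """
    first_history = history
    second_history = history[: len(history) - lag]
    p_first = observation_likelihood(
        first_history, action, further_observation, num_observations
    )
    p_second = observation_likelihood(
        second_history, action, further_observation, num_observations
    )
    return math.log(p_first) - math.log(p_second)


# Usage: four past steps in which action 0 always produced observation 1.
history = [(0, 1), (0, 1), (0, 1), (0, 1)]
reward = divergence_reward(
    history, action=0, further_observation=1, lag=2, num_observations=2
)
print(reward)
```

Under these assumptions the reward is positive when the more recent history makes the further observation more likely than the older history did, and it is zero when `lag` is 0 because both models then see the same history.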