CONTROLLING AN AGENT TO EXPLORE AN ENVIRONMENT USING OBSERVATION LIKELIHOODS
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes, while training a neural network used to control the agent, generating a reward value for the training as a measure of the divergence between the likelihood of a further observation under a first statistical model of the environment and its likelihood under a second statistical model of the environment. The first and second statistical models are based on respective first and second histories of past observations and actions, the most recent observation in the first history being more recent than the most recent observation in the second history.
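The reward described in the abstract can be illustrated with a minimal sketch. It assumes a discrete observation space and that each statistical model has already produced a probability distribution over the further observation. The function names `log_likelihood_gap` and `kl_divergence`, and the choice of a log-ratio or a KL divergence as the "measure of the divergence", are illustrative assumptions and are not taken from the patent text.

```python
import math
from typing import Sequence


def log_likelihood_gap(
    p_first: Sequence[float],
    p_second: Sequence[float],
    observed: int,
    eps: float = 1e-12,
) -> float:
    """Hypothetical reward: log-ratio of the likelihood of the observed
    outcome under the first model (more recent history) versus the
    second model (older history). `eps` guards against log(0)."""
    return math.log(p_first[observed] + eps) - math.log(p_second[observed] + eps)


def kl_divergence(
    p_first: Sequence[float],
    p_second: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Alternative hypothetical reward: KL(p_first || p_second) over the
    whole discrete observation space."""
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(p_first, p_second)
        if p > 0.0
    )


# Assumed example distributions over three possible observations.
p_recent = [0.7, 0.2, 0.1]   # model conditioned on the more recent history
p_older = [0.4, 0.4, 0.2]    # model conditioned on the older history

reward_observed = log_likelihood_gap(p_recent, p_older, observed=0)
reward_expected = kl_divergence(p_recent, p_older)
```

In this sketch the reward is large when the more recent history changes the predicted likelihood of the further observation substantially, which is one way to read the exploration signal described above. How the two models are built and which divergence is used are left open by the abstract.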