Journal: Scientific Reports

Exploring Feature Dimensions to Learn a New Policy in an Uninformed Reinforcement Learning Task


Abstract

When making a choice with limited information, we explore new features through trial-and-error to learn how they are related. However, few studies have investigated exploratory behaviour when information is limited. In this study, we address, at both the behavioural and neural level, how, when, and why humans explore new feature dimensions to learn a new policy for choosing a state-space. We designed a novel multi-dimensional reinforcement learning task to encourage participants to explore and learn new features, then used a reinforcement learning algorithm to model policy exploration and learning behaviour. Our results provide the first evidence that, when humans explore new feature dimensions, their values are transferred from the previous policy to the new online (active) policy, as opposed to being learned from scratch. We further demonstrated that exploration may be regulated by the level of cognitive ambiguity, and that this process might be controlled by the frontopolar cortex. This opens up new possibilities for further understanding how humans explore new features in an open space with limited information.
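The contrast the abstract draws between transferring values and learning them from scratch can be illustrated with a minimal sketch. This is not the authors' model: the feature names (`colour`, `shape`), the numeric values, the learning rate, and all function names below are hypothetical, and the update is a generic delta rule chosen only for illustration.

```python
from itertools import product

def init_from_scratch(colours, shapes, prior=0.0):
    """New policy over (colour, shape): every value starts at the same prior."""
    return {(c, s): prior for c, s in product(colours, shapes)}

def init_by_transfer(old_values, shapes):
    """New policy over (colour, shape): each value is copied from the old
    colour-only policy, so earlier learning carries over to the new dimension."""
    return {(c, s): v for c, v in old_values.items() for s in shapes}

def delta_update(values, key, reward, alpha=0.1):
    """Generic delta-rule update: move the value toward the observed reward."""
    values[key] += alpha * (reward - values[key])
    return values[key]

# Hypothetical values learned under the previous, colour-only policy.
old_values = {"red": 0.8, "blue": 0.2}
shapes = ["circle", "square"]

scratch = init_from_scratch(old_values, shapes)   # all entries start at 0.0
transfer = init_by_transfer(old_values, shapes)   # entries inherit colour values
```

Under these assumptions, the transferred policy already prefers red options before any feedback on shape arrives, whereas the from-scratch policy has to relearn that preference through further trials.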
