Frontiers in Behavioral Neuroscience

Modeling choice and reaction time during arbitrary visuomotor learning through the coordination of adaptive working memory and reinforcement learning

Abstract

Current learning theory provides a comprehensive description of how humans and other animals learn, and places behavioral flexibility and automaticity at the heart of adaptive behaviors. However, the computations supporting the interactions between goal-directed and habitual decision-making systems are still poorly understood. Previous functional magnetic resonance imaging (fMRI) results suggest that the brain hosts complementary computations that may differentially support goal-directed and habitual processes in the form of a dynamical interplay rather than a serial recruitment of strategies. To better elucidate the computations underlying flexible behavior, we develop a dual-system computational model that can predict both performance (i.e., participants' choices) and modulations in reaction times during learning of a stimulus–response association task. The habitual system is modeled with a simple Q-Learning algorithm (QL). For the goal-directed system, we propose a new Bayesian Working Memory (BWM) model that searches for information in the history of previous trials in order to minimize Shannon entropy. We propose a model for QL and BWM coordination in which the expensive memory manipulation is controlled by, among other factors, the level of convergence of the habitual learning. We test the ability of QL or BWM alone to explain human behavior, and compare them with the performance of model combinations, to highlight the need for such combinations to explain behavior. Two of the tested combination models are derived from the literature, and a further one is our new proposal. In conclusion, all subjects were better explained by model combinations, and the majority of them were explained by our new coordination proposal.
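To make two of the named ingredients concrete, the following is a minimal sketch of a tabular Q-Learning update for a one-step stimulus–response task, together with the Shannon entropy of the resulting softmax action distribution. It is not the authors' model: the function names, the learning rate `alpha`, the inverse temperature `beta`, and the task dimensions are all illustrative assumptions, and neither the BWM component nor the coordination mechanism is reproduced here.

```python
import math

def q_update(q, stimulus, action, reward, alpha=0.1):
    """Delta-rule Q-Learning update for a one-step task (no successor state).

    alpha is an assumed learning rate, not a value from the paper.
    """
    q[stimulus][action] += alpha * (reward - q[stimulus][action])

def softmax(values, beta=3.0):
    """Turn Q-values into action probabilities; beta is an assumed inverse temperature."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical task size, chosen only for illustration.
n_stimuli, n_actions = 3, 5
q = [[0.0] * n_actions for _ in range(n_stimuli)]

# Before learning, all actions are equally likely, so entropy is maximal.
h_before = shannon_entropy(softmax(q[0]))

# Repeatedly reward one action for stimulus 0.
for _ in range(20):
    q_update(q, stimulus=0, action=2, reward=1.0)

# After learning, the distribution is more peaked, so entropy is lower.
h_after = shannon_entropy(softmax(q[0]))
```

In this sketch the entropy of the action distribution falls as the Q-values for a stimulus converge, which is one simple way to quantify the "level of convergence of the habitual learning" that the abstract mentions; how the paper actually measures it is not stated in this text.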
