International Conference on Artificial Neural Networks

Learning from Delayed Reward and Punishment in a Spiking Neural Network Model of Basal Ganglia with Opposing D1/D2 Plasticity

Abstract

Extending previous work, we introduce a spiking actor-critic network model of learning from reward and punishment in the basal ganglia. In the model, the striatum is taken to be segregated into populations of medium spiny neurons (MSNs) that carry either the D1 or the D2 dopamine receptor type. This segregation allows explicit representation of both positive and negative expected outcomes within the respective populations. In line with recent experiments, we further assume that the D1 and D2 MSN populations have opposing dopamine-modulated bidirectional synaptic plasticity. Experiments were conducted in a grid world, where a moving agent had to reach a remote rewarded goal state. Unlike the previous model, the network learned not only to approach the rewarded goal but also to consistently avoid punishments. The spiking network model explains the functional role of D1/D2 MSN segregation within the striatum, specifically the reversed direction of dopamine-dependent plasticity found at synapses converging on different MSNs.
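
The opposing-plasticity idea in the abstract can be illustrated with a minimal, non-spiking sketch. This is not the authors' spiking model: it assumes a rate-like tabular analogue in which a single dopamine-like temporal-difference error drives the D1 and D2 weights in opposite directions, and in which weights are kept non-negative. The function names, the learning rate `lr`, and the discount factor `gamma` are all hypothetical choices made for illustration.

```python
def td_error(reward, v_next, v_now, gamma=0.9):
    """Dopamine-like temporal-difference error (assumed form, not from the paper)."""
    return reward + gamma * v_next - v_now


def opposing_update(w_d1, w_d2, delta, lr=0.1):
    """Apply one error signal with opposite sign to the D1 and D2 weights.

    A positive error strengthens D1 and weakens D2; a negative error does
    the reverse. Weights are clipped at zero (an assumption of this sketch).
    """
    new_d1 = max(0.0, w_d1 + lr * delta)
    new_d2 = max(0.0, w_d2 - lr * delta)
    return new_d1, new_d2


def net_value(w_d1, w_d2):
    """Net expected outcome: positive expectation (D1) minus negative (D2)."""
    return w_d1 - w_d2


if __name__ == "__main__":
    # Hypothetical single state that is punished (reward = -1) on exit.
    w_d1, w_d2 = 0.0, 0.0
    delta = td_error(reward=-1.0, v_next=0.0, v_now=net_value(w_d1, w_d2))
    w_d1, w_d2 = opposing_update(w_d1, w_d2, delta)
    print(w_d1, w_d2, net_value(w_d1, w_d2))
```

Under these assumptions, a punished state accumulates weight only in the D2 table, so its net value turns negative and can be avoided. With one non-negative population alone, a negative expected outcome could not be represented explicitly.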
