Learning from Delayed Reward und Punishment in a Spiking Neural Network Model of Basal Ganglia with Opposing D1/D2 Plasticity

Abstract

Extending previous work, we introduce a spiking actor-critic network model of learning from reward and punishment in the basal ganglia. In the model, the striatum is assumed to be segregated into populations of medium spiny neurons (MSNs) that express either the D1 or the D2 dopamine receptor type. This segregation allows positive and negative expected outcomes to be represented explicitly within the respective populations. In line with recent experiments, we further assume that D1 and D2 MSN populations have opposing dopamine-modulated bidirectional synaptic plasticity. Experiments were conducted in a grid world, where a moving agent had to reach a remote rewarded goal state. Unlike the previous model, the network learned not only to approach the rewarded goal but also to consistently avoid punishments. The spiking network model explains the functional role of D1/D2 MSN segregation within the striatum, specifically the reversed direction of dopamine-dependent plasticity found at synapses converging on different MSNs.
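
The opposing-plasticity idea in the abstract can be illustrated with a much simpler, non-spiking sketch. The code below is not the authors' model: it is a tabular TD(0) critic, assumed here for illustration only, in which the value of each state is split into two non-negative channels (named `d1` and `d2` by analogy) that the same TD error changes in opposite directions. The corridor world, the fixed random policy (no actor), and the parameters `GAMMA` and `ALPHA` are all hypothetical choices.

```python
import random

# Hypothetical 1-D corridor: state 0 is a punished terminal state,
# state 4 a rewarded terminal state; the agent starts in the middle.
N_STATES = 5
PUNISH_STATE, GOAL_STATE, START_STATE = 0, 4, 2
GAMMA = 0.9   # discount factor (assumed)
ALPHA = 0.02  # learning rate per channel (assumed)


def train_critic(episodes=3000, seed=0):
    """TD(0) evaluation of a random walk with opposing D1/D2 channel updates."""
    rng = random.Random(seed)
    d1 = [0.0] * N_STATES  # non-negative channel, increased by positive TD error
    d2 = [0.0] * N_STATES  # non-negative channel, increased by negative TD error

    def value(s):
        # Net expected outcome is the difference of the two channels.
        return d1[s] - d2[s]

    for _ in range(episodes):
        s = START_STATE
        while s not in (PUNISH_STATE, GOAL_STATE):
            s_next = s + rng.choice((-1, 1))  # fixed random policy, no actor
            if s_next == GOAL_STATE:
                r, v_next = 1.0, 0.0   # terminal states have value 0
            elif s_next == PUNISH_STATE:
                r, v_next = -1.0, 0.0
            else:
                r, v_next = 0.0, value(s_next)
            delta = r + GAMMA * v_next - value(s)  # TD error (dopamine-like signal)
            # Same signal, opposite direction of change; both channels clipped at zero.
            d1[s] = max(0.0, d1[s] + ALPHA * delta)
            d2[s] = max(0.0, d2[s] - ALPHA * delta)
            s = s_next
    return d1, d2


if __name__ == "__main__":
    d1, d2 = train_critic()
    for s in range(1, N_STATES - 1):
        print(f"state {s}: D1={d1[s]:.2f}  D2={d2[s]:.2f}  net={d1[s] - d2[s]:+.2f}")
```

Under these assumptions, the state next to the rewarded end should end up with more weight in the `d1` channel and the state next to the punished end with more weight in the `d2` channel, so a negative expected outcome is carried by quantities that never go below zero.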

Bibliographic details

Source
International Conference on Artificial Neural Networks | 2012 | pp. 459-466 | 8 pages
Authors
Jenia Jitsev; Nobi Abraham; Abigail Morrison; Marc Tittgemeyer
Original format: PDF
Keywords
Dopamine-modulated plasticity; bidirectional plasticity; basal ganglia; prediction error signal; actor-critic architecture; spiking neural network; temporal difference learning; D1/D2 receptors


