The ascending neuromodulatory systems in learning by reinforcement: Comparing computational conjectures with experimental findings

Cited by: 50

Authors:

Pennartz, CMA [1]

Affiliations:

[1] CALTECH, PASADENA, CA 91125 USA

Source:

BRAIN RESEARCH REVIEWS | 1995 / Vol. 21 / Issue 03

Keywords:

acetylcholine; dopamine; long-term potentiation; memory; noradrenaline; supervised learning; synaptic plasticity; temporal difference learning

DOI:

10.1016/0165-0173(95)00014-3

Chinese Library Classification:

Q189 [Neuroscience]

Discipline classification code:

071006

Abstract:

A central problem in cognitive neuroscience is how animals can manage to rapidly master complex sensorimotor tasks when the only sensory feedback they use to improve their performance is a simple reinforcing stimulus. Neural network theorists have constructed algorithms for reinforcement learning that can be used to solve a variety of biological problems and do not violate basic neurophysiological principles, in contrast to the back-propagation algorithm. A key assumption in these models is the existence of a reinforcement signal, which would be diffusively broadcast throughout one or several brain areas engaged in learning. This signal is further assumed to mediate up- and downward changes in synaptic efficacy by acting as a multiplicative factor in learning rules. The biological plausibility of these algorithms has been defended by the conjecture that the neuromodulators noradrenaline, acetylcholine or dopamine may form the neurochemical substrate of reinforcement signals. In this commentary, the predictions raised by this hypothesis are compared to anatomical, electrophysiological and behavioural findings. The experimental evidence does not support, and often argues against, a general reinforcement-encoding function of these neuromodulatory systems. Nevertheless, the broader concept of evaluative signalling between brain structures implied in learning appears to be reasonable and the available algorithms may open new avenues for constructing more realistic network architectures.


Pages: 219-245

Page count: 27

References: 235 in total
