Robust sound classification through the representation of similarity using response fields derived from stimuli during early experience

Cited by: 9
Authors
Coath, M [1]
Denham, SL [1]
Affiliations
[1] Univ Plymouth, Ctr Theoret & Computat Neurosci, Plymouth PL4 8AA, Devon, England
DOI
10.1007/s00422-005-0560-4
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Models of auditory processing, particularly of speech, face many difficulties. Included in these are variability among speakers, variability in speech rate, and robustness to moderate distortions such as time compression. We constructed a system based on ensembles of feature detectors derived from fragments of an onset-sensitive sound representation. This method is based on the idea of 'spectro-temporal response fields' and uses convolution to measure the degree of similarity through time between the feature detectors and the stimulus. The output from the ensemble was used to derive segmentation cues and patterns of response, which were used to train an artificial neural network (ANN) classifier. This allowed us to estimate a lower bound for the mutual information between the class of the input and the class of the output. Our results suggest that there is significant information in the output of our system, and that this is robust with respect to the exact choice of feature set, time compression in the stimulus, and speaker variation. In addition, the robustness to time compression in the stimulus has features in common with human psychophysics. Similar experiments using feature detectors derived from fragments of non-speech sounds performed less well. This result is interesting in the light of results showing aberrant cortical development in animals exposed to impoverished auditory environments during the developmental phase.
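The abstract does not say how the lower bound on mutual information was computed. One common approach, shown here only as a minimal sketch and not as the authors' method, is to compute the mutual information between the true class and the classifier's predicted class from a confusion matrix; because the prediction is a function of the system output, this value cannot exceed the information the output carries about the input class. The function name `mutual_information_bits` and the example matrices are hypothetical.

```python
import math


def mutual_information_bits(confusion):
    """Mutual information (in bits) between true class (rows) and
    predicted class (columns) of a confusion matrix of counts.

    Assumption: this is a generic estimate from classifier output,
    not the procedure used in the paper.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    row_sums = [sum(row) for row in confusion]        # true-class counts
    col_sums = [sum(col) for col in zip(*confusion)]  # predicted-class counts

    mi = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count == 0:
                continue  # zero-count cells contribute nothing
            p_joint = count / total
            # p(x, y) / (p(x) p(y)) written in terms of raw counts
            ratio = count * total / (row_sums[i] * col_sums[j])
            mi += p_joint * math.log2(ratio)
    return mi


# Hypothetical confusion matrices for a two-class problem.
perfect = [[50, 0], [0, 50]]    # every item classified correctly
chance = [[25, 25], [25, 25]]   # predictions independent of true class

print(mutual_information_bits(perfect))
print(mutual_information_bits(chance))
```

With equiprobable classes, a perfect classifier recovers all of the class entropy, while predictions that are independent of the true class carry no information, so the estimate lies between these two extremes.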
Pages: 22-30
Number of pages: 9