PITCH DETECTION WITH A NEURAL-NET CLASSIFIER

Cited by: 25
Authors
BARNARD, E
COLE, RA
VEA, MP
ALLEVA, FA
Affiliations
[1] OREGON GRAD INST,DEPT COMP SCI & ENGN,BEAVERTON,OR 97006
[2] CARNEGIE MELLON UNIV,DEPT COMP SCI,PITTSBURGH,PA 15213
Keywords
Neural Networks;
DOI
10.1109/78.80812
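Chinese Library Classification (CLC) code
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification code
0808 ; 0809 ;
Abstract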
Pitch detection based on neural-net classifiers is investigated. To this end, the extent of generalization attainable with neural nets is first examined, and it is shown that a suitable choice of features is required to utilize this property. Specifically, invariant features should be used whenever possible. For pitch detection, two feature sets, one based on waveform samples and the other based on properties of waveform peaks, are introduced. Experiments with neural classifiers demonstrate that the latter feature set, which has better invariance properties, performs more successfully. It is found that the best neural-net pitch tracker approaches the level of agreement of human labelers on the same data set, and performs competitively in comparison to a sophisticated feature-based tracker. An analysis of the errors committed by the neural net (relative to the hand labels used for training) reveals that they are mostly due to inconsistent hand labeling of ambiguous waveform peaks.
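The abstract's point about invariant features can be illustrated with a small sketch. The abstract does not say which peak properties the authors used, so the two features below (peak amplitude relative to the largest peak, and spacing to the previous peak) and the helper names `find_peaks` and `peak_features` are assumptions made purely for illustration, not the paper's feature set. The sketch only shows why peak-derived features can be unaffected by a change in signal gain, whereas raw waveform samples are not.

```python
from typing import List, Tuple


def find_peaks(x: List[float]) -> List[int]:
    """Return indices of positive local maxima (hypothetical peak picker)."""
    return [
        i
        for i in range(1, len(x) - 1)
        if x[i] > x[i - 1] and x[i] >= x[i + 1] and x[i] > 0
    ]


def peak_features(x: List[float]) -> List[Tuple[float, int]]:
    """Describe each peak by (relative amplitude, spacing to previous peak).

    Relative amplitude divides by the largest peak, so multiplying the
    whole waveform by a positive gain leaves the features unchanged.
    Spacing is measured in samples; the first peak gets spacing 0.
    """
    peaks = find_peaks(x)
    if not peaks:
        return []
    max_amp = max(x[i] for i in peaks)
    features = []
    prev = None
    for i in peaks:
        spacing = 0 if prev is None else i - prev
        features.append((x[i] / max_amp, spacing))
        prev = i
    return features


# Toy waveform with peaks at samples 1, 5, and 9.
waveform = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0, 0.0, -1.0, 0.0, 1.0, 0.0]
louder = [3.0 * s for s in waveform]  # same signal at a higher gain

print(peak_features(waveform))
print(peak_features(waveform) == peak_features(louder))  # peak features agree
print(waveform == louder)                                # raw samples do not
```

A classifier fed the raw samples would have to learn gain invariance from the training data, while one fed the normalized peak features gets it by construction.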
Pages: 298-307
Number of pages: 10