A sawtooth waveform inspired pitch estimator for speech and music

Cited by: 240
Authors
Camacho, Arturo [1]
Harris, John G. [1]
Affiliations
[1] University of Florida, Computational NeuroEngineering Laboratory, Gainesville, FL 32611, USA
DOI
10.1121/1.2951592
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206 [Acoustics]; 082403 [Underwater Acoustic Engineering]
Abstract
A sawtooth waveform inspired pitch estimator (SWIPE) has been developed for speech and music. SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform whose spectrum best matches the spectrum of the input signal. The spectra are compared by computing a normalized inner product between the spectrum of the signal and a modified cosine. The size of the analysis window is chosen so that the width of the main lobes of the spectrum matches the width of the positive lobes of the cosine. SWIPE', a variation of SWIPE, uses only the first and prime harmonics of the signal, which significantly reduces the subharmonic errors commonly found in other pitch estimation algorithms. The authors' tests indicate that SWIPE and SWIPE' performed better than the other algorithms tested on two spoken speech databases, one disordered voice database, and one musical instrument database consisting of single notes performed at a variety of pitches. © 2008 Acoustical Society of America.
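The matching step described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names (`estimate_pitch`, `kernel`, `sawtooth_like`), the fixed-length Hann window, the linear frequency grid, the square-root magnitude spectrum, the 1/sqrt(f) kernel envelope, and the half-weighted valleys are all assumptions chosen here for illustration. The abstract's window-size selection per candidate is deliberately left out. Setting `prime_only=True` keeps only the first and prime harmonics, in the spirit of SWIPE'.

```python
import math


def is_prime(n):
    """Trial-division primality test (enough for small harmonic numbers)."""
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))


def sqrt_magnitude_spectrum(signal, sample_rate, freqs):
    """Square root of the Hann-windowed DFT magnitude at the given frequencies.

    Naive O(N * F) evaluation; the square root is an assumption of this sketch.
    """
    n = len(signal)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(signal)]
    out = []
    for f in freqs:
        w = 2 * math.pi * f / sample_rate
        re = sum(x * math.cos(w * i) for i, x in enumerate(windowed))
        im = sum(x * math.sin(w * i) for i, x in enumerate(windowed))
        out.append(math.sqrt(math.hypot(re, im)))
    return out


def kernel(f, f0, harmonics):
    """Modified cosine: full-weight lobes at the chosen harmonics of f0,
    half-weight valleys beside them, decaying as 1/sqrt(f / f0) (assumed)."""
    r = f / f0
    if r <= 0:
        return 0.0
    weight = 0.0
    for k in harmonics:
        d = abs(r - k)
        if d < 0.25:
            weight += 1.0      # positive lobe centred on harmonic k
        elif d < 0.75:
            weight += 0.5      # valley shared with the neighbouring harmonic
    return weight * math.cos(2 * math.pi * r) / math.sqrt(r)


def estimate_pitch(signal, sample_rate, candidates, prime_only=False,
                   max_freq=2500.0, step=10.0):
    """Return the candidate f0 whose kernel has the largest normalized
    inner product with the square-root spectrum of the signal."""
    freqs = [step * i for i in range(1, int(max_freq / step) + 1)]
    spec = sqrt_magnitude_spectrum(signal, sample_rate, freqs)
    spec_norm = math.sqrt(sum(s * s for s in spec))
    best_f0, best_score = None, -math.inf
    for f0 in candidates:
        harmonics = [k for k in range(1, int(max_freq / f0) + 1)
                     if not prime_only or k == 1 or is_prime(k)]
        kern = [kernel(f, f0, harmonics) for f in freqs]
        kern_norm = math.sqrt(sum(v * v for v in kern))
        if kern_norm == 0.0 or spec_norm == 0.0:
            continue
        score = sum(a * b for a, b in zip(kern, spec)) / (kern_norm * spec_norm)
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0


def sawtooth_like(f0, sample_rate, n_samples, n_harmonics=10):
    """Test tone: harmonics of f0 with amplitudes 1/k, as in a sawtooth."""
    return [sum(math.sin(2 * math.pi * k * f0 * i / sample_rate) / k
                for k in range(1, n_harmonics + 1))
            for i in range(n_samples)]


if __name__ == "__main__":
    tone = sawtooth_like(220.0, 8000, 512)
    grid = [100.0 + 5.0 * i for i in range(81)]   # 100 Hz to 500 Hz
    print(estimate_pitch(tone, 8000, grid))
    print(estimate_pitch(tone, 8000, grid, prime_only=True))
```

In this sketch a subharmonic candidate such as 110 Hz scores lower with `prime_only=True`, because only one of its retained harmonics (the second) coincides with a partial of the 220 Hz tone.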
Pages: 1638-1652
Number of pages: 15