Pitch-based feature extraction for audio classification

Cited by: 10
Authors
Abu-El-Quran, AR [1]
Goubran, RA [1]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
Source
2nd IEEE International Workshop on Haptic, Audio and Visual Environments and Their Applications - HAVE 2003 | 2003
DOI
10.1109/HAVE.2003.1244723
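Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligence science and technology];
Abstract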
This paper proposes a new algorithm to discriminate between speech and non-speech audio segments. It is intended for security applications as well as talker location identification in audio conferencing systems equipped with microphone arrays. The proposed method splits the audio segment into small frames and detects the presence of pitch in each of them. The ratio of frames in which pitch is detected to the total number of frames is defined as the pitch ratio and is used as the main feature to classify speech and non-speech segments. The performance of the proposed method is evaluated using a library of audio segments containing female and male speech, and non-speech segments such as computer fan noise, cocktail noise, footsteps, and traffic noise. It is shown that the proposed algorithm achieves a correct-decision rate of 97% for speech segments and 98% for non-speech segments that are 0.5 seconds long.
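The pitch ratio described in the abstract can be illustrated with a minimal sketch. The abstract does not say which pitch detector, frame length, or decision threshold the authors use, so everything below is an assumption made for illustration: a normalized-autocorrelation pitch test, 30 ms non-overlapping frames, a 50-400 Hz pitch search range, and the hypothetical names `has_pitch`, `pitch_ratio`, and `classify` with their default thresholds.

```python
import math
import random


def has_pitch(frame, sample_rate, fmin=50.0, fmax=400.0, threshold=0.5):
    """Assumed pitch test: True if the normalized autocorrelation of the
    frame has a peak of at least `threshold` within the pitch-lag range."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)         # autocorrelation at lag 0
    if energy == 0.0:
        return False                       # silent frame: no pitch
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        best = max(best, r / energy)
    return best >= threshold


def pitch_ratio(signal, sample_rate, frame_ms=30.0):
    """Fraction of non-overlapping frames in which pitch is detected."""
    frame_len = int(sample_rate * frame_ms / 1000.0)
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    frames = [signal[s:s + frame_len] for s in starts]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    pitched = sum(1 for f in frames if has_pitch(f, sample_rate))
    return pitched / len(frames)


def classify(signal, sample_rate, ratio_threshold=0.3):
    """Label a segment by comparing its pitch ratio to an assumed threshold."""
    if pitch_ratio(signal, sample_rate) >= ratio_threshold:
        return "speech"
    return "non-speech"


# Synthetic 0.5-second test signals (stand-ins, not the paper's audio library).
sr = 8000
tone = [math.sin(2 * math.pi * 100 * i / sr) for i in range(sr // 2)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(sr // 2)]
print(pitch_ratio(tone, sr), pitch_ratio(noise, sr))
print(classify(tone, sr), classify(noise, sr))
```

The periodic tone stands in for a strongly pitched segment and the white noise for an unpitched one. A steady tone is labeled "speech" by this sketch because pitch ratio is its only feature; how the authors set thresholds or handle such cases is not stated in the abstract.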
Pages: 43-47
Page count: 5