Principal component analysis of speech spectrogram images

Cited by: 33

Authors:

Pinkowski, B

Affiliation:

[1] Computer Science Department, Western Michigan University, Kalamazoo

Source:

PATTERN RECOGNITION | 1997 / Vol. 30 / No. 05

Funding:

U.S. National Institutes of Health;

Keywords:

principal components; Karhunen-Loeve transform; Fourier descriptors; cluster analysis; speech spectrogram;

DOI:

10.1016/S0031-3203(96)00103-3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent research has demonstrated that spectrograms containing human speech utterances can be analyzed using image processing techniques to yield a high recognition rate. In particular, Fourier descriptors (FDs) have proved very useful for characterizing the boundary of segmented isolated words containing the English semivowels /w/, /y/, /l/, and /r/. This study examines the appropriateness of FDs combined with 17 other general features for classifying objects contained in binary spectrogram images. Principal components (PCs) are used for feature reduction on a speaker-dependent data set consisting of 80 sounds representing 20 speaker-dependent words containing English semivowels. With only eight features, including four 32-point FDs and four general features obtained from principal component analysis, a 97.5% recognition rate was obtained. © 1997 Pattern Recognition Society.
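
As a rough illustration of the two techniques named in the abstract, the following Python sketch computes normalised Fourier descriptor magnitudes for a closed boundary and reduces a feature matrix with principal component analysis. It is not the author's implementation. The function names `fourier_descriptors` and `pca_reduce`, the arc-length resampling, the normalisation scheme, the synthetic ellipse, and the random feature matrix are all assumptions made for illustration; only the 32-point descriptor length, the 80 samples, and the 17 general features are taken from the abstract.

```python
import numpy as np


def fourier_descriptors(boundary_xy, n_points=32):
    """Normalised FD magnitudes of a closed 2-D boundary (illustrative sketch)."""
    pts = np.asarray(boundary_xy, dtype=float)
    # Close the contour and measure cumulative arc length along it.
    closed = np.vstack([pts, pts[:1]])
    d = np.diff(closed, axis=0)
    s = np.concatenate([[0.0], np.cumsum(np.hypot(d[:, 0], d[:, 1]))])
    # Resample to n_points positions equally spaced in arc length.
    t = np.linspace(0.0, s[-1], n_points, endpoint=False)
    z = np.interp(t, s, closed[:, 0]) + 1j * np.interp(t, s, closed[:, 1])
    mags = np.abs(np.fft.fft(z))
    # Drop the DC term (translation) and divide by the dominant first
    # harmonic (scale). Using magnitudes discards rotation and start point.
    scale = max(mags[1], mags[-1])
    return mags[1:] / scale


def pca_reduce(X, n_components):
    """Project rows of X onto the leading principal components via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = S**2 / np.sum(S**2)  # fraction of variance per component
    return scores, explained[:n_components]


# Hypothetical inputs: a synthetic ellipse stands in for a segmented
# spectrogram object, and random numbers stand in for 17 general features
# measured on 80 sounds.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
ellipse = np.column_stack([2.0 * np.cos(theta), np.sin(theta)])
fd = fourier_descriptors(ellipse, n_points=32)

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 17))
scores, explained = pca_reduce(X, n_components=4)
```

With real data, a scree plot of the explained-variance fractions is one common way to choose how many components to keep; the choice of four components here simply mirrors the count mentioned in the abstract.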

Citation

Pages: 777-787

Number of pages: 11
