A comparison of speaker identification results using features based on cepstrum and Fourier-Bessel expansion

Cited by: 32

Authors:

Gopalan, K [1]

Anderson, TR [1]

Cupples, EJ [1]

Affiliations:

[1] Purdue Univ, Hammond, IN 46323 USA

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1999, Vol. 7, No. 3

Keywords:

Bessel functions; cepstral features; Fourier-Bessel expansion; speaker identification

DOI:

10.1109/89.759036

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

A compact representation of speech is possible using Bessel functions because of the similarity between voiced speech and the Bessel functions: both exhibit quasiperiodicity and an amplitude that decays with time. This paper presents the results of speaker identification experiments using features obtained from 1) the Fourier-Bessel expansion and 2) the cepstral representation of speech frames. Identification scores of 65% and 76% were achieved using features based on the $J_1(t)$ expansion of air-to-ground speech transmission databases of 143 and 1054 test utterances, respectively. The corresponding scores for the two databases using cepstral coefficients of a comparable size were 80% and 88%. A comparison of the two sets of features indicates that $J_1(t)$ can be used to model hearing perception much like the mel cepstral coefficients.
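
The abstract does not spell out how the Fourier-Bessel features are computed, so the following is only a minimal sketch under stated assumptions. It uses the textbook first-order Fourier-Bessel series on a normalized interval 0 < t < 1, where a frame x(t) is approximated by a sum of terms C_m J_1(λ_m t), the λ_m are the positive roots of J_1, and C_m = 2 / J_2(λ_m)^2 times the integral of t x(t) J_1(λ_m t). The function names (`bessel_j`, `j1_root`, `fourier_bessel_coeffs`), the midpoint sampling of the frame, and the use of the raw coefficients as features are illustrative choices, not details taken from the paper.

```python
import math


def bessel_j(order, x, steps=200):
    """Integer-order Bessel function J_order(x) from Bessel's integral,
    (1/pi) * integral over [0, pi] of cos(order*tau - x*sin(tau)),
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(order * tau - x * math.sin(tau))
    return total / steps


def j1_root(m, iterations=8):
    """m-th positive root of J_1, refined by Newton's method
    starting from the asymptotic guess (m + 1/4) * pi."""
    x = (m + 0.25) * math.pi
    for _ in range(iterations):
        j1 = bessel_j(1, x)
        slope = bessel_j(0, x) - j1 / x  # d/dx J_1(x) = J_0(x) - J_1(x)/x
        x -= j1 / slope
    return x


def fourier_bessel_coeffs(frame, num_coeffs):
    """Approximate C_1..C_M for one frame on the normalized interval (0, 1).

    Assumption: sample i of the frame sits at time (i + 0.5) / len(frame).
    """
    n = len(frame)
    times = [(i + 0.5) / n for i in range(n)]
    coeffs = []
    for m in range(1, num_coeffs + 1):
        root = j1_root(m)
        # Midpoint-rule estimate of the integral of t * x(t) * J_1(root * t).
        integral = sum(
            t * x * bessel_j(1, root * t) for t, x in zip(times, frame)
        ) / n
        coeffs.append(2.0 * integral / bessel_j(2, root) ** 2)
    return coeffs


if __name__ == "__main__":
    # A frame that is exactly the third basis function should map
    # to a coefficient vector close to [0, 0, 1, 0].
    root3 = j1_root(3)
    frame = [bessel_j(1, root3 * (i + 0.5) / 200) for i in range(200)]
    print([round(c, 4) for c in fourier_bessel_coeffs(frame, 4)])
```

In practice a library routine for Bessel functions and their zeros would replace the hand-written `bessel_j` and `j1_root`; they are written out here only to keep the sketch dependency-free.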


Pages: 289-294

Page count: 6

