Analysis of speech segment duration with the lognormal distribution: A basis for unification and comparison

Times cited: 23
Authors
Rosen, KM [1]
Affiliations
[1] Univ Wisconsin, Waisman Ctr, Madison, WI 53705 USA
DOI
10.1016/j.wocn.2005.02.001
Chinese Library Classification (CLC) number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This study re-examines published data with the lognormal distribution (LND) and presents a basis for the unification of many previous measurements of speech segment duration in connected speech. The application of the LND was motivated by the connection between previous speech models and the law of proportionate effects, which is known to generate LNDs. Distributions of speech segment length in previous studies [Psycholinguistics: Experiments in Spontaneous Speech, 1968; Language and Speech 25 (1982) 11-28; Journal of the Acoustical Society of America 72 (1982) 705-716; Journal of the Acoustical Society of America 83 (1988a) 1553-1573; Journal of the Acoustical Society of America 83 (1988b) 1574-1585; Speech Communication 19 (1996) 161-176] were re-plotted onto lognormal cumulative plots. With the exceptions of stressed consonants and the phoneme /f/, the data were consistent with the LND, based on the results of the Kolmogorov-Smirnov test and root mean square error of the least-squares fit. Aside from the exceptions, the results indicate that (1) the duration of pauses, vowels and consonant classes can be effectively modeled with two parameters (geometric mean and geometric standard deviation), and (2) linguistic and non-linguistic effects are proportionate to duration and combine multiplicatively. Analysis with the LND revealed specific characteristics in some of the distributions that were not observed in the original analysis with linear-scaled distributions. Examples of how the LND may be used to detect heterogeneous groups in data sets, to determine outliers, and to reveal differences in underlying processes (e.g., existence of incompressible portions) are given. Advantages of using LND parameters (i.e., geometric mean, geometric standard deviation) over linear parameters (e.g., coefficient of variation) are also discussed. (c) 2005 Elsevier Ltd. All rights reserved.
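The two-parameter description and the multiplicative-effects argument in the abstract can be illustrated with a minimal sketch. It is not the author's procedure and uses no data from the paper: the durations are synthetic, generated by multiplying a hypothetical base duration by several small random factors (the law of proportionate effects), and the function names `lognormal_params`, `ks_statistic_lognormal` and `simulate_duration` are assumptions made for illustration. The Kolmogorov-Smirnov distance is computed against a lognormal whose parameters are estimated from the same sample, so it should be read as a rough goodness-of-fit indicator rather than a formal test.

```python
import math
import random


def lognormal_params(durations):
    """Return (geometric mean, geometric standard deviation) of positive durations."""
    logs = [math.log(d) for d in durations]
    n = len(logs)
    mu = sum(logs) / n
    # Sample variance of the log-durations (n - 1 in the denominator).
    var = sum((x - mu) ** 2 for x in logs) / (n - 1)
    return math.exp(mu), math.exp(math.sqrt(var))


def ks_statistic_lognormal(durations, gm, gsd):
    """Kolmogorov-Smirnov distance between the sample and a lognormal(gm, gsd)."""
    mu, sigma = math.log(gm), math.log(gsd)
    xs = sorted(durations)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        z = (math.log(x) - mu) / sigma
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        # Compare the model CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def simulate_duration(rng, base_ms=80.0, n_effects=12, spread=0.15):
    """Hypothetical generator: a base duration scaled by many small independent factors."""
    duration = base_ms
    for _ in range(n_effects):
        duration *= 1.0 + rng.uniform(-spread, spread)
    return duration


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = [simulate_duration(rng) for _ in range(2000)]
gm, gsd = lognormal_params(sample)
ks = ks_statistic_lognormal(sample, gm, gsd)
print(f"geometric mean = {gm:.1f} ms, geometric SD = {gsd:.3f}, KS distance = {ks:.4f}")
```

Because the geometric standard deviation is a dimensionless ratio, it can be compared across segment classes with very different typical durations, which is the kind of comparison the abstract contrasts with the coefficient of variation.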
Pages: 411-426
Number of pages: 16
Related papers
50 in total
[1]   SPEAKING RATE AND SPEECH MOVEMENT VELOCITY PROFILES [J].
ADAMS, SG ;
WEISMER, G ;
KENT, RD .
JOURNAL OF SPEECH AND HEARING RESEARCH, 1993, 36 (01) :41-54
[2]  
Aitchison J., 1957, The lognormal distribution with special reference to its uses in economics
[3]   Stochastic modeling of earthquake occurrences and estimation of seismic hazard: a random field approach [J].
Akkaya, AD ;
Yücemen, MS .
PROBABILISTIC ENGINEERING MECHANICS, 2002, 17 (01) :1-13
[4]  
BRIGO D, 2001, FINANCE STOCK, V4, P147
[5]  
Campbell Nick., 1992, Bailly et alii, P211
[6]   SEGMENT DURATIONS IN A SYLLABLE FRAME [J].
CAMPBELL, WN ;
ISARD, SD .
JOURNAL OF PHONETICS, 1991, 19 (01) :37-47
[7]  
Canavos G.C., 1984, Applied Probability and Statistical Methods
[8]  
COKER CH, 1968, SPEECH SYNTHESIS, P135
[9]  
Crow E. L., 1988, Lognormal Distributions: Theory and Applications
[10]   SEGMENTAL DURATIONS IN CONNECTED SPEECH SIGNALS - PRELIMINARY-RESULTS [J].
CRYSTAL, TH ;
HOUSE, AS .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1982, 72 (03) :705-716