M = Syntax plus Prosody: A syntactic-prosodic labelling scheme for large spontaneous speech databases

Cited by: 22
Authors
Batliner, A
Kompe, R
Kiessling, A
Mast, M
Niemann, H
Noth, E
Affiliations
[1] Univ Erlangen Nurnberg, Lehrstuhl Mustererkennung, D-91058 Erlangen, Germany
[2] Sony Stuttgart Technol Ctr, D-70736 Fellbach, Germany
[3] Ericsson Eurolab Deutschland GMBH, D-90411 Nurnberg, Germany
[4] IBM Heidelberg Sci Ctr, D-69115 Heidelberg, Germany
Keywords
syntax; dialogue; prosody; phrase boundaries; prosodic labelling; automatic boundary classification; spontaneous speech; large databases; neural networks; stochastic language models
DOI
10.1016/S0167-6393(98)00037-5
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In automatic speech understanding, division of continuous running speech into syntactic chunks is a great problem. Syntactic boundaries are often marked by prosodic means. For the training of statistical models for prosodic boundaries large databases are necessary. For the German Verbmobil (VM) project (automatic speech-to-speech translation), we developed a syntactic-prosodic labelling scheme where different types of syntactic boundaries are labelled for a large spontaneous speech corpus. This labelling scheme is presented and compared with other labelling schemes for perceptual-prosodic, syntactic, and dialogue act boundaries. Interlabeller consistencies and estimation of effort needed are discussed. We compare the results of classifiers (multi-layer perceptrons (MLPs) and n-gram language models) trained on these syntactic-prosodic boundary labels with classifiers trained on perceptual-prosodic and pure syntactic labels. The main advantage of the rough syntactic-prosodic labels presented in this paper is that large amounts of data can be labelled with relatively little effort. The classifiers trained with these labels turned out to be superior with respect to purely prosodic or syntactic labelling schemes, yielding recognition rates of up to 96% for the two-class problem 'boundary versus no boundary'. The use of boundary information leads to a marked improvement in the syntactic processing of the VM system. © 1998 Elsevier Science B.V. All rights reserved.
Pages: 193-222
Number of pages: 30