Updated MINDS Report on Speech Recognition and Understanding, Part 2

Cited by: 28
Authors
Baker, Janet M. [1]
Deng, Li [2, 3]
Khudanpur, Sanjeev [4]
Lee, Chin-Hui [5, 6]
Glass, James R. [7]
Morgan, Nelson [8, 9]
O'Shaughnessy, Douglas [10]
Affiliations
[1] Saras Inst, W Newton, MA USA
[2] Univ Washington, Seattle, WA 98195 USA
[3] Microsoft Res, Redmond, WA USA
[4] Johns Hopkins Univ, GWC Whiting Sch Engn, Baltimore, MD USA
[5] Georgia Inst Technol, Sch ECE, Atlanta, GA 30332 USA
[6] Bell Labs, Murray Hill, NJ 07974 USA
[7] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[8] Univ Calif Berkeley, ICSI, Res Lab, Berkeley, CA 94720 USA
[9] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[10] Univ Quebec, INRS EMT, Ste Foy, PQ G1V 2M3, Canada
Keywords
BRAIN ACTIVITY; CONSTRAINTS; MODELS; WORDS
DOI
10.1109/MSP.2009.932707
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
This article presents the second part of the updated "MINDS 2006-2007 Report of the Speech Understanding Working Group," which grew out of two workshops entitled "Meeting of the MINDS: Future Directions for Human Language Technology." The topics discussed include: the fundamental science of human speech perception and production; the progression from transcription to meaning extraction; understanding cortical speech and language processing; heterogeneous knowledge sources for automatic speech recognition; the information-bearing elements of the speech signal; novel computational architectures for knowledge-rich speech recognition; adaptation and self-learning in speech recognition systems; robustness and context-awareness in acoustic models for speech recognition; the speaker's acoustic environment and the speech acquisition channel; speaker characteristics and style; language characteristics; robust speech recognition in everyday environments; and, finally, novel search procedures for knowledge-rich speech recognition.
Pages: 78-85
Number of pages: 8
Related papers
68 in total
[21]  
De Wachter M., 2003, 8th European conference on speech communication and technology- Eurospeech 2003, P1133
[22]  
Deng L., 2003, Speech processing: a dynamic and optimization-oriented approach
[23]   Structured speech modeling [J].
Deng, Li ;
Yu, Dong ;
Acero, Alex .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05) :1492-1504
[24]   Speech Recognition Using Hidden Markov Models with Polynomial Regression Functions as Nonstationary States [J].
Deng, Li ;
Aksmanovic, Mike ;
Sun, Xiaodong ;
Wu, C. F. Jeff .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (04) :507-520
[25]  
FARRELL K, 1992, P IEEE INT C AC SPEE, P285
[26]  
Fillmore C.J., 2002, LREC
[27]  
FRANKEL J, 2001, P EUR DENM, P599, DOI 10.1109/TSA.2005.851910
[28]   Speech recognition using linear dynamic models [J].
Frankel, Joe ;
King, Simon .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (01) :246-256
[29]  
GAUVAIN JL, 1997, IEEE T SPEECH AUDIO, P711
[30]   MAPPING FUNCTION IN THE HUMAN BRAIN WITH MAGNETOENCEPHALOGRAPHY, ANATOMICAL MAGNETIC-RESONANCE-IMAGING, AND FUNCTIONAL MAGNETIC-RESONANCE-IMAGING [J].
GEORGE, JS ;
AINE, CJ ;
MOSHER, JC ;
SCHMIDT, DM ;
RANKEN, DM ;
SCHLITT, HA ;
WOOD, CC ;
LEWINE, JD ;
SANDERS, JA ;
BELLIVEAU, JW .
JOURNAL OF CLINICAL NEUROPHYSIOLOGY, 1995, 12 (05) :406-431