Near-videorealistic synthetic talking faces: implementation and evaluation

Cited by: 23
Authors
Theobald, BJ
Bangham, JA
Matthews, IA
Cawley, GC
Affiliations
[1] Univ E Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
[2] Carnegie Mellon, Inst Robot, Pittsburgh, PA 15123 USA
Keywords
talking faces; shape and appearance models; avatars; dynamic textures
DOI
10.1016/j.specom.2004.07.002
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The application of two-dimensional (2D) shape and appearance models to the problem of creating realistic synthetic talking faces is presented. A sample-based approach is adopted, where the face of a talker articulating a series of phonetically balanced training sentences is mapped to a trajectory in a low-dimensional model-space that has been learnt from the training data. Segments extracted from this trajectory corresponding to the synthesis units (e.g. triphones) are temporally normalised, blended, concatenated and smoothed to form a new trajectory, which is mapped back to the image domain to provide a natural, realistic sequence corresponding to the desired (arbitrary) utterance. The system has undergone early subjective evaluation to determine the naturalness of this synthesis approach. Described are tests to determine the suitability of the parameter smoothing method used to remove discontinuities introduced during synthesis at the concatenation boundaries, and tests used to determine how well long term coarticulation effects are reproduced during synthesis using the adopted unit selection scheme. The system has been extended to animate the face of a 3D virtual character (avatar) and this is also described. (C) 2004 Elsevier B.V. All rights reserved.
Pages: 127-140
Page count: 14