A Framework for Recognizing the Simultaneous Aspects of American Sign Language

Cited by: 171
Authors
Vogler, C [1]
Metaxas, D [1]
Affiliations
[1] Univ Penn, Dept Comp & Informat Sci, Vis Anal & Simulat Technol Lab, Philadelphia, PA 19104 USA
Keywords
sign language recognition; gesture recognition; human motion modeling; hidden Markov models
DOI
10.1006/cviu.2000.0895
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The major challenge that faces American Sign Language (ASL) recognition now is developing methods that will scale well with increasing vocabulary size. Unlike in spoken languages, phonemes can occur simultaneously in ASL. The number of possible combinations of phonemes is approximately 1.5 × 10^9, which cannot be tackled by conventional hidden Markov model-based methods. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. In this paper we present a novel framework for ASL recognition that aspires to be a solution to the scalability problems. It is based on breaking down the signs into their phonemes and modeling them with parallel hidden Markov models. These model the simultaneous aspects of ASL independently. Thus, they can be trained independently and do not require consideration of the different combinations at training time. We show in experiments with a 22-sign vocabulary how to apply this framework in practice. We also show that parallel hidden Markov models outperform conventional hidden Markov models. © 2001 Academic Press.
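The independence idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the channel names (`right_hand`, `left_hand`), the sign labels, the toy model parameters, and the equal weighting of channels are all assumptions made for illustration. Each channel gets its own small discrete HMM, each is scored on its own observation stream with the forward algorithm, and the per-channel log-likelihoods are simply added. Under this assumption, the number of models to train grows with the sum of the per-channel phoneme counts rather than with their product.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete-output HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)            # rescale to avoid numerical underflow
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

def parallel_loglik(channel_obs, channel_models):
    """Assumption: channels are independent, so their log-likelihoods add."""
    return sum(
        forward_loglik(channel_obs[name], *channel_models[name])
        for name in channel_models
    )

def classify(channel_obs, signs):
    """Pick the sign whose per-channel models jointly score highest."""
    return max(signs, key=lambda sign: parallel_loglik(channel_obs, signs[sign]))

# Hypothetical two-state left-to-right models over a binary symbol alphabet.
# Each model is a (start, transition, emission) triple.
TRANS = [[0.5, 0.5], [0.0, 1.0]]
rise = ([1.0, 0.0], TRANS, [[0.9, 0.1], [0.1, 0.9]])  # mostly 0s, then 1s
fall = ([1.0, 0.0], TRANS, [[0.1, 0.9], [0.9, 0.1]])  # mostly 1s, then 0s

# Hypothetical signs, each described by one model per channel.
signs = {
    "SIGN_A": {"right_hand": rise, "left_hand": fall},
    "SIGN_B": {"right_hand": fall, "left_hand": rise},
}

obs = {"right_hand": [0, 0, 1, 1], "left_hand": [1, 1, 0, 0]}
print(classify(obs, signs))
```

Because `rise` and `fall` are shared across signs, a new sign that combines existing channel phonemes needs no new model parameters in this sketch, which is the scaling property the abstract highlights.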
Pages: 358-384
Number of pages: 27