Extraction of 2D motion trajectories and its application to hand gesture recognition

Cited by: 206
Authors
Yang, MH
Ahuja, N
Tabb, M
Affiliations
[1] Honda Fundamental Res Labs, Mountain View, CA 94041 USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[3] Univ Illinois, Beckman Inst, Urbana, IL 61801 USA
[4] Vexcel Corp, Boulder, CO 80301 USA
Keywords
motion segmentation; motion analysis; motion trajectory; American Sign Language; hand gesture recognition; time-delay neural network
DOI
10.1109/TPAMI.2002.1023803
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an algorithm for extracting and classifying two-dimensional motion in an image sequence based on motion trajectories. First, a multiscale segmentation is performed to generate homogeneous regions in each frame. Regions between consecutive frames are then matched to obtain two-view correspondences. Affine transformations are computed from each pair of corresponding regions to define pixel matches. Pixel matches over consecutive image pairs are concatenated to obtain pixel-level motion trajectories across the image sequence. Motion patterns are learned from the extracted trajectories using a time-delay neural network. We apply the proposed method to recognize 40 hand gestures of American Sign Language. Experimental results show that motion patterns of hand gestures can be extracted and recognized accurately using motion trajectories.
Pages: 1061-1074
Page count: 14