Comparing human and automatic face recognition performance

Cited by: 22
Authors
Adler, Andy [1 ]
Schuckers, Michael E.
Affiliations
[1] Carleton Univ, Ottawa, ON K1S 5B6, Canada
[2] St Lawrence Univ, Math Comp Sci & Stat Dept, Canton, NY 13617 USA
[3] W Virginia Univ, Ctr Identificat Technol Res, Morgantown, WV 26506 USA
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2007 / Vol. 37 / No. 5
Funding
Natural Sciences and Engineering Research Council of Canada; US National Science Foundation;
Keywords
biometrics; detection error tradeoff; face recognition; performance analysis;
DOI
10.1109/TSMCB.2007.907036
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline code
0812;
Abstract
Face recognition technologies have seen dramatic improvements in performance over the past decade, and such systems are now widely used for security and commercial applications. Since recognizing faces is a task that humans are understood to be very good at, it is common to want to compare automatic face recognition (AFR) and human face recognition (HFR) in terms of biometric performance. This paper addresses this question by: 1) conducting verification tests on volunteers (HFR) and commercial AFR systems and 2) developing statistical methods to support comparison of the performance of different biometric systems. HFR was tested by presenting face-image pairs and asking subjects to classify them on a scale of "Same," "Probably Same," "Not sure," "Probably Different," and "Different"; the same image pairs were presented to AFR systems, and the biometric match score was measured. To evaluate these results, two new statistical evaluation techniques are developed. The first is a new way to normalize match-score distributions, where a normalized match score t̂ is calculated as a function of the angle of the [false match rate, false nonmatch rate] values, represented in polar coordinates about some center. Using this normalization, we develop a second methodology to calculate an average detection error tradeoff (DET) curve and show that this method is equivalent to direct averaging of DET data along each angle from the center. This procedure is then applied to compare the performance of the best AFR algorithms available to us in the years 1999, 2001, 2003, 2005, and 2006 against human scores. Results show that algorithms have dramatically improved in performance over that time. In comparison to the performance of the best AFR system of 2006, 29.2% of human subjects performed better, while 37.5% performed worse.
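The angular DET-averaging idea in the abstract can be sketched in a few lines: each DET curve is a set of [FMR, FNMR] points, which are converted to polar coordinates about a center, interpolated onto a common angle grid, and averaged radius-by-radius. This is a minimal illustration, not the paper's exact procedure; the choice of center (1, 1) and the use of linear interpolation are assumptions made here for demonstration.

```python
import numpy as np

def average_det_curves(curves, center=(1.0, 1.0), n_angles=100):
    """Average DET curves along angles from `center`, as a rough sketch
    of the polar-coordinate averaging described in the abstract.

    `curves` is a list of (N, 2) arrays of [FMR, FNMR] points.
    NOTE: the center (1, 1) and linear interpolation are illustrative
    assumptions, not the paper's prescription.
    """
    # Common grid of angles sweeping the quadrant below-left of the center.
    grid = np.linspace(1e-3, np.pi / 2 - 1e-3, n_angles)
    radii = []
    for c in curves:
        fmr, fnmr = c[:, 0], c[:, 1]
        theta = np.arctan2(center[1] - fnmr, center[0] - fmr)  # angle from center
        r = np.hypot(center[0] - fmr, center[1] - fnmr)        # distance from center
        order = np.argsort(theta)                              # np.interp needs sorted x
        radii.append(np.interp(grid, theta[order], r[order]))
    r_avg = np.mean(radii, axis=0)  # average radius at each angle
    # Convert the averaged polar points back to [FMR, FNMR] coordinates.
    return np.column_stack([center[0] - r_avg * np.cos(grid),
                            center[1] - r_avg * np.sin(grid)])
```

Averaging radii per angle (rather than averaging FNMR per FMR) keeps the averaged curve well defined even where individual DET curves are steep or vertical.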
Pages: 1248-1255
Page count: 8