reCAPTCHA: Human-based character recognition via web security measures

Cited by: 555
Authors
von Ahn, Luis [1]
Maurer, Benjamin [1]
McMillen, Colin [1]
Abraham, David [1]
Blum, Manuel [1]
Affiliation
[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
DOI
10.1126/science.1160379
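Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract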
CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are widespread security measures on the World Wide Web that prevent automated programs from abusing online services. They do so by asking humans to perform a task that computers cannot yet perform, such as deciphering distorted characters. Our research explored whether such human effort can be channeled into a useful purpose: helping to digitize old printed material by asking users to decipher scanned words from books that computerized optical character recognition failed to recognize. We showed that this method can transcribe text with a word accuracy exceeding 99%, matching the guarantee of professional human transcribers. Our apparatus is deployed in more than 40,000 Web sites and has transcribed over 440 million words.
Pages: 1465-1468
Page count: 4