Investigation of the accuracy of search engine hit counts

Cited by: 38
Authors
Uyar, Ahmet [1]
Affiliation
[1] Mersin Univ, Muhendislik Fak, Bilgisayar Muhendisli Bolumu, Mersin, Turkey
Keywords
document indexing and counting; search engine evaluation; webometrics; WEB;
DOI
10.1177/0165551509103598
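Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
080201 [Mechanical manufacturing and automation];
Abstract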
This study investigates the accuracy of search engine hit counts for search queries. We investigate the accuracy of hit counts for Google, Yahoo and Microsoft Live Search, and the accuracy of single and multiple term queries. In addition, we investigate the consistency of hit count estimates for 15 days. The results show that all three provide estimates for the number of matching documents and the estimation patterns of their counting algorithms differ greatly. The accuracy of hit counts for multiple word queries has not been studied before. The results of our study show that the number of words in queries affects the accuracy of estimations significantly. The percentages of accurate hit count estimations are reduced almost by half when going from single word to two word query tests in all three search engines. With the increase in the number of query words, the error in estimation increases and the number of accurate estimations decreases.
Pages: 469-480
Page count: 12
Related papers
28 in total
[1]
Scientific research activity and communication measured with cybermetrics indicators [J].
Aguillo, Isidro F. ;
Granadino, Begona ;
Ortega, Jose L. ;
Prieto, Jose A. .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2006, 57 (10) :1296-1302
[2]
Anagnostopoulos A., 2005, WWW '05: Proceedings of the 14th International Conference on the World Wide Web, P245
[3]
[Anonymous], COMPUTER NETWORKS IS
[4]
How do search engines respond to some non-English queries? [J].
Bar-Ilan, J ;
Gutman, T .
JOURNAL OF INFORMATION SCIENCE, 2005, 31 (01) :13-28
[5]
The use of Web search engines in information science research [J].
Bar-Ilan, J .
ANNUAL REVIEW OF INFORMATION SCIENCE AND TECHNOLOGY, 2004, 38 :231-288
[6]
Evolution, continuity, and disappearance of documents on a specific topic on the web: A longitudinal study of "informetrics" [J].
Bar-Ilan, J ;
Peritz, BC .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2004, 55 (11) :980-990
[7]
Bar-Ilan J., 1999, CYBERMETRICS, V2
[8]
Web search for a planet: The Google cluster architecture [J].
Barroso, LA ;
Dean, J ;
Hölzle, U .
IEEE MICRO, 2003, 23 (02) :22-28
[9]
Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '04), P137
[10]
Fetterly D, 2003, P 12 INT C WORLD WID