How much information do search engines disclose on the links to a web page? A longitudinal case study of the 'cybermetrics' home page

Times cited: 17
Authors
Bar-Ilan, J [1]
Affiliations
[1] Hebrew Univ Jerusalem, Sch Lib Arch & Informat Studies, IL-91904 Jerusalem, Israel
DOI
10.1177/016555150202800602
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This study presents the results of an extensive search for links to the home page of the e-journal Cybermetrics. The results show that the search engines do not retrieve all the link pages that are indexed by them. In the specific case, the search engine Google concealed between 48 and 70% of the links to the page each time it was queried, and HotBot concealed between 20 and 39% of the link pages indexed by it. The queries were repeated four times during a one-year period, between January 2001 and January 2002 in order to rule out the possibility of an accidental finding. The other search engines examined also concealed some pages but to a much smaller extent. The findings raise questions about the use of WIF (the Web Impact Factor) as a scientometric indicator based on data retrieved from commercial search engines. The content of the retrieved and concealed pages was characterized using the method of content analysis. The characterization shows that the set of initially retrieved pages, and the set of initially retrieved pages plus the set of concealed pages, are significantly different for Google.
Pages: 455-466
Number of pages: 12