Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment

Cited by: 285
Authors
Leydesdorff, Loet [1]
Vaughan, Liwen
Affiliations
[1] Univ Lausanne, Sch Econ HEC, CH-1015 Lausanne, Switzerland
[2] Univ Amsterdam, Amsterdam Sch Commun Res, NL-1012 CX Amsterdam, Netherlands
[3] Univ Western Ontario, Fac Informat & Media Studies, London, ON N6A 5B7, Canada
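Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2006, Vol. 57, No. 12
Keywords
DOI
10.1002/asi.20335
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract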
Co-occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co-occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.
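The distinction the abstract draws can be illustrated with a small sketch. The data below are hypothetical toy values, not taken from the article, and the names `occurrence`, `cooccurrence`, and `cosine_between_columns` are assumptions made for illustration only. The sketch assumes an asymmetrical occurrence matrix with citing documents as rows and cited authors as columns, derives the symmetrical co-occurrence matrix from it, and computes the cosine from the asymmetrical matrix, as the abstract recommends.

```python
import numpy as np

# Hypothetical toy data: rows are citing documents, columns are cited authors.
# A 1 means the document cites the author. This is the asymmetrical matrix.
occurrence = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
])

# Symmetrical co-occurrence (cocitation) matrix: entry (i, j) counts the
# documents that cite both author i and author j.
cooccurrence = occurrence.T @ occurrence


def cosine_between_columns(matrix):
    """Return the cosine similarity between every pair of columns."""
    norms = np.linalg.norm(matrix, axis=0)
    return (matrix.T @ matrix) / np.outer(norms, norms)


# Proximity matrix derived from the asymmetrical matrix, not from the
# co-occurrence matrix.
proximity = cosine_between_columns(occurrence)

print(cooccurrence)
print(np.round(proximity, 3))
```

In this sketch the co-occurrence matrix already holds the raw joint counts, and the cosine normalizes those counts by the column norms of the occurrence matrix. Applying the cosine a second time to `cooccurrence` itself would be the step the abstract argues against.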
Pages: 1616-1628
Page count: 13
Related papers
20 in total
[1]   Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient [J].
Ahlgren, P ;
Jarneving, B ;
Rousseau, R .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2003, 54 (06) :550-560
[2]  
Ahlgren P, 2004, J AM SOC INF SCI TEC, V55, P843, DOI 10.1002/asi.20030
[3]   Rejoinder: In defense of formal methods [J].
Ahlgren, P ;
Jarneving, B ;
Rousseau, R .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2004, 55 (10) :936-936
[4]   Pearson's r and author cocitation analysis: A commentary on the controversy [J].
Bensman, SJ .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2004, 55 (10) :935-935
[5]  
JONES WP, 1987, J AM SOC INFORM SCI, V38, P420, DOI 10.1002/(SICI)1097-4571(198711)38:6<420::AID-ASI3>3.0.CO;2-S
[7]   AN ALGORITHM FOR DRAWING GENERAL UNDIRECTED GRAPHS [J].
KAMADA, T ;
KAWAI, S .
INFORMATION PROCESSING LETTERS, 1989, 31 (01) :7-15
[8]   WORDS AND CO-WORDS AS INDICATORS OF INTELLECTUAL ORGANIZATION [J].
LEYDESDORFF, L .
RESEARCH POLICY, 1989, 18 (04) :209-223
[9]  
LEYDESDORFF L, 1987, SCIENTOMETRICS, V11, P291
[10]   COCITATION IN SCIENTIFIC LITERATURE - NEW MEASURE OF RELATIONSHIP BETWEEN 2 DOCUMENTS [J].
SMALL, H .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1973, 24 (04) :265-269