Characterizing and Mining the Citation Graph of the Computer Science Literature

被引:33
作者
An, Yuan [3 ]
Janssen, Jeannette [2 ]
Milios, Evangelos E. [1 ]
机构
[1] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 1W5, Canada
[2] Dalhousie Univ, Dept Math & Stat, Halifax, NS B3H 1W5, Canada
[3] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 1A1, Canada
关键词
Citation graph; Graph connectivity; Networked information spaces; Power law; Small worlds;
D O I
10.1007/s10115-003-0128-3
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Citation graphs representing a body of scientific literature convey measures of scholarly activity and productivity. In this work we present a study of the structure of the citation graph of the computer science literature. Using a web robot we built several topic-specific citation graphs and their union graph from the digital library ResearchIndex. After verifying that the degree distributions follow a power law, we applied a series of graph theoretical algorithms to elicit an aggregate picture of the citation graph in terms of its connectivity. We discovered the existence of a single large weakly-connected and a single large biconnected component, and confirmed the expected lack of a large strongly-connected component. The large components remained even after removing the strongest authority nodes or the strongest hub nodes, indicating that such tight connectivity is widespread and does not depend on a small subset of important nodes. Finally, minimum cuts between authority papers of different areas did not result in a balanced partitioning of the graph into areas, pointing to the need for more sophisticated algorithms for clustering the graph.
引用
收藏
页码:664 / 678
页数:15
相关论文
共 14 条
[1]   Emergence of scaling in random networks [J].
Barabási, AL ;
Albert, R .
SCIENCE, 1999, 286 (5439) :509-512
[2]   Graph structure in the Web [J].
Broder, A ;
Kumar, R ;
Maghoul, F ;
Raghavan, P ;
Rajagopalan, S ;
Stata, R ;
Tomkins, A ;
Wiener, J .
COMPUTER NETWORKS-THE INTERNATIONAL JOURNAL OF COMPUTER AND TELECOMMUNICATIONS NETWORKING, 2000, 33 (1-6) :309-320
[3]   Mining the web's link structure [J].
Chakrabarti, S ;
Dom, BE ;
Kumar, SR ;
Raghavan, P ;
Rajagopalan, S ;
Tomkins, A ;
Gibson, D ;
Kleinberg, J .
COMPUTER, 1999, 32 (08) :60-+
[4]   Visualising semantic spaces and author co-citation networks in digital libraries [J].
Chen, CM .
INFORMATION PROCESSING & MANAGEMENT, 1999, 35 (03) :401-420
[5]  
Chen XY, 2001, CHINESE EDUC SOC, V34, P71
[6]  
DILL S, 2001, 27 INT C VER LARG DA
[7]   Fast approximate graph partitioning algorithms [J].
Even, G ;
Naor, JS ;
Rao, S ;
Schieber, B .
SIAM JOURNAL ON COMPUTING, 1999, 28 (06) :2187-2214
[8]  
Fiduccia C., 1982, P 19 IEEE DES AUT C, P175, DOI [DOI 10.1109/DAC.1982.1585498, 10.1109/DAC.1982.1585498]
[9]  
Garey M. R., 1976, Theoretical Computer Science, V1, P237, DOI 10.1016/0304-3975(76)90059-1