Clustering and identifying temporal trends in document databases

Cited by: 41
Authors
Popescul, A [1]
Flake, GW [1]
Lawrence, S [1]
Ungar, LH [1]
Giles, CL [1]
Affiliations
[1] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
Source
IEEE ADVANCES IN DIGITAL LIBRARIES 2000, PROCEEDINGS | 2000
DOI
10.1109/ADL.2000.848380
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it exploits the underlying regularity often found in hyper-linked document databases. Because of this scalability, we can use our method to study the temporal trends of individual clusters in a statistically meaningful manner. As an example of our approach, we give a summary of the temporal trends found in a scientific literature database with thousands of documents.
Pages: 173-182
Page count: 10