Weighted Hybrid Clustering by Combining Text Mining and Bibliometrics on a Large-Scale Journal Database

Cited by: 54
Authors
Liu, Xinhai [1 ,2 ]
Yu, Shi [1 ]
Janssens, Frizo [1 ]
Glanzel, Wolfgang [3 ,4 ]
Moreau, Yves [1 ]
De Moor, Bart [1 ]
Affiliations
[1] Katholieke Univ Leuven, ESAT SCD, B-3001 Leuven, Belgium
[2] Wuhan Univ Sci & Technol, Coll Informat Sci & Engn, Wuhan 430081, Hubei, Peoples R China
[3] Katholieke Univ Leuven, Ctr R&D Monitoring, Dept Managerial Econ Strategy & Innovat, B-3000 Leuven, Belgium
[4] Hungarian Acad Sci, IRPS, Budapest, Hungary
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2010, Vol. 61, No. 6
Keywords
COMBINED COCITATION; WORD ANALYSIS; SCIENCE; INFORMATION; CONSENSUS
DOI
10.1002/asi.21312
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose a new hybrid clustering framework that combines text mining with bibliometrics for journal set analysis. The framework integrates two different approaches: clustering ensemble and kernel-fusion clustering. To improve the flexibility and efficiency of processing large-scale data, we propose an information-based weighting scheme that leverages the effect of multiple data sources in hybrid clustering. Three different algorithms are extended with the proposed weighting scheme and applied to a large journal set retrieved from the Web of Science (WoS) database. The clustering performance of the proposed algorithms is systematically evaluated using multiple evaluation methods and cross-compared with alternative methods. Experimental results demonstrate that the proposed weighted hybrid clustering strategy outperforms the other methods in both clustering performance and efficiency. The proposed approach also provides a more refined structural mapping of journal sets, which is useful for monitoring and detecting new trends in different scientific fields.
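The kernel-fusion side of such a framework can be illustrated with a minimal sketch. The abstract does not specify the information-based weighting scheme or the three extended algorithms, so everything below is an assumption made for illustration: the toy term and citation matrices, the fixed weights `(0.6, 0.4)`, the cosine kernel, and the plain kernel k-means with farthest-first seeding. The function names `cosine_kernel`, `fuse_kernels`, and `kernel_kmeans` are hypothetical and do not come from the paper.

```python
import numpy as np

def cosine_kernel(X):
    """Cosine-similarity kernel between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    return Xn @ Xn.T

def fuse_kernels(kernels, weights):
    """Weighted linear combination of kernels; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))

def kernel_kmeans(K, k, n_iter=50):
    """Plain kernel k-means on a precomputed kernel, with farthest-first seeding."""
    n = K.shape[0]
    diag = np.diag(K)
    # Squared feature-space distance between every pair of items.
    D = diag[:, None] + diag[None, :] - 2.0 * K
    seeds = [0]
    while len(seeds) < k:
        # Next seed: the item farthest from its nearest existing seed.
        seeds.append(int(D[:, seeds].min(axis=1).argmax()))
    labels = D[:, seeds].argmin(axis=1)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue  # an empty cluster stays unreachable
            # ||phi(x_i) - mu_c||^2 = K_ii - 2*mean_j K_ij + mean_jl K_jl
            dist[:, c] = (diag
                          - 2.0 * K[:, members].sum(axis=1) / m
                          + K[np.ix_(members, members)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy data (assumed): 4 journals described by term counts and by citation counts.
text = np.array([[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]])
cite = np.array([[3, 1, 0, 0], [1, 3, 0, 0], [0, 0, 3, 1], [0, 0, 1, 3]])

# Placeholder weights; the paper derives its weights from an information-based scheme.
K = fuse_kernels([cosine_kernel(text), cosine_kernel(cite)], weights=(0.6, 0.4))
labels = kernel_kmeans(K, k=2)
```

In this sketch the two data sources enter the clustering only through the fused kernel `K`, so changing the weights shifts how much the textual and the citation views each contribute to the final partition.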
Pages: 1105-1119
Page count: 15