Time gap analysis by the topic model-based temporal technique

Cited by: 31
Authors
Jeong, Do-Heon [1]
Song, Min [2]
Affiliations
[1] KISTI, Taejon 305806, South Korea
[2] Yonsei Univ, Seoul 120749, South Korea
Funding
National Research Foundation, Singapore
Keywords
Text mining; Topic modeling; Latent Dirichlet Allocation (LDA); Content analysis; Temporal analysis; Multiple resources; BIBLIOMETRIC ANALYSIS; INNOVATION; FIELD
DOI
10.1016/j.joi.2014.07.005
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This study proposes a temporal analysis method that uses heterogeneous resources such as papers, patents, and web news articles in an integrated manner. We analyzed the time gap phenomena among three resources and between two academic areas by conducting text mining-based content analysis. To this end, a topic modeling technique, Latent Dirichlet Allocation (LDA), was used to estimate the optimal time gaps among the three resources (papers, patents, and web news articles) in two research domains. The contributions of this study are as follows. First, we propose a new temporal analysis method for understanding the content characteristics and trends of multiple heterogeneous resources in an integrated manner. We applied it to measure the exact time intervals between academic areas by examining the time gap phenomena. The temporal analysis showed that resources in the medical field were more up to date than those in the computer field and were therefore disclosed to the public more promptly. Second, we adopted a power-law exponent measurement and content analysis to evaluate the proposed method. With the proposed method, we demonstrate how to analyze heterogeneous resources more precisely and comprehensively. © 2014 Elsevier Ltd. All rights reserved.
Pages: 776-790
Page count: 15