Text and Structural Data Mining of Influenza Mentions in Web and Social Media

Cited by: 139
Authors
Corley, Courtney D. [1]
Cook, Diane J. [2]
Mikler, Armin R. [3]
Singh, Karan P. [4]
Affiliations
[1] Pacific NW Natl Lab, Richland, WA 99352 USA
[2] Washington State Univ, Sch Elect Engn & Comp Sci, Pullman, WA 99164 USA
[3] Univ N Texas, Dept Comp Sci & Engn, Denton, TX 76203 USA
[4] Univ N Texas, Hlth Sci Ctr, Dept Biostat, Ft Worth, TX 76107 USA
Funding
US National Science Foundation;
Keywords
disease surveillance; public health epidemiology; health informatics; graph-based data mining; web and social media; social network analysis;
DOI
10.3390/ijerph7020596
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Subject classification code
08; 0830;
Abstract
Text and structural data mining of web and social media (WSM) provides a novel disease surveillance resource and can identify online communities for targeted public health communications (PHC) to assure wide dissemination of pertinent information. WSM that mention influenza are harvested over a 24-week period, 5 October 2008 to 21 March 2009. Link analysis reveals communities for targeted PHC. Text mining is shown to identify trends in flu posts that correlate to real-world influenza-like illness patient report data. We also bring to bear a graph-based data mining technique to detect anomalies among flu blogs connected by publisher type, links, and user-tags.
Pages: 596-615
Page count: 20