OMG U got flu? Analysis of shared health messages for bio-surveillance

Cited by: 56
Authors
Collier N. [1, 2]
Son N.T. [3]
Nguyen N.M. [3]
Affiliations
[1] National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo
[2] Japan Science and Technology Agency, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo
[3] Vietnam National University at HCMC, Ho Chi Minh City
Keywords
Influenza; Avian Influenza; Influenza Season; Influenza Like Illness; Sentinel Network
DOI
10.1186/2041-1480-2-S5-S9
Abstract
Background: Micro-blogging services such as Twitter offer the potential to crowdsource epidemic detection in real time. However, Twitter posts ('tweets') are often ambiguous and reactive to media trends. To ground user messages in epidemic response, we focused on tracking reports of self-protective behaviour, such as avoiding public gatherings or increased sanitation, as the basis for further risk analysis.
Results: We created guidelines for tagging self-protective behaviour based on the behaviour response survey of Jones and Salathé (2009). Applying the guidelines to a corpus of 5283 Twitter messages related to influenza-like illness showed a high level of inter-annotator agreement (kappa 0.86). We employed supervised learning with unigrams, bigrams and regular expressions as features, using two classifiers (SVM and Naive Bayes) to classify tweets into four self-reported protective behaviour categories plus a self-reported diagnosis. In addition to classification performance, we report a moderately strong Spearman's rho correlation between classifier output and WHO/NREVSS laboratory data for A(H1N1) in the USA during the 2009-2010 influenza season.
Conclusions: The study adds to evidence supporting a high degree of correlation between pre-diagnostic social media signals and diagnostic influenza case data, pointing the way towards low-cost sensor networks. We believe that the signals we have modelled may be applicable to a wide range of diseases. © 2011 Collier et al; licensee BioMed Central Ltd.
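The abstract reports two statistics: inter-annotator agreement as kappa, and Spearman's rho between classifier output and laboratory data. The following is a minimal sketch of how such figures can be computed, not the authors' code. It assumes Cohen's kappa for two annotators (the abstract does not say which kappa variant was used), and the function names `cohen_kappa`, `average_ranks` and `spearman_rho` are hypothetical helpers introduced here for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators; the abstract does not name the kappa variant.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5
```

In a setting like the one described, `spearman_rho` would be applied to two weekly series of equal length, for example the number of tweets a classifier assigns to a category per week and the laboratory-confirmed case count for the same weeks.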
Related papers
22 in total
[1] Earle P., Earthquake Twitter, Nature Geoscience, 3, 4, pp. 221-222, (2010)
[2] Sakaki T., Okazaki M., Matsuo Y., Earthquake shakes Twitter users: real-time event detection by social sensors, Proc. of the 19th International World Wide Web Conference, pp. 851-860, (2010)
[3] Dalton C., Durrheim D., Fejsa J., Francis L., Carlson S., Tursan d'Espaignet E., Flutracking: A weekly Australian community online survey of influenza-like illness in 2006, 2007 and 2008, Communicable Diseases Intelligence, 33, 3, pp. 316-322, (2009)
[4] Okolloh O., Ushahidi, or 'testimony': Web 2.0 tools for crowdsourcing crisis information, Participatory Learning and Action, 59, pp. 65-70, (2009)
[5] Cheng C.K., Lau E.H., Ip D.K., Yeung A.S., Ho L.M., Cowling B.J., A profile of the online dissemination of national influenza surveillance data, BMC Public Health, 9, (2009)
[6] Hartley D., Nelson N., Walters R., Arthur R., Yangarber R., Madoff L., Linge J., Mawudeku A., Collier N., Brownstein J., Thinus G., Lightfoot N., The landscape of international biosurveillance, Emerging Health Threats J., 3, e3, (2010)
[7] Collier N., What's unusual in online disease outbreak news?, Journal of Biomedical Semantics, 1, (2010)
[8] Ginsberg J., Mohebbi M., Patel R., Brammer L., Smolinski M., Brilliant L., Detecting influenza epidemics using search engine query data, Nature, 457, pp. 1012-1014, (2009)
[9] Jones J., Salathé M., Early assessment of anxiety and behavioral response to novel swine-origin influenza A(H1N1), PLoS One, 4, 12, (2009)
[10] Lampos V., De Bie T., Cristianini N., Flu Detector - Tracking Epidemics on Twitter, Machine Learning and Knowledge Discovery in Databases, 6223, pp. 599-602, (2010)