A context-sensitive approach to anonymizing spatial surveillance data: Impact on outbreak detection

Cited by: 70
Authors
Cassa, CA
Grannis, SJ
Overhage, JM
Mandl, KD
Affiliations
[1] Childrens Hosp Boston, Informat Program, Mandl Grp, Boston, MA 02215 USA
[2] MIT, Clin Decis Making Grp, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[3] Harvard Univ, MIT, Div Hlth Sci & Technol, Cambridge, MA 02138 USA
[4] Indiana Univ, Sch Med, Indianapolis, IN 46204 USA
[5] Regenstrief Inst Inc, Indianapolis, IN USA
[6] Harvard Univ, Sch Med, Boston, MA 02115 USA
DOI
10.1197/jamia.M1920
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: The use of spatially based methods and algorithms in epidemiology and surveillance presents privacy challenges for researchers and public health agencies. We describe a novel method for anonymizing individuals in public health data sets by transposing their spatial locations through a process informed by the underlying population density. We also measure the impact of this spatial skew on the detection of spatial clustering, as measured by a spatial scan statistic.
Design: Cases were emergency department (ED) visits for respiratory illness. Baseline ED visit data were injected with artificially created clusters ranging in magnitude, shape, and location. The geocoded locations were then transformed using a de-identification algorithm that accounts for the local underlying population density.
Measurements: A total of 12,600 separate weeks of case data with artificially created clusters were combined with control data, and the impact on detection of spatial clustering, as identified by a spatial scan statistic, was measured.
Results: The anonymization algorithm produced an expected skew of cases that resulted in high values of data set k-anonymity. De-identification that moves points an average distance of 0.25 km lowers the spatial cluster detection sensitivity by less than 4% and lowers the detection specificity by less than 1%.
Conclusion: A population-density-based Gaussian spatial blurring markedly decreases the ability to identify individuals in a data set while only slightly decreasing the performance of a commonly used outbreak detection tool. These findings suggest new approaches to anonymizing data for spatial epidemiology and surveillance.
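The abstract names the technique (Gaussian blurring scaled by local population density) but does not give the scaling rule. The following is a minimal sketch of the general idea only, not the authors' algorithm. The function names, the `k_target` parameter, and the rule that sets the standard deviation so that a disc of that radius holds about `k_target` residents are all assumptions made for illustration.

```python
import math
import random


def blur_sigma_km(density_per_km2, k_target=50.0):
    """Hypothetical rule: pick sigma so a disc of radius sigma holds ~k_target residents.

    Denser areas therefore get a smaller displacement scale.
    """
    if density_per_km2 <= 0:
        raise ValueError("density must be positive")
    return math.sqrt(k_target / (math.pi * density_per_km2))


def blur_point(x_km, y_km, density_per_km2, rng, k_target=50.0):
    """Displace one point with independent Gaussian noise on each axis."""
    sigma = blur_sigma_km(density_per_km2, k_target)
    return x_km + rng.gauss(0.0, sigma), y_km + rng.gauss(0.0, sigma)


def mean_displacement_km(density_per_km2, n=20000, seed=0, k_target=50.0):
    """Estimate the average distance a point is moved at a given density."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        bx, by = blur_point(0.0, 0.0, density_per_km2, rng, k_target)
        total += math.hypot(bx, by)
    return total / n
```

For isotropic two-dimensional Gaussian noise the displacement distance follows a Rayleigh distribution with mean sigma * sqrt(pi / 2), so an average move of 0.25 km corresponds to a sigma of about 0.2 km under this sketch's assumptions.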
Pages: 160-165
Page count: 6