A context-sensitive approach to anonymizing spatial surveillance data: Impact on outbreak detection

Cited by: 70
Authors
Cassa, CA
Grannis, SJ
Overhage, JM
Mandl, KD
Affiliations
[1] Childrens Hosp Boston, Informat Program, Mandl Grp, Boston, MA 02215 USA
[2] MIT, Clin Decis Making Grp, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[3] Harvard Univ, MIT, Div Hlth Sci & Technol, Cambridge, MA 02138 USA
[4] Indiana Univ, Sch Med, Indianapolis, IN 46204 USA
[5] Regenstrief Inst Inc, Indianapolis, IN USA
[6] Harvard Univ, Sch Med, Boston, MA 02115 USA
DOI
10.1197/jamia.M1920
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: The use of spatially based methods and algorithms in epidemiology and surveillance presents privacy challenges for researchers and public health agencies. We describe a novel method for anonymizing individuals in public health data sets by transposing their spatial locations through a process informed by the underlying population density. Further, we measure the impact of the resulting skew on the detection of spatial clustering, as measured by a spatial scan statistic.
Design: Cases were emergency department (ED) visits for respiratory illness. Baseline ED visit data were injected with artificially created clusters varying in magnitude, shape, and location. The geocoded locations were then transformed using a de-identification algorithm that accounts for the local underlying population density.
Measurements: A total of 12,600 separate weeks of case data with artificially created clusters were combined with control data, and the impact on detection of spatial clustering identified by a spatial scan statistic was measured.
Results: The anonymization algorithm produced an expected skew of cases that resulted in high values of data set k-anonymity. De-identification that moves points an average distance of 0.25 km lowers the spatial cluster detection sensitivity by less than 4% and lowers the detection specificity by less than 1%.
Conclusion: A population-density-based Gaussian spatial blurring markedly decreases the ability to identify individuals in a data set while only slightly decreasing the performance of a standard outbreak detection tool. These findings suggest new approaches to anonymizing data for spatial epidemiology and surveillance.
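The abstract describes the blurring only in outline, so the following is a minimal Python sketch of one way a population-density-based Gaussian displacement could work, not the authors' algorithm. The function names, the `k_target` parameter, and the rule that sets the blur radius so a circle of radius sigma holds roughly `k_target` residents are all assumptions introduced for illustration.

```python
import math
import random


def blur_sigma_km(density_per_km2, k_target=20):
    """Hypothetical rule: pick sigma so that a circle of radius sigma
    contains about k_target residents at the given local density."""
    if density_per_km2 <= 0:
        raise ValueError("population density must be positive")
    return math.sqrt(k_target / (math.pi * density_per_km2))


def blur_point(x_km, y_km, density_per_km2, k_target=20, rng=None):
    """Displace one geocoded point (planar km coordinates) by independent
    Gaussian noise in x and y, scaled by the local population density."""
    rng = rng or random.Random()
    sigma = blur_sigma_km(density_per_km2, k_target)
    return x_km + rng.gauss(0.0, sigma), y_km + rng.gauss(0.0, sigma)


def mean_displacement_km(density_per_km2, k_target=20, n=5000, seed=0):
    """Empirical average distance moved, for comparison with a target
    such as the 0.25 km figure reported in the abstract."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x, y = blur_point(0.0, 0.0, density_per_km2, k_target, rng)
        total += math.hypot(x, y)
    return total / n
```

Under this assumed rule, points in dense urban areas move only a short distance while points in sparsely populated areas move farther, which is the qualitative behavior the abstract attributes to the method. For independent Gaussian offsets the expected displacement is sigma times the square root of pi/2, so a target average distance can be converted into a sigma directly.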
Pages: 160-165
Number of pages: 6