Using global unique identifiers to link autism collections

Cited by: 61
Authors
Johnson, Stephen B. [1 ,2 ]
Whitney, Glen [2 ]
McAuliffe, Matthew [3 ]
Wang, Hailong [3 ]
McCreedy, Evan [3 ]
Rozenblit, Leon [4 ]
Evans, Clark C. [4 ]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
[2] Simons Fdn, New York, NY USA
[3] NIH, Bethesda, MD 20892 USA
[4] Prometheus Res LLC, New Haven, CT USA
Keywords
PERSONAL IDENTIFIERS;
DOI
10.1136/jamia.2009.002063
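Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract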
Objective: To propose a centralized method for generating global unique identifiers to link collections of research data and specimens.
Design: The work is a collaboration between the Simons Foundation Autism Research Initiative and the National Database for Autism Research. The system is implemented as a web service: an investigator inputs identifying information about a participant into a client application and sends encrypted information to a server application, which returns a generated global unique identifier. The authors evaluated the system using a volume test of one million simulated individuals and a field test on 2000 families (over 8000 individual participants) in an autism study.
Measurements: Inverse probability of hash codes; rate of false identity of two individuals; rate of false split of a single individual; percentage of subjects for which identifying information could be collected; percentage of hash codes generated successfully.
Results: Large-volume simulation generated no false splits or false identity. Field testing in the Simons Foundation Autism Research Initiative Simplex Collection produced identifiers for 96% of children in the study and 77% of parents. On average, four out of five hash codes per subject were generated perfectly (only one perfect hash is required for subsequent matching).
Discussion: The system must achieve balance among the competing goals of distinguishing individuals, collecting accurate information for matching, and protecting confidentiality. Considerable effort is required to obtain approval from institutional review boards, obtain consent from participants, and achieve compliance from sites during a multicenter study.
Conclusion: Generic unique identifiers have the potential to link collections of research data, augment the amount and types of data available for individuals, support detection of overlap between collections, and facilitate replication of research findings.
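The client/server flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that several hash codes are generated per subject and that one perfect hash suffices for matching. The field names, the five field combinations in HASH_RECIPES, the keyed SHA-256 hashing, the SECRET_KEY value, and the GuidServer class are all assumptions made for illustration, and transport encryption between client and server is not shown.

```python
import hashlib
import hmac
import unicodedata
import uuid

# Assumption: a shared key for keyed hashing; the abstract does not specify this.
SECRET_KEY = b"demo-key-not-for-production"

# Assumption: each hash code covers a different subset of identifying fields,
# so an error in one field spoils only some of the codes.
HASH_RECIPES = [
    ("first_name", "last_name", "birth_date"),
    ("first_name", "birth_date", "birth_city"),
    ("last_name", "birth_date", "birth_city"),
    ("first_name", "last_name", "birth_city", "sex"),
    ("first_name", "middle_name", "last_name", "birth_date"),
]


def normalize(value):
    """Lowercase the value and drop accents, spaces, and punctuation."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isascii() and ch.isalnum()).lower()


def client_hash_codes(person):
    """Client side: turn identifying fields into one-way hash codes."""
    codes = []
    for index, recipe in enumerate(HASH_RECIPES):
        values = [normalize(person.get(field, "")) for field in recipe]
        if not all(values):
            continue  # a missing field means this hash code cannot be generated
        message = f"{index}:" + "|".join(values)
        digest = hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256)
        codes.append(digest.hexdigest())
    return codes


class GuidServer:
    """Server side: map hash codes to identifiers; it never sees raw fields."""

    def __init__(self):
        self._guid_by_code = {}

    def resolve(self, codes):
        if not codes:
            raise ValueError("at least one hash code is required")
        # One matching hash code is enough to reuse an existing identifier.
        guid = next(
            (self._guid_by_code[c] for c in codes if c in self._guid_by_code),
            None,
        )
        if guid is None:
            guid = str(uuid.uuid4())
        for code in codes:
            self._guid_by_code.setdefault(code, guid)
        return guid
```

In this sketch, calling resolve(client_hash_codes(person)) twice for the same participant returns the same identifier, even if one field is misspelled the second time, provided at least one field combination is unaffected. The trade-off mentioned in the Discussion is visible here: more field combinations tolerate more data-entry errors, but they also raise the chance of falsely merging two distinct individuals.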
Pages: 689-695
Page count: 7