UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER

Cited by: 125
Authors
Denaxas, Spiros [1 ,2 ,7 ,8 ,9 ]
Gonzalez-Izquierdo, Arturo [1 ,2 ,8 ]
Direk, Kenan [1 ,2 ,8 ]
Fitzpatrick, Natalie K. [1 ,2 ]
Fatemifar, Ghazaleh [1 ,2 ]
Banerjee, Amitava [1 ,2 ,9 ]
Dobson, Richard J. B. [1 ,2 ,3 ,8 ,9 ]
Howe, Laurence J. [4 ]
Kuan, Valerie [2 ,4 ]
Lumbers, R. Tom [1 ,2 ,9 ]
Pasea, Laura [1 ,2 ]
Patel, Riyaz S. [4 ,9 ]
Shah, Anoop D. [1 ,2 ,9 ]
Hingorani, Aroon D. [2 ,4 ]
Sudlow, Cathie [5 ,6 ]
Hemingway, Harry [1 ,2 ,8 ,9 ]
Affiliations
[1] UCL, Inst Hlth Informat, 222 Euston Rd, London NW1 2DA, England
[2] Hlth Data Res UK, London, England
[3] Kings Coll London, Dept Biostat & Hlth Informat, Inst Psychiat Psychol & Neurosci, London, England
[4] UCL, Inst Cardiovasc Sci, London, England
[5] Univ Edinburgh, Ctr Med Informat, Usher Inst Populat Hlth Sci & Informat, Edinburgh, Midlothian, Scotland
[6] Hlth Data Res UK, Edinburgh, Midlothian, Scotland
[7] Alan Turing Inst, London, England
[8] UCL, Natl Inst Hlth Res, Biomed Res Ctr, Univ Coll London Hosp, London, England
[9] UCL, British Heart Fdn Res Accelerator, London, England
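Funding
UK Medical Research Council; Wellcome Trust; EU Horizon 2020; UK Economic and Social Research Council; UK Engineering and Physical Sciences Research Council
Keywords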
electronic health records; phenotyping; medical informatics; personalized medicine; PRACTICE RESEARCH DATABASE; 12 CARDIOVASCULAR-DISEASES; ACUTE MYOCARDIAL-INFARCTION; GENOME-WIDE ASSOCIATION; PRIMARY-CARE; MEDICAL-RECORDS; LIFETIME RISKS; LINKAGE; COHORT; PRESENTATIONS
DOI
10.1093/jamia/ocz105
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Electronic health records (EHRs) are a rich source of information on human diseases, but the information is variably structured, fragmented, curated using different coding systems, and collected for purposes other than medical research. We describe an approach for developing, validating, and sharing reproducible phenotypes from national structured EHRs in the United Kingdom, with applications for translational research.
Materials and Methods: We implemented a rule-based phenotyping framework with up to 6 validation approaches. We applied the framework to a sample of 15 million individuals in a national EHR data source (population-based primary care, all ages) linked to hospitalization and death records in England. Data comprised continuous measurements (for example, blood pressure), medication information, and coded diagnoses, symptoms, procedures, and referrals, recorded using 5 controlled clinical terminologies: (1) Read (primary care, subset of SNOMED-CT [Systematized Nomenclature of Medicine Clinical Terms]), (2) International Classification of Diseases-Ninth Revision and Tenth Revision (secondary care diagnoses and cause of mortality), (3) Office of Population Censuses and Surveys Classification of Surgical Operations and Procedures, Fourth Revision (hospital surgical procedures), and (4) DM+D prescription codes.
Results: Using the CALIBER phenotyping framework, we created algorithms for 51 diseases, syndromes, biomarkers, and lifestyle risk factors and provide up to 6 validation approaches. The EHR phenotypes are curated in the open-access CALIBER Portal (https://www.caliberresearch.org/portal) and have been used by 40 national and international research groups in 60 peer-reviewed publications.
Conclusions: We describe a UK EHR phenomics approach within the CALIBER EHR data platform with initial evidence of validity and use, as an important step toward international use of UK EHR data for health research.
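As a rough illustration of what a rule-based phenotype over linked sources can look like, the sketch below matches coded events against per-source code lists and returns each patient's earliest matching event. It is not the CALIBER implementation: the `Event` record, the `CODE_LISTS` layout, the function names, and the example codes are all assumptions made for illustration, and real phenotypes rely on curated code lists published with each algorithm.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical code lists, keyed by data source and then by terminology.
# These are illustrative placeholders, not CALIBER's curated lists.
CODE_LISTS = {
    "primary_care": {"read": {"G30.."}},
    "hospital": {"icd10": {"I21"}},
    "death": {"icd10": {"I21"}},
}


@dataclass(frozen=True)
class Event:
    """One coded record from a linked source (assumed shape)."""
    patient_id: int
    source: str        # "primary_care", "hospital", or "death"
    terminology: str   # e.g. "read" or "icd10"
    code: str
    event_date: date


def matches(event: Event) -> bool:
    """True if the event's code starts with any listed code for its source."""
    codes = CODE_LISTS.get(event.source, {}).get(event.terminology, set())
    # Prefix matching lets a listed parent code (I21) catch children (I21.0).
    return any(event.code.startswith(c) for c in codes)


def earliest_phenotype_date(events):
    """Map patient_id -> (date, source) of the earliest matching event."""
    first = {}
    for e in events:
        if not matches(e):
            continue
        current = first.get(e.patient_id)
        if current is None or e.event_date < current[0]:
            first[e.patient_id] = (e.event_date, e.source)
    return first
```

Keeping the code lists separate from the matching logic mirrors the idea of sharing phenotype definitions as reusable data; how sources are prioritised or combined in practice is a per-phenotype decision that this sketch does not attempt to capture.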
Pages: 1545-1559
Page count: 15