Disease Comorbidity Linkages between MEDLINE and Patient Data

Cited by: 2
Authors
Anupindi, Tejaswi Rohit [1]
Srinivasan, Padmini [1]
Affiliations
[1] Univ Iowa, Comp Sci, Iowa City, IA 52246 USA
Source
2017 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI) | 2017
DOI
10.1109/ICHI.2017.48
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an analysis of a class of inferred links between MEDLINE and patient data. Records in the two datasets are linked via pairs of associated diseases, with a view to emphasizing disease comorbidities. In MEDLINE, disease pairs are extracted by mining specific patterns such as MeSH disease term 1/etiology and MeSH disease term 2/complications. Our pattern set extracts 701,780 pairs from a 2017 download of MEDLINE containing close to 27 million records. The patient data, obtained from another study, has 6,088,553 disease co-occurrence pairs. Our methodology for inferring connections maps ICD9 codes and MeSH terms to UMLS concept IDs and then applies both exact and approximate matching strategies. The approximate matching strategy uses semantic relations present in the UMLS. We are able to connect 2,478,366 patient disease pairs encoded using 5-digit ICD9 codes to MEDLINE pairs (and therefore to the corresponding documents), and 536,685 MEDLINE disease pairs to the patient disease pairs (and therefore implicitly to the corresponding patient records). While these numbers are large, the percentages of pairs linked are between 43% and 77%. This indicates that other approaches for linking the two datasets would be of interest. Moreover, comorbidity is only one viewpoint among many options. We suggest that the study of inferred links between biomedical datasets - especially between core datasets - is of great value in terms of enriching the biomedical web of knowledge.
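The linking idea described in the abstract (map both vocabularies to shared concept IDs, try an exact match, then fall back to an approximate match through related concepts) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the concept IDs are made-up placeholders rather than real UMLS CUIs, the `related` table stands in for whichever UMLS semantic relations the paper uses, and treating disease pairs as unordered is an assumption made here for simplicity.

```python
from itertools import product

def to_cui_pair(pair, code_to_cui):
    """Map a (code, code) pair to an unordered pair of concept IDs, or None if unmapped."""
    a, b = (code_to_cui.get(code) for code in pair)
    if a is None or b is None:
        return None
    return frozenset((a, b))

def neighbours(cui, related):
    """Return the concept itself plus any concepts listed as related to it."""
    return {cui} | related.get(cui, set())

def link_pairs(patient_pairs, medline_pairs, icd9_to_cui, mesh_to_cui, related):
    """Link each patient disease pair to MEDLINE disease pairs via shared concept IDs."""
    # Index MEDLINE pairs by their concept-ID pair.
    medline_index = {}
    for pair in medline_pairs:
        key = to_cui_pair(pair, mesh_to_cui)
        if key is not None:
            medline_index.setdefault(key, []).append(pair)

    links = {}
    for pair in patient_pairs:
        key = to_cui_pair(pair, icd9_to_cui)
        if key is None:
            continue  # at least one code has no concept mapping
        if key in medline_index:
            links[pair] = ("exact", medline_index[key])
            continue
        # Approximate match: substitute related concepts on either side.
        a, b = icd9_to_cui[pair[0]], icd9_to_cui[pair[1]]
        for x, y in product(neighbours(a, related), neighbours(b, related)):
            hit = medline_index.get(frozenset((x, y)))
            if hit:
                links[pair] = ("approximate", hit)
                break
    return links

# Toy data: all concept IDs below are hypothetical placeholders, not real CUIs.
icd9_to_cui = {"250.00": "CUI_DM2", "401.9": "CUI_HTN", "585.9": "CUI_CKD"}
mesh_to_cui = {
    "Diabetes Mellitus, Type 2": "CUI_DM2",
    "Hypertension": "CUI_HTN",
    "Kidney Diseases": "CUI_KIDNEY",
}
related = {"CUI_CKD": {"CUI_KIDNEY"}}  # assumed relation to a broader concept

patient_pairs = [("250.00", "401.9"), ("250.00", "585.9")]
medline_pairs = [
    ("Diabetes Mellitus, Type 2", "Hypertension"),
    ("Kidney Diseases", "Diabetes Mellitus, Type 2"),
]

links = link_pairs(patient_pairs, medline_pairs, icd9_to_cui, mesh_to_cui, related)
for patient_pair, (kind, matches) in links.items():
    print(patient_pair, kind, matches)
```

In this toy run the first patient pair links exactly, while the second links only after the kidney disease code is widened to a related concept, which mirrors the distinction between the two matching strategies.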
Pages: 403-408
Page count: 6