A System for Classifying Disease Comorbidity Status from Medical Discharge Summaries Using Automated Hotspot and Negated Concept Detection

Cited by: 21
Authors
Ambert, Kyle H. [1]
Cohen, Aaron M. [1]
Affiliations
[1] Oregon Hlth & Sci Univ, Dept Med Informat & Clin Epidemiol, Portland, OR 97239 USA
DOI
10.1197/jamia.M3095
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: Free-text clinical reports are an important part of patient care management and of the clinical documentation of patient disease and treatment status. Free-text notes are commonplace in medical practice, but they remain an under-used source of information for clinical and epidemiological research and for personalized medicine. The authors explore the challenges of automatically extracting information from clinical reports, using their submission to the Informatics for Integrating Biology and the Bedside (i2b2) 2008 Natural Language Processing Obesity Challenge Task. Design: A text mining system for classifying patient comorbidity status based on the information contained in clinical reports. The authors' approach combines several automated techniques: hot-spot filtering, negated concept identification, zero-vector filtering, weighting by inverse class frequency, and error-correcting output codes with linear support vector machines. Measurements: Performance was evaluated in terms of the macroaveraged F1 measure. Results: The automated system performed well against manual expert rule-based systems, finishing fifth in the Challenge's intuitive task and 13th in the textual task. Conclusions: The system demonstrates that effective comorbidity status classification by an automated system is possible.
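The macroaveraged F1 measure named under Measurements can be illustrated with a minimal sketch. This is not the Challenge's official scorer: the function name `macro_f1`, the label strings, and the choice to average over every label seen in either the gold or predicted list are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Macro-averaged F1: compute F1 per class, then average with equal weight.

    Assumption (not from the source): classes are all labels that appear
    in either the gold or the predicted list.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels for one comorbidity: Y = present, N = absent, U = unmentioned
gold = ["Y", "Y", "N", "N", "U"]
pred = ["Y", "N", "N", "N", "U"]
score = macro_f1(gold, pred)
```

Because every class contributes equally to the average, a rare class that is classified poorly lowers the score as much as a common one would, which is why the measure is demanding for skewed label distributions.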
Pages: 590-595
Page count: 6