UNLOCKING CLINICAL-DATA FROM NARRATIVE REPORTS - A STUDY OF NATURAL-LANGUAGE PROCESSING

Cited by: 216
Authors
HRIPCSAK, G
FRIEDMAN, C
ALDERSON, PO
DUMOUCHEL, W
JOHNSON, SB
CLAYTON, PD
Affiliations
[1] CUNY QUEENS COLL, DEPT COMP SCI, FLUSHING, NY 11367 USA
[2] COLUMBIA PRESBYTERIAN MED CTR, DEPT RADIOL, NEW YORK, NY 10032 USA
DOI
10.7326/0003-4819-122-9-199505010-00007
Chinese Library Classification
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Objective: To evaluate the automated detection of clinical conditions described in narrative reports.
Design: Automated methods and human experts detected the presence or absence of six clinical conditions in 200 admission chest radiograph reports.
Study Subjects: A computerized, general-purpose natural language processor; 6 internists; 6 radiologists; 6 lay persons; and 3 other computer methods.
Main Outcome Measures: Intersubject disagreement was quantified by "distance" (the average number of clinical conditions per report on which two subjects disagreed) and by sensitivity and specificity with respect to the physicians.
Results: Using a majority vote, physicians detected 101 conditions in the 200 reports (0.51 per report); the most common condition was acute bacterial pneumonia (prevalence, 0.14), and the least common was chronic obstructive pulmonary disease (prevalence, 0.03). Pairs of physicians disagreed on the presence of at least 1 condition for an average of 20% of reports. The average intersubject distance among physicians was 0.24 (95% CI, 0.19 to 0.29) out of a maximum possible distance of 6. No physician had a significantly greater distance than the average. The average distance of the natural language processor from the physicians was 0.26 (CI, 0.21 to 0.32; not significantly greater than the average among physicians). Lay persons and alternative computer methods had significantly greater distance from the physicians (all >0.5). The natural language processor had a sensitivity of 81% (CI, 73% to 87%) and a specificity of 98% (CI, 97% to 99%); physicians had an average sensitivity of 85% and an average specificity of 98%.
Conclusions: Physicians disagreed on the interpretation of narrative reports, but this was not caused by outlier physicians or a consistent difference in the way internists and radiologists read reports. The natural language processor was not distinguishable from the physicians and was superior to all other comparison subjects. Although the domain of this study was restricted (six clinical conditions in chest radiographs), natural language processing seems to have the potential to extract clinical information from narrative reports in a manner that will support automated decision-support and clinical research.
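The "distance" measure and the sensitivity/specificity comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy data (three raters, four reports, two conditions instead of six), and the choice to score one rater against a single reference rater are all assumptions made for illustration. The abstract does not say exactly how the physician reference was constructed for each comparison.

```python
from itertools import combinations

def distance(a, b):
    """Average number of conditions per report on which raters a and b disagree.

    Each rater is a list of reports; each report is a list of 0/1 flags,
    one per clinical condition (hypothetical encoding).
    """
    disagreements = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return disagreements / len(a)

def mean_pairwise_distance(raters):
    """Average distance over all unordered pairs of raters."""
    pairs = list(combinations(raters, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)

def sensitivity_specificity(test, reference):
    """Score `test` against `reference`, pooling all report/condition cells."""
    tp = fn = tn = fp = 0
    for rt, rr in zip(test, reference):
        for t, r in zip(rt, rr):
            if r and t:
                tp += 1
            elif r and not t:
                fn += 1
            elif not r and t:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (invented): 4 reports x 2 conditions per rater.
phys_a = [[1, 0], [0, 0], [1, 1], [0, 0]]
phys_b = [[1, 0], [0, 1], [1, 1], [0, 0]]
nlp    = [[1, 0], [0, 0], [0, 1], [0, 0]]

d_ab = distance(phys_a, phys_b)
d_mean = mean_pairwise_distance([phys_a, phys_b, nlp])
sens, spec = sensitivity_specificity(nlp, phys_a)
```

With six conditions per report, the maximum possible value of `distance` is 6, which matches the scale quoted in the Results.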
Pages: 681-688
Page count: 8