Analysis of Underlying Causes of Inter-expert Disagreement in Retinopathy of Prematurity Diagnosis: Application of Machine Learning Principles

Cited by: 23
Authors
Ataer-Cansizoglu, E. [1]
Kalpathy-Cramer, J. [2]
You, S. [1]
Keck, K. [3]
Erdogmus, D. [1]
Chiang, M. F. [3]
Affiliations
[1] Northeastern Univ, Cognit Syst Lab, Boston, MA 02115 USA
[2] Massachusetts Gen Hosp, Dept Radiol, Athinoula A Martinos Ctr Biomed Imaging, Charlestown, MA USA
[3] Oregon Hlth & Sci Univ, Dept Ophthalmol & Med Informat & Clin Epidemiol, Portland, OR 97201 USA
Keywords
Inter-expert disagreement; feature selection; retinopathy of prematurity; kernel density estimation; plus disease diagnosis; mutual information; image analysis; agreement
DOI
10.3414/ME13-01-0081
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Objective: Inter-expert variability in image-based clinical diagnosis has been demonstrated in many diseases, including retinopathy of prematurity (ROP), a disease that affects low-birth-weight infants and is a major cause of childhood blindness. To better understand the underlying causes of variability among experts, we propose a method to quantify the variability of expert decisions and analyze the relationship between expert diagnoses and features computed from the images. Identification of these features is relevant for development of computer-based decision support systems and educational systems in ROP, and these methods may be applicable to other diseases where inter-expert variability is observed.
Methods: The experiments were carried out on a dataset of 34 retinal images, each with diagnoses provided independently by 22 experts. Analysis was performed using the concepts of Mutual Information (MI) and Kernel Density Estimation. A total of 66 structural features were extracted from the retinal images. Feature selection was used to identify the features that correlated most strongly with the actual clinical decisions of the 22 study experts. The best three features for each observer were selected by an exhaustive search over all possible feature subsets, with joint MI as the relevance criterion. We also compared our results with those of Cohen's Kappa [36] as an inter-rater reliability measure.
Results: The results demonstrate that a group of observers (17 of 22) decide consistently with each other. The mean and second central moment of arteriolar tortuosity are among the reasons for disagreement between this group and the rest of the observers, meaning that the experts in this group consider the amount of tortuosity as well as the variation of tortuosity in the image.
Conclusion: Given a set of image-based features, the proposed analysis method can identify critical image-based features that lead to expert agreement and disagreement in the diagnosis of ROP. Although tree-based features and statistics such as central moments are not widely used in the literature, our results suggest that they are important for diagnosis.
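The feature-selection step described in the abstract (an exhaustive search over three-feature subsets, scored by joint MI with one observer's diagnoses) can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes the features have already been discretized and uses a simple plug-in (count-based) MI estimate in place of the kernel density estimation named in the abstract. The function names `joint_mutual_information` and `best_feature_subset`, and the dictionary layout of `features`, are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_mutual_information(feature_columns, labels):
    """Plug-in estimate, in bits, of I(F; Y), where F is the tuple of the
    given discrete feature columns and Y is one observer's diagnoses.

    Assumption: features are discrete; the abstract instead mentions
    kernel density estimation for the underlying densities.
    """
    n = len(labels)
    joint_keys = list(zip(*feature_columns))  # one tuple of feature values per image
    count_f = Counter(joint_keys)
    count_y = Counter(labels)
    count_fy = Counter(zip(joint_keys, labels))
    mi = 0.0
    for (f, y), c in count_fy.items():
        # p(f, y) * log2( p(f, y) / (p(f) * p(y)) ), written in terms of counts
        mi += (c / n) * log2(c * n / (count_f[f] * count_y[y]))
    return mi


def best_feature_subset(features, labels, subset_size=3):
    """Score every subset of `subset_size` features by joint MI and return
    the best (subset, score). Ties go to the first subset in sorted order."""
    best_subset, best_score = None, float("-inf")
    for subset in combinations(sorted(features), subset_size):
        columns = [features[name] for name in subset]
        score = joint_mutual_information(columns, labels)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score
```

With 66 features and subsets of size three, the search evaluates 45,760 candidate subsets per observer, which is small enough to enumerate directly.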
Pages: 93-102
Page count: 10
Related papers
37 in total
[1]  
[Anonymous], 1973, Cartographica: The International Journal for Geographic Information and Geovisualization, DOI 10.3138/FM57-6770-U75U-7727
[2]  
[Anonymous], 2002, P NIPS
[3]  
Ataer-Cansizoglu E., 2012, IEEE INT WORKSH MACH, P1
[4]   Interexpert agreement of plus disease diagnosis in retinopathy of prematurity [J].
Chiang, Michael F. ;
Jiang, Lei ;
Gelman, Rony ;
Du, Yunling E. ;
Flynn, John T. .
ARCHIVES OF OPHTHALMOLOGY, 2007, 125 (07) :875-880
[5]   A COEFFICIENT OF AGREEMENT FOR NOMINAL SCALES [J].
COHEN, J .
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1960, 20 (01) :37-46
[6]  
de Boor C, 2001, PRACTICAL GUIDE SPLI
[7]   A BIBLIOGRAPHY OF PUBLICATIONS ON OBSERVER VARIABILITY (FINAL INSTALLMENT) [J].
ELMORE, JG ;
FEINSTEIN, AR .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 1992, 45 (06) :567-580
[8]   The more eyes, the better to see? From double to quadruple reading of screening mammograms [J].
Elmore, Joann G. ;
Brenner, R. James .
JOURNAL OF THE NATIONAL CANCER INSTITUTE, 2007, 99 (15) :1141-1143
[9]   USE OF THE AVERAGE MUTUAL INFORMATION INDEX IN EVALUATING CLASSIFICATION ERROR AND CONSISTENCY [J].
FINN, JT .
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SYSTEMS, 1993, 7 (04) :349-366
[10]  
GARNER A, 1984, ARCH OPHTHALMOL-CHIC, V102, P1130