Evaluation of a Deep Learning System For Identifying Glaucomatous Optic Neuropathy Based on Color Fundus Photographs

Cited by: 37
Authors
Al-Aswad, Lama A. [1 ]
Kapoor, Rahul [1 ]
Chu, Chia Kai [1 ]
Walters, Stephen [1 ]
Gong, Dan [1 ]
Garg, Aakriti [1 ]
Gopal, Kalashree [1 ]
Patel, Vipul [1 ]
Sameer, Trikha [2 ,3 ]
Rogers, Thomas W. [2 ]
Nicolas, Jaccard [2 ]
De Moraes, Gustavo C. [1 ]
Moazami, Golnaz [1 ]
Affiliations
[1] Columbia Univ, Harkness Eye Inst, Med Ctr, New York, NY USA
[2] Visulytix Ltd, London, England
[3] Kings Coll Hosp NHS Fdn Trust, London, England
Keywords
Artificial Intelligence (AI); deep learning; optic neuropathy; online retinal fundus image database for glaucoma analysis and research (ORIGA); Singapore Malay Eye Study (SiMES)
DOI
10.1097/IJG.0000000000001319
Chinese Library Classification number
R77 [Ophthalmology]
Discipline classification code
100212 [Ophthalmology]
Abstract
Precis: Pegasus outperformed 5 of the 6 ophthalmologists in terms of diagnostic performance, and there was no statistically significant difference between the deep learning system and the "best case" consensus between the ophthalmologists. The agreement between Pegasus and the gold standard was 0.715, whereas the highest ophthalmologist agreement with the gold standard was 0.613. Furthermore, the high sensitivity of Pegasus makes it a valuable tool for screening patients with glaucomatous optic neuropathy.

Purpose: The purpose of this study was to evaluate the performance of a deep learning system for the identification of glaucomatous optic neuropathy.

Materials and Methods: Six ophthalmologists and the deep learning system, Pegasus, graded 110 color fundus photographs in this retrospective single-center study. Patient images were randomly sampled from the Singapore Malay Eye Study. Ophthalmologists and Pegasus were compared with each other and with the original clinical diagnosis given by the Singapore Malay Eye Study, which was defined as the gold standard. Pegasus' performance was compared with the "best case" consensus scenario, which was the combination of ophthalmologists whose consensus opinion most closely matched the gold standard. The performance of the ophthalmologists and Pegasus at the binary classification of nonglaucoma versus glaucoma from fundus photographs was assessed in terms of sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC), and the intraobserver and interobserver agreements were determined.

Results: Pegasus achieved an AUROC of 92.6% compared with ophthalmologist AUROCs that ranged from 69.6% to 84.9% and the "best case" consensus scenario AUROC of 89.1%. Pegasus had a sensitivity of 83.7% and a specificity of 88.2%, whereas the ophthalmologists' sensitivity ranged from 61.3% to 81.6% and specificity ranged from 80.0% to 94.1%. The agreement between Pegasus and the gold standard was 0.715, whereas the highest ophthalmologist agreement with the gold standard was 0.613. Intraobserver agreement ranged from 0.62 to 0.97 for ophthalmologists and was perfect (1.00) for Pegasus. The deep learning system took approximately 10% of the time of the ophthalmologists in determining classification.

Conclusions: Pegasus outperformed 5 of the 6 ophthalmologists in terms of diagnostic performance, and there was no statistically significant difference between the deep learning system and the "best case" consensus between the ophthalmologists. The high sensitivity of Pegasus makes it a valuable tool for screening patients with glaucomatous optic neuropathy. Future work will extend this study to a larger sample of patients.
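The metrics named in the abstract (sensitivity, specificity, and AUROC) can be illustrated with a minimal Python sketch. This is not the study's analysis code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all hypothetical and chosen only to make the arithmetic easy to follow. AUROC is computed here with the pairwise (Mann-Whitney) formulation, which is one standard way to obtain it.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = glaucoma)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as half a win (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

sens, spec = sensitivity_specificity(toy_truth, toy_pred)
area = auroc(toy_truth, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUROC={area:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of figure.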
Pages: 1029-1034
Page count: 6