Deep Learning in Mammography: Diagnostic Accuracy of a Multipurpose Image Analysis Software in the Detection of Breast Cancer

Cited by: 281
Authors
Becker, Anton S. [1]
Marcon, Magda [1]
Ghafoor, Soleen [1]
Wurnig, Moritz C. [1]
Frauenfelder, Thomas [1]
Boss, Andreas [1]
Affiliations
[1] Univ Hosp Zurich, Inst Diagnost & Intervent Radiol, Raemistr 100, CH-8091 Zurich, Switzerland
Keywords
mammography; breast cancer; artificial neural network; artificial intelligence; machine learning; deep learning; diagnostic accuracy; BREAST; MODEL
DOI
10.1097/RLI.0000000000000358
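Chinese Library Classification number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification code
100231 [Clinical pathology]; 100902 [Aerospace medicine]
Abstract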
Objectives: The aim of this study was to evaluate the diagnostic accuracy of a multipurpose image analysis software based on deep learning with artificial neural networks for the detection of breast cancer in an independent, dual-center mammography data set.
Materials and Methods: In this retrospective, Health Insurance Portability and Accountability Act-compliant study, all patients undergoing mammography in 2012 at our institution were reviewed (n = 3228). All of their prior and follow-up mammographies from a time span of 7 years (2008-2015) were considered as a reference for clinical diagnosis. After applying exclusion criteria (missing reference standard, prior procedures or therapies), patients with the first diagnosis of a malignoma or borderline lesion were selected (n = 143). Histology or clinical long-term follow-up served as reference standard. In a first step, a breast density- and age-matched control cohort was selected (n = 143) from the remaining patients with more than 2 years follow-up (n = 1003). The neural network was trained with this data set. From the publicly available Breast Cancer Digital Repository data set, patients with cancer and a matched control cohort were selected (n = 35 x 2). The performance of the trained neural network was also tested with this external data set. Three radiologists (3, 5, and 10 years of experience) evaluated the test data set. In a second step, the neural network was trained with all cases from January to September and tested with cases from October to December 2012 (screening-like cohort). The radiologists also evaluated this second test data set. The areas under the receiver operating characteristic curve between readers and the neural network were compared. A Bonferroni-corrected P value of less than 0.016 was considered statistically significant.
Results: Mean age of patients with lesion was 59.6 years (range, 35-88 years) and in controls, 59.1 years (35-83 years). Breast density distribution (A/B/C/D) was 21/59/42/21 and 22/60/41/20, respectively. Histologic diagnoses were invasive ductal carcinoma in 90, ductal in situ carcinoma in 13, invasive lobular carcinoma in 13, mucinous carcinoma in 3, and borderline lesion in 12 patients. In the first step, the area under the receiver operating characteristic curve of the trained neural network was 0.81 and comparable on the test cases, 0.79 (P = 0.63). One of the radiologists showed almost equal performance (0.83, P = 0.17), whereas 2 were significantly better (0.91 and 0.94, P < 0.016). In the second step, performance of the neural network (0.82) was not significantly different from the human performance (0.77-0.87, P > 0.016); however, radiologists were consistently less sensitive and more specific than the neural network.
Conclusions: Current state-of-the-art artificial neural networks for general image analysis are able to detect cancer in mammographies with similar accuracy to radiologists, even in a screening-like cohort with low breast cancer prevalence.
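The two statistical quantities named in the abstract, the Bonferroni-corrected significance threshold and the area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: that the threshold of 0.016 comes from dividing a family-wise alpha of 0.05 by 3 comparisons (one per radiologist), and that the AUC is computed with the rank-based (Mann-Whitney) estimate. The function names and the toy scores are hypothetical. The sketch does not reproduce the study's test for comparing curves.

```python
from itertools import product


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level under Bonferroni correction."""
    return alpha / n_comparisons


def auc_mann_whitney(scores_pos, scores_neg) -> float:
    """Rank-based AUC estimate.

    Fraction of (positive, negative) pairs in which the positive case
    receives the higher score; tied pairs count as one half.
    """
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Assumption: alpha = 0.05 shared across 3 reader comparisons.
threshold = bonferroni_threshold(0.05, 3)  # about 0.0167, reported as 0.016

# Hypothetical classifier outputs for 3 cancer cases and 3 controls.
toy_auc = auc_mann_whitney([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])  # 8 of 9 pairs ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of lesions from controls, which is the scale on which the reported values (0.77 to 0.94) should be read.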
Pages: 434-440
Page count: 7