Learning from unbalanced data: A cascade-based approach for detecting clustered microcalcifications

Cited by: 67
Authors
Bria, A. [1]
Karssemeijer, N. [2]
Tortorella, F. [1]
Affiliations
[1] Univ Cassino, Dept Elect & Informat Engn, I-03043 Cassino, FR, Italy
[2] Radboud Univ Nijmegen, Med Ctr, Diagnost Image Anal Grp, NL-6500 HC Nijmegen, Netherlands
Keywords
Computer aided detection; Unbalanced data; Clustered microcalcifications; Mammography; COMPUTER-AIDED DETECTION; AUTOMATIC DETECTION; VECTOR MACHINE; CLASSIFICATION; SEGMENTATION
DOI
10.1016/j.media.2013.10.014
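Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
140502 [Artificial intelligence]
Abstract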
Finding abnormalities in diagnostic images is a difficult task even for expert radiologists, because normal tissue locations largely outnumber those with suspicious signs, which may thus be missed or incorrectly interpreted. For the same reason, the design of a Computer-Aided Detection (CADe) system is very complex, because the large predominance of normal samples in the training data may hamper the ability of the classifier to recognize abnormalities in the images. In this paper we present a novel approach for computer-aided detection that addresses the class imbalance with a cascade of boosting classifiers, where each node is trained by a learning algorithm based on ranking instead of classification error. This approach is used to design a system (CasCADe) for the automated detection of clustered microcalcifications (μCs), which is a severely unbalanced classification problem because no μC is present in the vast majority of image locations. The proposed approach was evaluated on a dataset of 1599 full-field digital mammograms from 560 cases and compared favorably with the Hologic R2CAD ImageChecker, one of the most widespread commercial CADe systems. In particular, at the same lesion sensitivity as R2CAD (90%) on biopsy-proven malignant cases, CasCADe and R2CAD detected 0.13 and 0.21 false positives per image (FPpi), respectively (p-value = 0.09), whereas at the same FPpi as R2CAD (0.21), CasCADe and R2CAD detected 93% and 90% of true lesions, respectively (p-value = 0.11), thus showing that CasCADe can compete with high-end commercial CADe systems. © 2013 Elsevier B.V. All rights reserved.
Pages: 241-252
Page count: 12