Automated Classification of Circulating Tumor Cells and the Impact of Interobserver Variability on Classifier Training and Performance

Cited by: 36
Authors
Svensson, Carl-Magnus [1]
Huebler, Ron [1,2]
Figge, Marc Thilo [1,2]
Affiliations
[1] HKI, Leibniz Inst Nat Prod Res & Infect Biol, Appl Syst Biol, D-07745 Jena, Germany
[2] Univ Jena, D-07743 Jena, Germany
Keywords
PROGRESSION-FREE; CANCER; BLOOD; EXPRESSION; DIAGNOSIS; NETWORKS; SURVIVAL; BIOLOGY
DOI
10.1155/2015/573165
Chinese Library Classification number
R392 [Medical immunology]; Q939.91 [Immunology]
Discipline classification code
100102
Abstract
Application of personalized medicine requires integration of different data to determine each patient's unique clinical constitution. The automated analysis of medical data is a growing field in which different machine learning techniques are used to minimize the time-consuming task of manual analysis. The evaluation, and often the training, of automated classifiers requires manually labelled data as ground truth. In many cases such labelling is not perfect, either because the data are ambiguous even for a trained expert or because of mistakes. Here we investigated the interobserver variability of image data comprising fluorescently stained circulating tumor cells and its effect on the performance of two automated classifiers, a random forest and a support vector machine. We found that uncertainty in annotation between observers limited the performance of the automated classifiers, especially when it was included in the test set on which classifier performance was measured. The random forest classifier turned out to be resilient to uncertainty in the training data, while the support vector machine's performance was highly dependent on the amount of uncertainty in the training data. Finally, we introduced the consensus data set as a possible solution for the evaluation of automated classifiers that minimizes the penalty of interobserver variability.
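The idea of a consensus data set can be illustrated with a minimal sketch. The abstract does not define how the consensus set is built, so the code below assumes it contains only the items that every observer labelled identically. The function names, observer names, and annotations are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_agreement(labels_by_observer):
    """Fraction of items on which each pair of observers gives the same label."""
    agreement = {}
    for (name_a, a), (name_b, b) in combinations(labels_by_observer.items(), 2):
        matches = sum(x == y for x, y in zip(a, b))
        agreement[(name_a, name_b)] = matches / len(a)
    return agreement


def consensus_subset(labels_by_observer):
    """Return {item_index: label} for items that all observers labelled identically.

    Assumption: "consensus" means unanimous agreement among the observers.
    """
    columns = zip(*labels_by_observer.values())
    return {i: votes[0] for i, votes in enumerate(columns) if len(set(votes)) == 1}


# Hypothetical annotations: 1 = tumor cell, 0 = not a tumor cell.
annotations = {
    "observer_a": [1, 0, 1, 1, 0, 0],
    "observer_b": [1, 0, 0, 1, 0, 1],
    "observer_c": [1, 0, 1, 1, 0, 0],
}

agreement = pairwise_agreement(annotations)
consensus = consensus_subset(annotations)
```

Under this assumption, a classifier evaluated only on the items in `consensus` would not be penalized for disagreeing with labels that the observers themselves could not agree on.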
Pages: 9