Proteomic mass spectra classification using decision tree based ensemble methods

Cited by: 101
Authors
Geurts, P [1]
Fillet, M
de Seny, D
Meuwis, MA
Malaise, M
Merville, MP
Wehenkel, L
Affiliations
[1] Univ Liege, Dept Elect Engn & Comp Sci, B-4000 Liege, Belgium
[2] Univ Liege, Lab Clin Chem & Rheumatol, CBIG, B-4000 Liege, Belgium
DOI
10.1093/bioinformatics/bti494
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids such as serum, saliva or urine. These measurements can be used in many medical applications to diagnose the current state of a disease or predict its evolution. Recent developments in machine learning make it possible to exploit such datasets, which are characterized by small numbers of very high-dimensional samples.
Results: We propose a systematic approach based on decision tree ensemble methods, which is used to automatically determine proteomic biomarkers and predictive models. The approach is validated on two datasets of surface-enhanced laser desorption/ionization time-of-flight measurements, for the diagnosis of rheumatoid arthritis and inflammatory bowel diseases. The results suggest that the methodology can handle a broad class of similar problems.
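As an illustration only, the general idea of using a tree ensemble to rank candidate biomarkers on a small, high-dimensional dataset might be sketched as follows. This is not the authors' implementation: it assumes depth-one trees (stumps) grown on bootstrap samples with random feature subsets, a Gini-based importance score, and purely synthetic data in which one feature is made informative. The names `best_split`, `rank_features` and `make_data`, and all parameter values, are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(X, y, features):
    """Best single-threshold split over the candidate features.

    Returns (feature, impurity_reduction); feature is None if no split helps.
    """
    n = len(y)
    parent = gini(y)
    best_feature, best_gain = None, 0.0
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between consecutive observed values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best_gain:
                best_feature, best_gain = f, gain
    return best_feature, best_gain

def rank_features(X, y, n_trees, k, rng):
    """Rank features by total impurity reduction over an ensemble of stumps."""
    n, d = len(X), len(X[0])
    importance = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        # Each stump only sees a random subset of k features.
        feature, gain = best_split(Xb, yb, rng.sample(range(d), k))
        if feature is not None:
            importance[feature] += gain
    return sorted(range(d), key=lambda f: importance[f], reverse=True)

def make_data(rng, n=30, d=100, informative=7, shift=4.0):
    """Synthetic stand-in for spectra: n samples, d intensity features (assumption)."""
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    for i in range(n):
        X[i][informative] += shift * y[i]  # only this feature carries class signal
    return X, y

rng = random.Random(0)
X, y = make_data(rng)
ranking = rank_features(X, y, n_trees=200, k=10, rng=rng)
print("top-ranked features:", ranking[:5])
```

In this toy setting the features at the top of `ranking` play the role of candidate biomarkers. A real analysis would use full trees, preprocessed spectra and a cross-validation protocol that keeps feature selection inside each fold.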
Pages: 3138-3145
Page count: 8