Bioinformatics in proteomics: application, terminology, and pitfalls

Cited by: 13
Authors
Wiemer, JC [1]
Prokudin, A [1]
Affiliation
[1] Europroteome AG, D-16761 Hennigsdorf, Germany
Keywords
decision trees; bagging; mass spectrometry
DOI
10.1016/j.prp.2004.01.012
Chinese Library Classification number
R36 [Pathology]
Discipline classification code
100104
Abstract
Bioinformatics applies data mining, i.e., modern computer-based statistics, to biomedical data. It draws on machine learning approaches, such as artificial neural networks, decision trees, and clustering algorithms, and is well suited to handling very large amounts of data. In this article, we review the analysis of mass spectrometry data in proteomics, starting with common pre-processing steps and then using single decision trees and decision tree ensembles for classification. Special emphasis is placed on the pitfall of overfitting, i.e., of generating overly complex single decision trees. Finally, we discuss the pros and cons of the two ways of using decision trees. (C) 2004 Elsevier GmbH. All rights reserved.
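The overfitting pitfall named in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: it assumes a synthetic one-feature data set with 20% label noise, and every name in it (`build_tree`, `predict`, `make_data`, `bagged_model`) is hypothetical. It grows a fully expanded tree, a depth-1 tree, and a bagged ensemble of full trees, then prints training and test accuracy for each.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(points, max_depth):
    """Grow a tree on (x, label) pairs with a single feature x.

    Returns a leaf (an int class label) or a tuple
    (threshold, left_subtree, right_subtree).
    """
    labels = [y for _, y in points]
    majority = int(2 * sum(labels) >= len(labels))
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    points = sorted(points)
    best = None
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in points[:i]]
        right = [y for _, y in points[i:]]
        score = len(left) * gini(left) + len(right) * gini(right)
        if best is None or score < best[0]:
            best = (score, i)
    if best is None:
        return majority
    i = best[1]
    # Points with x < threshold go left; the threshold is the first right-hand x.
    return (points[i][0],
            build_tree(points[:i], max_depth - 1),
            build_tree(points[i:], max_depth - 1))

def predict(tree, x):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree

def accuracy(model, points):
    return sum(model(x) == y for x, y in points) / len(points)

def make_data(rng, n, noise=0.2):
    """Assumed toy data: class 1 if x > 0.5, with labels flipped at rate `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def bagged_model(rng, points, n_trees=15):
    """Bagging: full trees on bootstrap resamples, combined by majority vote."""
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]
        trees.append(build_tree(sample, max_depth=len(sample)))
    return lambda x: int(2 * sum(predict(t, x) for t in trees) > len(trees))

rng = random.Random(0)
train, test = make_data(rng, 150), make_data(rng, 1000)

full = build_tree(train, max_depth=len(train))  # depth effectively unlimited
stump = build_tree(train, max_depth=1)
models = {
    "full tree": lambda x: predict(full, x),
    "depth-1 tree": lambda x: predict(stump, x),
    "bagged full trees": bagged_model(rng, train),
}
results = {name: (accuracy(m, train), accuracy(m, test)) for name, m in models.items()}
for name, (tr, te) in results.items():
    print(f"{name}: train accuracy {tr:.2f}, test accuracy {te:.2f}")
```

Because the fully grown tree keeps splitting until every leaf is pure, it reproduces the noisy training labels exactly. The gap between its training and test accuracy is the signature of overfitting in this toy setting; real mass spectrometry data would have many features and would need the pre-processing steps the article reviews.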
Pages: 173-178
Page count: 6