Identification of biomarkers from mass spectrometry data using a "common" peak approach

Cited by: 28
Authors
Fushiki, Tadayoshi [1]
Fujisawa, Hironori [1]
Eguchi, Shinto [1]
Affiliations
[1] Inst Stat Math, Dept Math Anal & Stat Inference, Tokyo 106, Japan
DOI
10.1186/1471-2105-7-358
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Proteomic data obtained from mass spectrometry have attracted great interest for the detection of early-stage cancer. However, as mass spectrometry data are high-dimensional, identification of biomarkers is a key problem. Results: This paper proposes the use of "common" peaks in data as biomarkers. Analysis is conducted as follows: data preprocessing, identification of biomarkers, and application of AdaBoost to construct a classification function. Informative "common" peaks are selected by AdaBoost. AsymBoost is also examined to balance false negatives and false positives. The effectiveness of the approach is demonstrated using an ovarian cancer dataset. Conclusion: Continuous covariates and discrete covariates can be used in the present approach. The difference between the result for the continuous covariates and that for the discrete covariates was investigated in detail. In the example considered here, both covariates provide a good prediction, but it seems that they provide different kinds of information. We can obtain more information on the structure of the data by integrating both results.
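The classification step named in the abstract can be illustrated with a minimal sketch of generic discrete AdaBoost over binary "peak present / peak absent" indicators. This is not the authors' implementation: the function names `fit_adaboost` and `predict`, the single-feature weak learner, and the toy data are all assumptions made for illustration, and preprocessing, common-peak detection, and AsymBoost are not covered.

```python
import math

def fit_adaboost(X, y, rounds=5):
    """Discrete AdaBoost with one-feature stumps.

    X: list of rows of 0/1 indicators (hypothetical: 1 = common peak present).
    y: list of labels in {-1, +1}.
    Returns a list of (alpha, feature_index, polarity) tuples.
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        best = None
        # Pick the (feature, polarity) stump with the lowest weighted error.
        for j in range(d):
            for s in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if s * (2 * xi[j] - 1) != yi)
                if best is None or err < best[0]:
                    best = (err, j, s)
        err, j, s = best
        # Clip the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, s))
        # Up-weight misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * s * (2 * xi[j] - 1))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the weighted vote of all selected stumps."""
    score = sum(alpha * s * (2 * x[j] - 1) for alpha, j, s in model)
    return 1 if score >= 0 else -1

# Toy data (made up): the first indicator separates the two classes.
X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
y = [1, 1, 1, -1, -1, -1]
model = fit_adaboost(X, y, rounds=3)
selected = sorted({j for _, j, _ in model})  # indices of the features chosen
```

In a sketch like this, the features picked across boosting rounds (`selected`) play the role of the informative peaks, which is one way to read the abstract's statement that informative "common" peaks are selected by AdaBoost.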
Pages: 9