Megavariate data analysis of mass spectrometric proteomics data using latent variable projection method

Cited by: 62
Authors
Lee, KR [1]
Lin, XW [1]
Park, DC [1]
Eslava, S [1]
Affiliations
[1] GlaxoSmithKline Pharmaceut, Data Explorat Sci, Collegeville, PA 19426 USA
Keywords
latent variable projection method; mass spectrometry; megavariate data analysis; partial least squares; principal component analysis
DOI
10.1002/pmic.200300515
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Many data mining techniques are available for processing and learning from multivariate data. However, we believe that the wavelet transformation and latent variable projection methods are particularly useful for spectroscopic and chromatographic data. Projection-based methods are designed to handle the highly multivariate nature of such data effectively. We used latent variable projection methods, namely principal component analysis (PCA) and partial least squares projection to latent structures based discriminant analysis (PLS-DA), to analyze the raw data presented to the participants of the First Duke Proteomics Data Mining Conference. PCA was used to solve problem #1 (the clustering problem) and PLS-DA was used to solve problem #2 (the classification problem). Internal and external cross-validation were used to validate the model obtained from the classification analysis. The simple two-component PLS-DA model obtained from the analysis performed well: fitted to all the data, it completely separated the two groups. The same model fitted to two-thirds of the data also performed well in external validation against an independent test set of the remaining 13 specimens, obtained by setting aside the spectra of every third specimen (accuracy of 85%).
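The workflow described in the abstract (PCA scores for exploring clusters, then a two-component PLS-DA model validated on every third specimen) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic, the specimen count of 39 is chosen only so that every third specimen yields 13 test cases, and the helper names `pca_scores`, `pls_fit`, and `pls_da_predict`, the 0/1 class coding, and the 0.5 decision threshold are all assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project mean-centered rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def pls_fit(X, y, n_components=2):
    """Fit a single-response PLS model with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                 # weight vector: covariance of X with y
        w /= np.linalg.norm(w)
        t = Xr @ w                    # latent variable (score) for this component
        tt = t @ t
        p = Xr.T @ t / tt             # X loading
        c = yr @ t / tt               # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in the original space
    return x_mean, y_mean, coef

def pls_da_predict(model, X, threshold=0.5):
    """Classify by thresholding the PLS prediction of the 0/1 class indicator."""
    x_mean, y_mean, coef = model
    return ((X - x_mean) @ coef + y_mean > threshold).astype(int)

# Synthetic stand-in for spectra: 39 specimens, 200 intensity variables (assumed sizes).
rng = np.random.default_rng(0)
n, m = 39, 200
y = (np.arange(n) % 2).astype(float)          # hypothetical two-group labels
X = rng.normal(size=(n, m))
X[:, :20] += 3.0 * y[:, None]                 # group difference in the first 20 variables

scores = pca_scores(X)                        # unsupervised view of the specimens

test_idx = np.arange(2, n, 3)                 # set aside every third specimen
train_idx = np.setdiff1d(np.arange(n), test_idx)
model = pls_fit(X[train_idx], y[train_idx], n_components=2)
pred = pls_da_predict(model, X[test_idx])
accuracy = float((pred == y[test_idx]).mean())
print(f"external test accuracy on {len(test_idx)} specimens: {accuracy:.2f}")
```

Holding out every third specimen, rather than a random subset, keeps the split deterministic and mirrors the external validation scheme the abstract mentions; the accuracy printed here reflects only the synthetic data and says nothing about the published result.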
Pages: 1680-1686
Number of pages: 7