Application of peptide LC retention time information in a discriminant function for peptide identification by tandem mass spectrometry

Cited by: 125
Authors
Strittmatter, EF
Kangas, LJ
Petritis, K
Mottaz, HM
Anderson, GA
Shen, YF
Jacobs, JM
Camp, DG
Smith, RD
Affiliations
[1] Pacific NW Natl Lab, Div Biol Sci, Richland, WA 99352 USA
[2] Pacific NW Natl Lab, Environm & Mol Sci Lab, Richland, WA 99352 USA
Keywords
bioinformatics; proteome; algorithm; accurate mass and time tag; multivariate statistics; capillary liquid chromatography; retention time; FTICR
DOI
10.1021/pr049965y
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
We describe the application of a previously reported model for predicting peptide retention time in reversed-phase liquid chromatography (RPLC) (Petritis et al. Anal. Chem. 2003, 75, 1039) to improve peptide identification. The model uses peptide sequence information to generate a theoretical (predicted) elution time that can be compared with the observed elution time. Using data from a set of known proteins, the retention time parameter was incorporated into a discriminant function for use with tandem mass spectrometry (MS/MS) data analyzed with the peptide/protein identification program SEQUEST. For singly charged ions, the number of confident identifications increased by 12% when the elution time metric was included, compared with using mass spectral data as the sole source of information, in the context of a Drosophila melanogaster database. A 3-4% improvement was obtained for doubly and triply charged ions for the same biological system. Application to the larger Rattus norvegicus (rat) and human proteome databases resulted in an 8-9% overall increase in the number of confident identifications when both the discriminant function and elution time were used. The effect of adding "runner-up" hits (peptide matches that are not the highest scoring for a spectrum) from SEQUEST is also explored; the number of confident identifications increased by a further 1% when these hits were also considered. Finally, application of the discriminant functions derived in this work to ~2.2 million spectra from over three hundred LC-MS/MS analyses of peptides from human plasma protein resulted in a 16% increase in confident peptide identifications (9022 vs 7779) using elution time information. Further improvements from the use of elution time information can be expected as both the experimental control of elution time reproducibility and the predictive capability are improved.
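The idea in the abstract, adding an elution time term to a score-based discriminant, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the choice of the SEQUEST scores XCorr and DeltaCn as the mass spectral features, the use of elution times normalized to a 0-1 scale, the linear form, and the weights, bias, and threshold (all hypothetical). The only figures taken from the abstract are the identification counts 9022 and 7779.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One peptide-spectrum match (field selection is an assumption)."""
    peptide: str
    xcorr: float          # SEQUEST cross-correlation score
    delta_cn: float       # SEQUEST normalized score difference
    predicted_net: float  # predicted elution time, normalized to 0..1
    observed_net: float   # observed elution time, normalized to 0..1


# Hypothetical weights and bias, chosen only so the example is readable.
# The paper derives its discriminant functions from known-protein data.
W_XCORR = 1.0
W_DELTA_CN = 5.0
W_NET_ERROR = -10.0
BIAS = -3.0


def net_error(c: Candidate) -> float:
    """Absolute difference between predicted and observed elution time."""
    return abs(c.predicted_net - c.observed_net)


def discriminant_score(c: Candidate) -> float:
    """Linear combination of mass spectral scores and elution time error."""
    return (BIAS
            + W_XCORR * c.xcorr
            + W_DELTA_CN * c.delta_cn
            + W_NET_ERROR * net_error(c))


def is_confident(c: Candidate, threshold: float = 0.0) -> bool:
    """Accept the match when the score exceeds a (hypothetical) threshold."""
    return discriminant_score(c) > threshold


def percent_increase(before: int, after: int) -> float:
    """Relative increase in identifications, in percent."""
    return 100.0 * (after - before) / before


# Two matches with identical mass spectral scores; only elution time differs.
close_match = Candidate("PEPTIDEK", 2.5, 0.2, predicted_net=0.40, observed_net=0.42)
far_match = Candidate("PEPTIDEK", 2.5, 0.2, predicted_net=0.40, observed_net=0.80)

print(is_confident(close_match), is_confident(far_match))
print(round(percent_increase(7779, 9022)))  # plasma counts from the abstract
```

With these illustrative weights, the match whose observed elution time agrees with the prediction is accepted and the one that elutes far from the prediction is rejected, even though their mass spectral scores are identical. The last line reproduces the 16% figure quoted for the human plasma data.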
Pages: 760-769
Page count: 10