Probability model for assessing proteins assembled from peptide sequences inferred from tandem mass spectrometry data

Cited by: 55
Authors
Feng, Jian
Naiman, Daniel Q.
Cooper, Bret
Affiliations
[1] ARS, Soybean Genom & Improvement Lab, USDA, Beltsville, MD 20705 USA
[2] Johns Hopkins Univ, Dept Appl Math & Stat, Baltimore, MD 21218 USA
DOI
10.1021/ac070202e
Chinese Library Classification number
O65 [Analytical chemistry]
Discipline classification codes
070302; 081704
Abstract
In shotgun proteomics, tandem mass spectrometry is used to identify peptides derived from proteins. After the peptides are detected, proteins are reassembled via a reference database of protein or gene information. Redundancy and homology between protein records in databases make it challenging to assign peptides to proteins that may or may not be in an experimental sample. Here, a probability model is introduced for determining the likelihood that peptides are correctly assigned to proteins. This model derives consistent probability estimates for assembled proteins. The probability scores make it easier to confidently identify proteins in complex samples and to accurately estimate false-positive rates. The algorithm based on this model is robust in creating protein complements from peptides from bovine protein standards, yeast, Ustilago maydis cell lysates, and Arabidopsis thaliana leaves. It also eliminates the side effects of redundancy and homology from the reference databases by employing a new concept of peptide grouping and by coherently distinguishing distinct peptides from unique records and shared peptides from homologous proteins. The software that runs the algorithm, called PANORAMICS, provides a tool to help analyze the data based on a researcher's knowledge about the sample. The software operates efficiently and quickly compared to other software platforms.
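The abstract's distinction between distinct and shared peptides, and its use of protein probabilities to estimate false positives, can be illustrated with a minimal sketch. This is not the PANORAMICS algorithm, whose details the abstract does not give. The function names, the peptide and protein identifiers, and the grouping rule (protein records supported by exactly the same peptide set are treated as indistinguishable) are all assumptions made for illustration. The false-positive estimate uses the standard identity that, for well-calibrated probabilities, the expected number of incorrect identifications is the sum of (1 - p).

```python
from collections import defaultdict


def classify_peptides(peptide_to_proteins):
    """Split peptides into distinct (one protein record) and shared (several).

    `peptide_to_proteins` is a hypothetical mapping from a peptide sequence
    to the set of database records that contain it.
    """
    distinct, shared = {}, {}
    for peptide, proteins in peptide_to_proteins.items():
        target = distinct if len(proteins) == 1 else shared
        target[peptide] = frozenset(proteins)
    return distinct, shared


def group_indistinguishable_proteins(peptide_to_proteins):
    """Group protein records that are supported by exactly the same peptides.

    Assumed grouping rule for illustration only; redundant or homologous
    records with identical peptide evidence end up in one group.
    """
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)

    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)
    return sorted(sorted(members) for members in groups.values())


def expected_false_positives(protein_probabilities):
    """Expected count of incorrect proteins among those accepted."""
    return sum(1.0 - p for p in protein_probabilities)


# Hypothetical search results: peptide -> matching database records.
matches = {
    "PEPA": {"P1"},
    "PEPB": {"P1", "P2"},
    "PEPC": {"P3", "P4"},
    "PEPD": {"P3", "P4"},
}
distinct, shared = classify_peptides(matches)
groups = group_indistinguishable_proteins(matches)
```

In this toy input, P3 and P4 share all of their peptide evidence and fall into one group, while P1 is the only record supported by a distinct peptide.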
Pages: 3901-3911
Page count: 11