Protein identification in complex mixtures

Cited by: 18
Authors
Eriksson, J
Fenyö, D
Affiliations
[1] Swedish Univ Agr Sci, Dept Chem, SE-75007 Uppsala, Sweden
[2] GE Healthcare, Piscataway, NJ 08855 USA
[3] Rockefeller Univ, New York, NY 10021 USA
Keywords
protein identification; algorithm; bioinformatics; mass spectrometry; proteomics; protein; protein mixtures; complex mixtures; peptide; peptide mapping; significance testing; simulation;
DOI
10.1021/pr049816f
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
This paper investigates the prospects of successful mass spectrometric protein identification based on mass data from proteolytic digests of complex protein mixtures. Sets of proteolytic peptide masses representing various numbers of digested proteins in a mixture were generated in silico. In each set, different proteins were selected from a protein sequence collection, and for each protein the sequence coverage was randomly selected within a particular regime (15-30% or 30-60%). We demonstrate that the Probity algorithm, which is characterized by an optimal tolerance for random interference, can, when employed in an iterative procedure, correctly identify >95% of proteins at a desired significance level in mixtures composed of hundreds of yeast proteins under realistic mass spectrometric experimental constraints. Using a model of the distribution of protein abundance, we demonstrate that the very high identification efficiency achievable for protein mixtures through appropriate choices of informatics procedures is hampered by limitations of the mass spectrometric dynamic range. The results stress the need to choose experimental protocols for comprehensive proteome analysis carefully, focusing on truly critical issues such as the dynamic range, which potentially limits the possibilities of identifying low-abundance proteins.
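The in silico step described in the abstract (digesting proteins and keeping a random fraction of each protein's peptides within a coverage regime) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: trypsin is assumed as the protease, cleavage is assumed complete, and the function names (`tryptic_peptides`, `peptide_mass`, `sample_observed_masses`) and the peptide-by-peptide coverage sampling are hypothetical choices. The abstract does not specify these details.

```python
import random
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_peptides(sequence):
    """Assumed protease: trypsin. Cleave after K or R unless followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def sample_observed_masses(sequence, low, high, rng):
    """Keep random peptides until a coverage target drawn from [low, high] is met.

    Hypothetical sampling scheme: whole peptides are added one at a time, so
    the achieved coverage can overshoot the target. The sequence must be
    non-empty.
    """
    target = rng.uniform(low, high)
    peptides = tryptic_peptides(sequence)
    rng.shuffle(peptides)
    chosen, covered = [], 0
    for pep in peptides:
        if covered / len(sequence) >= target:
            break
        chosen.append(pep)
        covered += len(pep)
    masses = sorted(peptide_mass(p) for p in chosen)
    return masses, covered / len(sequence)


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up toy sequences standing in for entries of a sequence collection.
    proteins = [
        "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK",
        "GSHMLEDPVKAAQWRPLTTEKYGIDSRFNCMKLLPAVER",
    ]
    mixture = []
    for seq in proteins:
        masses, coverage = sample_observed_masses(seq, 0.15, 0.30, rng)
        mixture.extend(masses)
    print(sorted(mixture))
```

The pooled, sorted mass list plays the role of one simulated mixture data set; an identification algorithm would then be run against it. Missed cleavages, modifications, and measurement error are deliberately left out of this sketch.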
Pages: 387-393
Page count: 7