A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes

Cited by: 384
Authors
Fenyö, D [1]
Beavis, RC [1]
Affiliations
[1] Genomic Solut Canada Inc, Winnipeg, MB R3B 1G7, Canada
DOI
10.1021/ac0258709
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
This paper investigates the use of survival functions and expectation values to evaluate the results of protein identification experiments. These functions are standard statistical measures that can be used to reduce various protein identification scoring schemes to a common, easily interpretable representation. The relative merits of scoring systems were explored using this approach, as well as the effects of altering primary identification parameters. We advocate the widespread use of these simple statistical measures to simplify and standardize the reporting of the confidence of protein identification results, allowing the users of different identification algorithms to compare their results in a straightforward and statistically significant manner. A method is described for measuring these distributions using information that is discarded by most protein identification search engines, resulting in accurate survival functions that are specific to any combination of scoring algorithms, sequence databases, and mass spectra.
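As an illustration of the two measures named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes that the scores of all candidate sequences for a spectrum have been retained, that the high-scoring tail of the survival function is roughly linear on a log scale, and that the expectation value is the number of candidates times the extrapolated survival probability. The function names, the fitting window (`lo`, `hi`), and the simulated exponential scores are all hypothetical choices made for this example.

```python
import math
import random
from bisect import bisect_left


def survival_function(scores):
    """Return (score, s) pairs, where s is the fraction of scores >= score."""
    ordered = sorted(scores)
    n = len(ordered)
    return [(x, (n - bisect_left(ordered, x)) / n) for x in ordered]


def fit_log_tail(points, lo, hi):
    """Least-squares line through (score, log10 s) for points with lo <= s <= hi."""
    pairs = [(x, math.log10(s)) for x, s in points if lo <= s <= hi]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def expectation_value(score, n_candidates, slope, intercept):
    """Expected number of random matches scoring at least `score` (assumed model)."""
    return n_candidates * 10 ** (slope * score + intercept)


# Simulated scores standing in for the (normally discarded) scores of
# incorrect candidate sequences; the exponential shape is an assumption.
random.seed(0)
scores = [random.expovariate(0.2) for _ in range(5000)]
points = survival_function(scores)
slope, intercept = fit_log_tail(points, lo=0.01, hi=0.5)
print(expectation_value(60.0, len(scores), slope, intercept))
```

Under these assumptions, a candidate whose score lies far beyond the fitted tail receives a small expectation value, which can be compared across scoring schemes because it no longer depends on the raw score scale.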
Pages: 768-774
Number of pages: 7