Target-Decoy Approach and False Discovery Rate: When Things May Go Wrong

Cited by: 116
Authors
Gupta, Nitin [1]
Bandeira, Nuno [2,4]
Keich, Uri [3]
Pevzner, Pavel A. [1,2]
Affiliations
[1] Univ Calif San Diego, Bioinformat Program, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[3] Univ Sydney, Sch Math & Stat, Sydney, NSW 2006, Australia
[4] Univ Calif San Diego, Skaggs Sch Pharm & Pharmaceut Sci, La Jolla, CA 92093 USA
Funding
National Institutes of Health (USA)
Keywords
Computational proteomics; Target-decoy approach; False discovery rate; False positive rate; Database search; Decoy database; P-value; Tandem mass spectra; Statistical significance; Peptide identification; Protein identifications; Probabilities; Sequences; Strike
DOI
10.1007/s13361-011-0139-3
CLC classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
The target-decoy approach (TDA) has done the field of proteomics a great service by filling in the need to estimate the false discovery rates (FDR) of peptide identifications. While TDA is often viewed as a universal solution to the problem of FDR evaluation, we argue that the time has come to critically re-examine TDA and to acknowledge not only its merits but also its demerits. We demonstrate that some popular MS/MS search tools are not TDA-compliant and that it is easy to develop a non-TDA compliant tool that outperforms all TDA-compliant tools. Since the distinction between TDA-compliant and non-TDA compliant tools remains elusive, we are concerned about a possible proliferation of non-TDA-compliant tools in the future (developed with the best intentions). We are also concerned that estimation of the FDR by TDA awkwardly depends on a virtual coin toss and argue that it is important to take the coin toss factor out of our estimation of the FDR. Since computing FDR via TDA suffers from various restrictions, we argue that TDA is not needed when accurate p-values of individual Peptide-Spectrum Matches are available.
Pages: 1111-1120
Page count: 10