Can the false-discovery rate be misleading?

Cited by: 35
Authors
Barboza, Rodrigo [2]
Cociorva, Daniel [3]
Xu, Tao [3]
Barbosa, Valmir C. [2]
Perales, Jonas [1]
Valente, Richard H. [1]
Franca, Felipe M. G. [2]
Yates, John R., III [3]
Carvalho, Paulo C. [1, 4]
Affiliations
[1] Inst Oswaldo Cruz, Lab Toxinol, Fundacao Oswaldo Cruz FIOCRUZ, BR-21045900 Rio De Janeiro, Brazil
[2] Univ Fed Rio de Janeiro, COPPE, Syst Engn & Comp Sci Program, BR-21945 Rio De Janeiro, Brazil
[3] Scripps Res Inst, Dept Physiol Chem, La Jolla, CA 92037 USA
[4] Fundacao Oswaldo Cruz, Ctr Technol Dev Hlth, Rio De Janeiro, Brazil
Keywords
Bioinformatics; Decoy; False-discovery rate; Overfitting; Protein identification; Shotgun proteomics; PEPTIDE IDENTIFICATION; PROTEIN
DOI
10.1002/pmic.201100297
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
The decoy-database approach is currently the gold standard for assessing the confidence of identifications in shotgun proteomic experiments. Here, we demonstrate that what might appear to be a good result under the decoy-database approach for a given false-discovery rate could be, in fact, the product of overfitting. This problem has been overlooked until now and could lead to obtaining boosted identification numbers whose reliability does not correspond to the expected false-discovery rate. To overcome this, we are introducing a modified version of the method, termed a semi-labeled decoy approach, which enables the statistical determination of an overfitted result.
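As a rough illustration of the decoy-database idea mentioned in the abstract, the sketch below estimates a false-discovery rate from a list of scored identifications, each flagged as a target or decoy hit. It uses the common simple ratio of decoy hits to target hits above a score threshold; this is an assumption made for illustration, not necessarily the estimator used by the authors, and it does not implement the semi-labeled decoy approach, whose details the abstract does not give. The function name `estimate_fdr` and the `(score, is_decoy)` input layout are hypothetical.

```python
def estimate_fdr(psms, threshold):
    """Estimate the FDR among identifications scoring at or above `threshold`.

    `psms` is an iterable of (score, is_decoy) pairs (hypothetical layout).
    The estimate is decoy hits / target hits, one common convention.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        # No accepted target hits, so there is nothing to estimate.
        return 0.0
    return decoys / targets


# Toy data, invented purely for illustration.
example_psms = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, False)]
loose_fdr = estimate_fdr(example_psms, threshold=5.0)   # 1 decoy, 4 targets
strict_fdr = estimate_fdr(example_psms, threshold=8.0)  # 0 decoys, 2 targets
```

If the filtering criteria are tuned against the same decoy hits that are later used to compute this ratio, the estimate can look better than the true error rate, which is the overfitting concern the abstract raises.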
Pages: 4105-4108
Number of pages: 4