Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study

Cited by: 239
Authors
States, DJ
Omenn, GS
Blackwell, TW
Fermin, D
Eng, J
Speicher, DW
Hanash, SM
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Fred Hutchinson Canc Res Ctr, Seattle, WA 98109 USA
[3] Wistar Inst Anat & Biol, Philadelphia, PA 19104 USA
DOI
10.1038/nbt1183
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
The Human Proteome Organization (HUPO) recently completed the first large-scale collaborative study to characterize the human serum and plasma proteomes. The study was carried out in different locations and used diverse methods and instruments to compare and integrate tandem mass spectrometry (MS/MS) data on aliquots of pooled serum and plasma from healthy subjects. Liquid chromatography (LC)-MS/MS data sets from 18 laboratories were matched to the International Protein Index database, and an initial integration exercise resulted in 9,504 proteins identified with one or more peptides, and 3,020 proteins identified with two or more peptides. This article uses a rigorous statistical approach to take into account the length of coding regions in genes, and multiple hypothesis-testing techniques. On this basis, we now present a reduced set of 889 proteins identified with a confidence level of at least 95%. We also discuss the importance of such an integrated analysis in providing an accurate representation of a proteome as well as the value such data sets contain for the high-confidence identification of protein matches to novel exons, some of which may be localized in alternatively spliced forms of known plasma proteins and some in previously nonannotated gene sequences.
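The abstract does not spell out the statistical model, so the following is only a minimal sketch of how protein length and multiple hypothesis testing could enter such an analysis. It assumes (these are illustrative assumptions, not details taken from the record) that random peptide matches fall on database proteins in proportion to their length, that the count for each protein follows a Poisson distribution, and that a Bonferroni correction is applied across all proteins. The function names, the parameter `n_random_hits`, and the toy numbers are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k))
    return max(0.0, 1.0 - cdf)


def confident_proteins(peptide_counts, lengths, n_random_hits, alpha=0.05):
    """Keep proteins whose peptide count is unlikely under random matching.

    peptide_counts: observed matched peptides per protein (hypothetical input)
    lengths:        length of every protein in the searched database
    n_random_hits:  assumed total number of chance peptide matches
    alpha:          family-wise error rate (0.05 mirrors a 95% confidence level)
    """
    total_len = sum(lengths.values())
    n_tests = len(lengths)  # Bonferroni factor: one test per database protein
    accepted = {}
    for name, k in peptide_counts.items():
        # Longer proteins are expected to collect more chance matches.
        lam = n_random_hits * lengths[name] / total_len
        p_adj = min(1.0, poisson_tail(k, lam) * n_tests)
        if p_adj <= alpha:
            accepted[name] = p_adj
    return accepted


# Toy database: two short and two long proteins (made-up values).
lengths = {"A": 500, "B": 5000, "C": 500, "D": 4000}
counts = {"A": 4, "B": 4}
result = confident_proteins(counts, lengths, n_random_hits=10)
print(sorted(result))
```

In this toy example the short protein A and the long protein B both have four matched peptides, yet only A passes the threshold, because four chance matches are far more plausible on a protein ten times longer. This illustrates why a simple "two or more peptides" rule can admit many more proteins than a length-aware, multiplicity-corrected test.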
Pages: 333-338
Page count: 6