State of the Human Proteome in 2014/2015 As Viewed through PeptideAtlas: Enhancing Accuracy and Coverage through the AtlasProphet

Cited by: 59
Authors
Deutsch, Eric W. [1]
Sun, Zhi [1]
Campbell, David [1]
Kusebauch, Ulrike [1]
Chu, Caroline S. [1]
Mendoza, Luis [1]
Shteynberg, David [1]
Omenn, Gilbert S. [1, 2, 3, 4, 5]
Moritz, Robert L. [1]
Affiliations
[1] Inst Syst Biol, Seattle, WA 98109 USA
[2] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Dept Internal Med, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Human Genet, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Sch Publ Hlth, Ann Arbor, MI 48109 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
shotgun proteomics; tandem mass spectrometry; repositories; PeptideAtlas; Human Proteome Project; observed proteome; MASS-SPECTROMETRY; STATISTICAL-MODEL; TANDEM; DATABASE; IDENTIFICATION; PROTEINS; RESOURCE; BIOLOGY; SYSTEM; DRAFT;
DOI
10.1021/acs.jproteome.5b00500
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
The Human PeptideAtlas is a compendium of the highest quality peptide identifications from over 1000 shotgun mass spectrometry proteomics experiments collected from many different laboratories, all reanalyzed through a uniform processing pipeline. The latest 2015-03 build contains substantially more input data than past releases, is mapped to a recent version of our merged reference proteome, and uses improved informatics processing and the development of the AtlasProphet to provide the highest quality results. Within the set of ~20 000 neXtProt primary entries, 14 070 (70%) are confidently detected in the latest build, 5% are ambiguous, 9% are redundant, leaving the total percentage of proteins for which there are no mapping detections at just 16% (3166), all derived from over 133 million peptide-spectrum matches identifying more than 1 million distinct peptides using AtlasProphet to characterize and classify the protein matches. Improved handling for detection and presentation of single amino-acid variants (SAAVs) reveals the detection of 5326 uniquely mapping SAAVs across 2794 proteins. With such a large amount of data, the control of false positives is a challenge. We present the methodology and results for maintaining rigorous quality along with a discussion of the implications of the remaining sources of errors in the build.
Pages: 3461-3473
Page count: 13