Whole genome searching with shotgun proteomic data: Applications for genome annotation

Cited by: 23
Authors
Sevinsky, Joel R. [1]
Cargile, Benjamin J. [1]
Bunger, Maureen K. [1]
Meng, Fanyu [2]
Yates, Nathan A. [2]
Hendrickson, Ronald C. [2]
Stephenson, James L., Jr. [1]
Affiliations
[1] Res Triangle Inst, Mass Spectrometry Program, Durham, NC 27709 USA
[2] Merck & Co Inc, Merck Res Labs, Rahway, NJ 08854 USA
Keywords
isoelectric focusing; mass spectrometry; peptide identification; genome annotation; database searching
DOI
10.1021/pr070198n
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
High-throughput genome sequencing continues to accelerate the rate at which complete genomes are available for biological research. Many of these new genome sequences have little or no genome annotation currently available and hence rely upon computational predictions of protein coding genes. Evidence of translation from proteomic techniques could facilitate experimental validation of protein coding genes, but the techniques for whole genome searching with MS/MS data have not been adequately developed to date. Here we describe GENQUEST, a novel method using peptide isoelectric focusing and accurate mass to greatly reduce the peptide search space, making fast, accurate, and sensitive whole human genome searching possible on common desktop computers. In an initial experiment, almost all exonic peptides identified in a protein database search were identified when searching genomic sequence. Many peptides identified exclusively in the genome searches were incorrectly identified or could not be experimentally validated, highlighting the importance of orthogonal validation. Experimentally validated peptides exclusive to the genomic searches can be used to reannotate protein coding genes. GENQUEST represents an experimental tool that can be used by the proteomics community at large for validating computational approaches to genome annotation.
Pages: 80-88
Page count: 9
Related papers
25 items in total
[1]   ALTSCHUL SF, 1990, J MOL BIOL, V215, P3
[2]   CARGILE B, 2006, IN PRESS ELECTROPHOR
[3]   Cargile Benjamin J, 2005, J Biomol Tech, V16, P181
[4]   Gel based isoelectric focusing of peptides and the utility of isoelectric point in protein identification [J].
Cargile, BJ ;
Bundy, JL ;
Freeman, TW ;
Stephenson, JL .
JOURNAL OF PROTEOME RESEARCH, 2004, 3 (01) :112-119
[5]   Potential for false positive identifications from large databases through tandem mass spectrometry [J].
Cargile, BJ ;
Bundy, JL ;
Stephenson, JL .
JOURNAL OF PROTEOME RESEARCH, 2004, 3 (05) :1082-1085
[6]   Immobilized pH gradients as a first dimension in shotgun proteomics and analysis of the accuracy of pI predictability of peptides [J].
Cargile, BJ ;
Talley, DL ;
Stephenson, JL .
ELECTROPHORESIS, 2004, 25 (06) :936-945
[7]   Choudhary JS, 2001, PROTEOMICS, V1, P651, DOI 10.1002/1615-9861(200104)1:5<651::AID-PROT651>3.0.CO;2-N
[9]   Experiments in searching small proteins in unannotated large eukaryotic genomes [J].
Colinge, J ;
Cusin, I ;
Reffas, S ;
Mahé, E ;
Niknejad, A ;
Rey, PA ;
Mattou, H ;
Moniatte, M ;
Bougueleret, L .
JOURNAL OF PROTEOME RESEARCH, 2005, 4 (01) :167-174
[10]   TANDEM: matching proteins with tandem mass spectra [J].
Craig, R ;
Beavis, RC .
BIOINFORMATICS, 2004, 20 (09) :1466-1467