TagRecon: High-Throughput Mutation Identification through Sequence Tagging

Cited by: 84
Authors
Dasari, Surendra [1]
Chambers, Matthew C. [1]
Slebos, Robbert J. [2, 3]
Zimmerman, Lisa J. [3, 4]
Ham, Amy-Joan L. [3, 4]
Tabb, David L. [1, 3, 4, 5]
Affiliations
[1] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, Nashville, TN 37232 USA
[2] Vanderbilt Ingram Canc Ctr, Dept Canc Biol, Nashville, TN 37232 USA
[3] Vanderbilt Ingram Canc Ctr, Jim Ayers Inst Precanc Detect & Diag, Nashville, TN 37232 USA
[4] Vanderbilt Univ, Med Ctr, Dept Biochem, Nashville, TN 37232 USA
[5] Vanderbilt Univ, Med Ctr, Mass Spectrometry Res Ctr, Nashville, TN 37232 USA
Keywords
mutation; bioinformatics; hydroxyproline; sequence tagging; TANDEM MASS-SPECTRA; ELEVATED MUTANT FREQUENCIES; POSTTRANSLATIONAL MODIFICATIONS; PEPTIDE IDENTIFICATION; PROTEIN MODIFICATIONS; TRANSITION MUTATIONS; SHOTGUN PROTEOMICS; SPECTROMETRY; CANCER; ALGORITHM
DOI
10.1021/pr900850m
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Shotgun proteomics produces collections of tandem mass spectra that contain all the data needed to identify mutated peptides from clinical samples. Identifying these sequence variations, however, has not been feasible with conventional database search strategies, which require exact matches between observed and expected sequences. Searching for mutations as mass shifts on specified residues through database search can incur significant performance penalties and generate substantial false positive rates. Here we describe TagRecon, an algorithm that leverages inferred sequence tags to identify unanticipated mutations in clinical proteomic data sets. TagRecon identifies unmodified peptides as sensitively as the related MyriMatch database search engine. In both LTQ and Orbitrap data sets, TagRecon outperformed state-of-the-art software in recognizing sequence mismatches from data sets with known variants. We developed guidelines for filtering putative mutations from clinical samples, and we applied them in an analysis of cancer cell lines and an examination of colon tissue. Mutations were found in up to 6% of identified peptides, and only a small fraction corresponded to dbSNP entries. The RKO cell line, which is DNA mismatch repair-deficient, yielded more mutant peptides than the mismatch repair-proficient SW480 line. Analysis of colon cancer tumor and adjacent tissue revealed hydroxyproline modifications associated with extracellular matrix degradation. These results demonstrate the value of using sequence tagging algorithms to fully interrogate clinical proteomic data sets.
Pages: 1716-1726
Page count: 11
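Illustrative sketch
The abstract describes using an inferred sequence tag to anchor a spectrum to a database peptide and then accounting for an unanticipated mass mismatch as a mutation. The Python sketch below is one simplified way to picture that idea and is not TagRecon's actual implementation. It assumes a single amino acid substitution located outside the tag, standard monoisotopic residue masses, and neutral peptide masses; the names `peptide_mass` and `explain_mass_shift` and the 0.01 Da tolerance are hypothetical choices made for this example.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the summed residue masses


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def explain_mass_shift(peptide, tag, observed_mass, tol=0.01):
    """List single substitutions outside the tag that explain the mass mismatch.

    Returns (position, reference_residue, substituted_residue) tuples,
    with 0-based positions. Hypothetical helper for illustration only.
    """
    start = peptide.find(tag)
    if start < 0:
        return []  # the tag does not anchor this candidate peptide
    tag_positions = range(start, start + len(tag))

    delta = observed_mass - peptide_mass(peptide)
    if abs(delta) <= tol:
        return []  # masses already agree, so there is no shift to explain

    hits = []
    for i, ref in enumerate(peptide):
        if i in tag_positions:
            continue  # tag residues were matched directly from the spectrum
        for alt, alt_mass in MONO.items():
            if alt != ref and abs((alt_mass - MONO[ref]) - delta) <= tol:
                hits.append((i, ref, alt))
    return hits


if __name__ == "__main__":
    # Simulate a spectrum from the variant SVMPLER searched against SAMPLER.
    observed = peptide_mass("SVMPLER")
    print(explain_mass_shift("SAMPLER", "PLE", observed))
```

Because leucine and isoleucine share a mass, a shift that matches one of them is reported for both, and a real search would also have to weigh post-translational modifications with similar mass shifts, which this sketch ignores.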