An Integrated Mass-Spectrometry Pipeline Identifies Novel Protein Coding-Regions in the Human Genome

Cited by: 26
Authors
Bitton, Danny A. [1]
Smith, Duncan L. [2]
Connolly, Yvonne [2]
Scutt, Paul J. [1]
Miller, Crispin J. [1]
Affiliations
[1] Univ Manchester, Paterson Inst Canc Res, Canc Res UK, Appl Computat Biol & Bioinformat Grp, Manchester, Lancs, England
[2] Univ Manchester, Paterson Inst Canc Res, Canc Res UK, Biol Mass Spectrometry Facil, Manchester, Lancs, England
Source
PLOS ONE | 2010 / Volume 5 / Issue 01
Keywords
FALSE DISCOVERY RATES; TILING ARRAYS; GENE; ANNOTATION; SEARCH; DATABASES; PEPTIDES; SEQUENCE; MODEL; PROTEOMICS;
DOI
10.1371/journal.pone.0008949
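Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09
Abstract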
Background: Most protein mass spectrometry (MS) experiments rely on searches against a database of known or predicted proteins, limiting their ability as a gene discovery tool. Results: Using a search against an in silico translation of the entire human genome, combined with a series of annotation filters, we identified 346 putative novel peptides [False Discovery Rate (FDR), <5%] in an MS dataset derived from two human breast epithelial cell lines. A subset of these were then successfully validated by a different MS technique. Two of these correspond to novel isoforms of Heterogeneous Ribonuclear Proteins, while the rest correspond to novel loci. Conclusions: MS technology can be used for ab initio gene discovery in human data, which, since it is based on different underlying assumptions, identifies protein-coding genes not found by other techniques. As MS technology continues to evolve, such approaches will become increasingly powerful.
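Illustrative note: the abstract does not say how the in silico translation of the genome was produced. One common way to build such a search space is six-frame translation, in which a DNA sequence is translated in three reading frames on each strand. The sketch below is a minimal example of that general idea under the standard genetic code; it is not the authors' pipeline, and the function names (`reverse_complement`, `translate`, `six_frame_translation`) and the frame labels are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with non-ACGT characters become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate a DNA string in all six frames ('+1'..'+3', '-1'..'-3')."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGGCCTAA").items():
        print(label, protein)
```

In a search of this kind, the translated frames would typically be split at stop codons (`*`) and digested in silico into peptides before being matched against spectra; those steps are omitted here.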
Pages: 10