Identifying protein-coding genes in genomic sequences

Cited by: 61
Authors
Harrow, Jennifer [2]
Nagy, Alinda [3]
Reymond, Alexandre [4]
Alioto, Tyler [1]
Patthy, Laszlo [3]
Antonarakis, Stylianos E. [5, 6]
Guigo, Roderic [1]
Affiliations
[1] Univ Pompeu Fabra, Inst Municipal Invest Med, Ctr Regulacio Genom, E-08003 Barcelona, Catalonia, Spain
[2] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[3] Hungarian Acad Sci, Biol Res Ctr, Inst Enzymol, H-1113 Budapest, Hungary
[4] Univ Lausanne, Ctr Integrat Genom, CH-1015 Lausanne, Switzerland
[5] Univ Geneva, Sch Med, Dept Genet Med & Dev, CH-1211 Geneva, Switzerland
[6] Univ Hosp Geneva, CH-1211 Geneva, Switzerland
Source
GENOME BIOLOGY | 2009 / Volume 10 / Issue 01
Funding
Wellcome Trust (UK);
Keywords
TRANSCRIPTIONAL LANDSCAPE; ENCODE REGIONS; RNAS; IDENTIFICATION; ANNOTATION; PREDICTION; DISCOVERY; NUMBER; MAPS; VIEW;
DOI
10.1186/gb-2009-10-1-201
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Pages: 8