Differentiating Protein-Coding and Noncoding RNA: Challenges and Ambiguities

Cited by: 404
Authors
Dinger, Marcel E. [1]
Pang, Ken C. [2]
Mercer, Tim R. [1]
Mattick, John S. [1]
Affiliations
[1] Univ Queensland, ARC Special Res Ctr Funct & Appl Genom, Inst Mol Biosci, St Lucia, Qld, Australia
[2] Melbourne Ctr Clin Sci, Ludwig Inst Canc Res, Cell Lab 2T, Heidelberg, Vic, Australia
Funding
Australian Research Council
DOI
10.1371/journal.pcbi.1000176
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The assumption that RNA can be readily classified into either protein-coding or non-protein coding categories has pervaded biology for close to 50 years. Until recently, discrimination between these two categories was relatively straightforward: most transcripts were clearly identifiable as protein-coding messenger RNAs (mRNAs), and readily distinguished from the small number of well-characterized non-protein-coding RNAs (ncRNAs), such as transfer, ribosomal, and spliceosomal RNAs. Recent genome-wide studies have revealed the existence of thousands of noncoding transcripts, whose function and significance are unclear. The discovery of this hidden transcriptome and the implicit challenge it presents to our understanding of the expression and regulation of genetic information has made the need to distinguish between mRNAs and ncRNAs both more pressing and more complicated. In this Review, we consider the diverse strategies employed to discriminate between protein-coding and noncoding transcripts and the fundamental difficulties that are inherent in what may superficially appear to be a simple problem. Misannotations can also run in both directions: some ncRNAs may actually encode peptides, and some of those currently thought to do so may not. Moreover, recent studies have shown that some RNAs can function both as mRNAs and intrinsically as functional ncRNAs, which may be a relatively widespread phenomenon. We conclude that it is difficult to annotate an RNA unequivocally as protein-coding or noncoding, with overlapping protein-coding and noncoding transcripts further confounding this distinction. In addition, the finding that some transcripts can function both intrinsically at the RNA level and to encode proteins suggests a false dichotomy between mRNAs and ncRNAs. Therefore, the functionality of any transcript at the RNA level should not be discounted.
Pages: 5