Identification and classification of conserved RNA secondary structures in the human genome

Cited by: 374
Authors
Pedersen, Jakob Skou [1]
Bejerano, Gill
Siepel, Adam
Rosenbloom, Kate
Lindblad-Toh, Kerstin
Lander, Eric S.
Kent, Jim
Miller, Webb
Haussler, David
Affiliations
[1] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
[2] MIT, Broad Inst, Cambridge, MA 02139 USA
[3] Harvard Univ, Cambridge, MA 02138 USA
[4] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA
[5] Univ Calif Santa Cruz, Howard Hughes Med Inst, Santa Cruz, CA 95064 USA
DOI
10.1371/journal.pcbi.0020033
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3'UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization.
Pages: 251-262
Number of pages: 12
Related papers
67 in total
[1]   NSD3, a new SET domain-containing gene, maps to 8p12 and is amplified in human breast cancer cell lines [J].
Angrand, PO ;
Apiou, F ;
Stewart, AF ;
Dutrillaux, B ;
Losson, R ;
Chambon, P .
GENOMICS, 2001, 74 (01) :79-88
[2]   Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes [J].
Aparicio, S ;
Chapman, J ;
Stupka, E ;
Putnam, N ;
Chia, J ;
Dehal, P ;
Christoffels, A ;
Rash, S ;
Hoon, S ;
Smit, A ;
Gelpke, MDS ;
Roach, J ;
Oh, T ;
Ho, IY ;
Wong, M ;
Detter, C ;
Verhoef, F ;
Predki, P ;
Tay, A ;
Lucas, S ;
Richardson, P ;
Smith, SF ;
Clark, MS ;
Edwards, YJK ;
Doggett, N ;
Zharkikh, A ;
Tavtigian, SV ;
Pruss, D ;
Barnstead, M ;
Evans, C ;
Baden, H ;
Powell, J ;
Glusman, G ;
Rowen, L ;
Hood, L ;
Tan, YH ;
Elgar, G ;
Hawkins, T ;
Venkatesh, B ;
Rokhsar, D ;
Brenner, S .
SCIENCE, 2002, 297 (5585) :1301-1310
[3]   A systematic search for new mammalian noncoding RNAs indicates little conserved intergenic transcription [J].
Babak, T ;
Blencowe, BJ ;
Hughes, TR .
BMC GENOMICS, 2005, 6 (1)
[4]   Into the heart of darkness: large-scale clustering of human non-coding DNA [J].
Bejerano, Gill ;
Haussler, David ;
Blanchette, Mathieu .
BIOINFORMATICS, 2004, 20 :40-48
[5]   Identification of hundreds of conserved and nonconserved human microRNAs [J].
Bentwich, I ;
Avniel, A ;
Karov, Y ;
Aharonov, R ;
Gilad, S ;
Barad, O ;
Barzilai, A ;
Einat, P ;
Einav, U ;
Meiri, E ;
Sharon, E ;
Spector, Y ;
Bentwich, Z .
NATURE GENETICS, 2005, 37 (07) :766-770
[6]   Phylogenetic shadowing and computational identification of human microRNA genes [J].
Berezikov, E ;
Guryev, V ;
van de Belt, J ;
Wienholds, E ;
Plasterk, RHA ;
Cuppen, E .
CELL, 2005, 120 (01) :21-24
[7]   RECOGNITION OF UGA AS A SELENOCYSTEINE CODON IN TYPE-I DEIODINASE REQUIRES SEQUENCES IN THE 3' UNTRANSLATED REGION [J].
BERRY, MJ ;
BANU, L ;
CHEN, Y ;
MANDEL, SJ ;
KIEFFER, JD ;
HARNEY, JW ;
LARSEN, PR .
NATURE, 1991, 353 (6341) :273-276
[8]   Aligning multiple genomic sequences with the threaded blockset aligner [J].
Blanchette, M ;
Kent, WJ ;
Riemer, C ;
Elnitski, L ;
Smit, AFA ;
Roskin, KM ;
Baertsch, R ;
Rosenbloom, K ;
Clawson, H ;
Green, ED ;
Haussler, D ;
Miller, W .
GENOME RESEARCH, 2004, 14 (04) :708-715
[9]   The contribution of RNAs and retroposition to evolutionary novelties [J].
Brosius, J .
GENETICA, 2003, 118 (2-3) :99-116
[10]   LAGAN and Multi-LAGAN: Efficient tools for large-scale multiple alignment of genomic DNA [J].
Brudno, M ;
Do, CB ;
Cooper, GM ;
Kim, MF ;
Davydov, E ;
Green, ED ;
Sidow, A ;
Batzoglou, S .
GENOME RESEARCH, 2003, 13 (04) :721-731