Fast and reliable prediction of noncoding RNAs

Cited by: 470
Authors
Washietl, S
Hofacker, IL
Stadler, PF
Affiliations
[1] Univ Vienna, Dept Theoret Chem & Struct Biol, A-1090 Vienna, Austria
[2] Univ Leipzig, Dept Comp Sci, Bioinformat Grp, D-04107 Leipzig, Germany
[3] Univ Leipzig, Interdisciplinary Ctr Bioinformat, D-04107 Leipzig, Germany
Keywords
comparative genomics; conserved RNA secondary structure
DOI
10.1073/pnas.0409169102
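Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract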
We report an efficient method for detecting functional RNAs. The approach, which combines comparative sequence analysis and structure prediction, already has yielded excellent results for a small number of aligned sequences and is suitable for large-scale genomic screens. It consists of two basic components: (i) a measure for RNA secondary structure conservation based on computing a consensus secondary structure, and (ii) a measure for thermodynamic stability, which, in the spirit of a z score, is normalized with respect to both sequence length and base composition but can be calculated without sampling from shuffled sequences. Functional RNA secondary structures can be identified in multiple sequence alignments with high sensitivity and high specificity. We demonstrate that this approach is not only much more accurate than previous methods but also significantly faster. The method is implemented in the program RNAz, which can be downloaded from www.tbi.univie.ac.at/~wash/RNAz. We screened all alignments of length n ≥ 50 in the Comparative Regulatory Genomics database, which compiles conserved noncoding elements in upstream regions of orthologous genes from human, mouse, rat, Fugu, and zebrafish. We recovered all of the known noncoding RNAs and cis-acting elements with high significance and found compelling evidence for many other conserved RNA secondary structures not described so far to our knowledge.
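The two components named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption: the stability measure is written as a standard z score, and the conservation measure is written as the ratio of a consensus folding energy to the mean folding energy of the individual sequences. The function names `z_score` and `structure_conservation_index` are hypothetical, and the energies, background mean, and background standard deviation are taken as plain inputs. In particular, the sketch does not reproduce how the method obtains the background statistics without shuffling.

```python
from statistics import mean


def z_score(mfe, background_mean, background_std):
    """Standard z score of a folding energy against background statistics.

    Assumption: background_mean and background_std describe the energies of
    random sequences with the same length and base composition. How they are
    obtained is outside the scope of this sketch.
    """
    if background_std <= 0:
        raise ValueError("background_std must be positive")
    return (mfe - background_mean) / background_std


def structure_conservation_index(consensus_energy, single_mfes):
    """Ratio of a consensus folding energy to the mean single-sequence energy.

    Assumption: this ratio is one plausible conservation measure built on a
    consensus structure; values near 1 suggest the sequences share the fold.
    """
    if not single_mfes:
        raise ValueError("single_mfes must not be empty")
    average = mean(single_mfes)
    if average == 0:
        raise ValueError("mean single-sequence energy must be nonzero")
    return consensus_energy / average


if __name__ == "__main__":
    # Illustrative numbers only (kcal/mol); they are not taken from the paper.
    print(z_score(-30.0, -20.0, 5.0))
    print(structure_conservation_index(-18.0, [-20.0, -25.0, -15.0]))
```

A strongly negative z score means the sequence folds more stably than the assumed background, and a conservation ratio close to 1 means the consensus structure is nearly as stable as the individual folds.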
Pages: 2454-2459
Number of pages: 6