Searching for RNA genes using base-composition statistics

Cited by: 79
Author
Schattner, P [1]
Affiliation
[1] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Sinsheimer Labs 227, Santa Cruz, CA 95064 USA
DOI
10.1093/nar/30.9.2076
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
The hypothesis that genomic regions rich in non-protein-coding RNAs (ncRNAs) can be identified using local variations in single-base and dinucleotide statistics has been investigated. (G+C)%, (G-C)% difference, (A-T)% difference and dinucleotide-frequency statistics were compared among seven classes of ncRNAs and three genomes. Significant variations were observed in (G+C)% and, in Methanococcus jannaschii, in the frequency of the dinucleotide 'CG'. Screening programs based on these two base-composition statistics were developed. With (G+C)% screening alone, a 1% fraction of the M.jannaschii genome containing all 44 known transfer RNAs, ribosomal RNAs and signal recognition particle RNAs could be identified. When (G+C)% combined with CG dinucleotide-frequency screening was used, 43 of the 44 known M.jannaschii structural ncRNAs were again identified, while the number of presumably false hits overlapping a known or putative protein-coding gene was reduced from 15 to 6. In addition, 19 candidate ncRNAs were identified including one with significant homology to several known archaeal RNaseP RNAs.
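The two screening statistics named in the abstract, (G+C)% and the frequency of the dinucleotide 'CG', can be illustrated with a minimal sliding-window sketch. This is not the author's program: the function names, the window size, the step, and both thresholds are hypothetical values chosen only for illustration, since the abstract does not state the parameters that were actually used.

```python
from typing import List, Tuple


def gc_percent(seq: str) -> float:
    """Return (G+C)% of a sequence, or 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def cg_dinucleotide_freq(seq: str) -> float:
    """Return the fraction of overlapping dinucleotide positions that read 'CG'."""
    s = seq.upper()
    n = len(s) - 1  # number of overlapping dinucleotide positions
    if n <= 0:
        return 0.0
    hits = sum(1 for i in range(n) if s[i:i + 2] == "CG")
    return hits / n


def screen_windows(
    genome: str,
    window: int = 100,     # assumed window length, not taken from the paper
    step: int = 50,        # assumed step between window starts
    min_gc: float = 50.0,  # assumed (G+C)% threshold
    min_cg: float = 0.0,   # assumed 'CG' frequency threshold (0.0 disables it)
) -> List[Tuple[int, int, float, float]]:
    """Return (start, end, gc_percent, cg_freq) for every window passing both thresholds."""
    hits = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        win = genome[start:start + window]
        gc = gc_percent(win)
        cg = cg_dinucleotide_freq(win)
        if gc >= min_gc and cg >= min_cg:
            hits.append((start, start + len(win), gc, cg))
    return hits
```

Calling `screen_windows(genome, min_gc=60.0)` screens on (G+C)% alone, while raising `min_cg` above zero adds the dinucleotide filter, mirroring the combined screen described in the abstract. Overlapping hits would still need to be merged and compared against gene annotations, which this sketch leaves out.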
Pages: 2076-2082
Number of pages: 7