CORRELATION APPROACH TO IDENTIFY CODING REGIONS IN DNA-SEQUENCES

Cited by: 150
Authors
OSSADNIK, SM
BULDYREV, SV
GOLDBERGER, AL
HAVLIN, S
MANTEGNA, RN
PENG, CK
SIMONS, M
STANLEY, HE
Affiliations
[1] BOSTON UNIV, CTR POLYMER STUDIES, DEPT PHYS, BOSTON, MA 02215 USA
[2] BOSTON UNIV, DEPT PHYS, BOSTON, MA 02215 USA
[3] HARVARD UNIV, BETH ISRAEL HOSP, SCH MED, DIV CARDIOVASC, BOSTON, MA 02215 USA
[4] BAR ILAN UNIV, DEPT PHYS, RAMAT GAN, ISRAEL
[5] MIT, DEPT BIOL, CAMBRIDGE, MA 02139 USA
Funding
National Aeronautics and Space Administration (NASA); National Institutes of Health (NIH)
DOI
10.1016/S0006-3495(94)80455-2
Chinese Library Classification number
Q6 [Biophysics]
Discipline classification code
071011
Abstract
Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
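The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: estimating a correlation exponent in sliding windows along a sequence. Everything specific here is an assumption made for illustration, not the authors' method: the purine/pyrimidine walk rule (+1 for C/T, -1 for A/G), the distances in `lengths`, the `window` and `step` sizes, and all function names. For an uncorrelated sequence the fitted exponent is expected to be close to 0.5; values well above 0.5 indicate persistent, longer-range correlations.

```python
import math


def dna_walk(seq):
    """Cumulative walk over a sequence (assumed rule: C/T -> +1, A/G -> -1)."""
    y, walk = 0, [0]
    for base in seq.upper():
        if base in "CT":
            y += 1
        elif base in "AG":
            y -= 1
        else:
            continue  # skip ambiguous symbols such as N
        walk.append(y)
    return walk


def fluctuation(walk, l):
    """Standard deviation of the walk displacement over distance l."""
    d = [walk[i + l] - walk[i] for i in range(len(walk) - l)]
    mean = sum(d) / len(d)
    var = sum((x - mean) ** 2 for x in d) / len(d)
    return math.sqrt(var)


def scaling_exponent(seq, lengths=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(l) versus log l."""
    walk = dna_walk(seq)
    xs, ys = [], []
    for l in lengths:
        f = fluctuation(walk, l)
        if f == 0:
            raise ValueError("zero fluctuation; exponent is undefined")
        xs.append(math.log(l))
        ys.append(math.log(f))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def window_exponents(seq, window=800, step=400):
    """Exponent for each sliding window, as (start position, exponent)."""
    return [
        (start, scaling_exponent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

In a sketch like this, windows whose exponent stays near 0.5 would be the candidates to inspect as possible coding regions, and any cutoff used for that decision would have to be calibrated on annotated sequences.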
Pages: 64-70
Number of pages: 7
Related papers
21 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]   GLOBAL FRACTAL DIMENSION OF HUMAN DNA-SEQUENCES TREATED AS PSEUDORANDOM WALKS [J].
BERTHELSEN, CL ;
GLAZIER, JA ;
SKOLNICK, MH .
PHYSICAL REVIEW A, 1992, 45 (12) :8902-8913
[3]   FRACTAL LANDSCAPES AND MOLECULAR EVOLUTION - MODELING THE MYOSIN HEAVY-CHAIN GENE FAMILY [J].
BULDYREV, SV ;
GOLDBERGER, AL ;
HAVLIN, S ;
PENG, CK ;
STANLEY, HE ;
STANLEY, MHR ;
SIMONS, M .
BIOPHYSICAL JOURNAL, 1993, 65 (06) :2673-2679
[4]   GENERALIZED LEVY-WALK MODEL FOR DNA NUCLEOTIDE-SEQUENCES [J].
BULDYREV, SV ;
GOLDBERGER, AL ;
HAVLIN, S ;
PENG, CK ;
SIMONS, M ;
STANLEY, HE .
PHYSICAL REVIEW E, 1993, 47 (06) :4514-4523
[5]   LONG-RANGE CORRELATIONS IN DNA [J].
CHATZIDIMITRIOUDREISMANN, CA ;
LARHAMMAR, D .
NATURE, 1993, 361 (6409) :212-213
[6]   CRUMPLED GLOBULE MODEL OF THE 3-DIMENSIONAL STRUCTURE OF DNA [J].
GROSBERG, A ;
RABIN, Y ;
HAVLIN, S ;
NEER, A .
EUROPHYSICS LETTERS, 1993, 23 (05) :373-378
[7]   PATCHINESS AND CORRELATIONS IN DNA-SEQUENCES [J].
KARLIN, S ;
BRENDEL, V .
SCIENCE, 1993, 259 (5095) :677-680
[8]   LONG-RANGE CORRELATION AND PARTIAL 1/F-ALPHA SPECTRUM IN A NONCODING DNA-SEQUENCE [J].
LI, W ;
KANEKO, K .
EUROPHYSICS LETTERS, 1992, 17 (07) :655-660
[9]  
MONTROLL EW, 1984, NONEQUILIBRIUM PHENO, V2, P1
[10]   DNA CORRELATIONS [J].
MUNSON, PJ ;
TAYLOR, RC ;
MICHAELS, GS .
NATURE, 1992, 360 (6405) :636-636