Scaling features of noncoding DNA

Cited by: 95
Authors
Stanley, HE [1]
Buldyrev, SV
Goldberger, AL
Havlin, S
Peng, CK
Simons, M
Affiliations
[1] Boston Univ, Ctr Polymer Studies, Boston, MA 02215 USA
[2] Boston Univ, Dept Phys, Boston, MA 02215 USA
[3] Harvard Univ, Beth Israel Hosp, Sch Med, Div Cardiovasc, Boston, MA 02215 USA
[4] Bar Ilan Univ, Dept Phys, Ramat Gan, Israel
[5] Boston Univ, Dept Biomed Engn, Boston, MA 02215 USA
Source
PHYSICA A | 1999 / Vol. 273 / Issue 1-2
Funding
US National Science Foundation; US National Institutes of Health; US National Aeronautics and Space Administration
DOI
10.1016/S0378-4371(99)00407-0
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range: bases separated by thousands of base pairs are correlated. We do not find such a long-range correlation in the coding regions of the gene, and we use this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we briefly describe recent work that adapts to DNA the Zipf approach to analyzing linguistic texts and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and that reports that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences, as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions than for coding regions. (C) 1999 Elsevier Science B.V. All rights reserved.
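To make the last point of the abstract concrete, here is a minimal Python sketch of a purine-pyrimidine binary mapping followed by a block-entropy redundancy estimate. It is an illustration only, not the authors' implementation: the function names (`to_binary`, `block_entropy`, `redundancy`) are hypothetical, and the definition R(n) = 1 - H(n) / (n log2 k), with H(n) the Shannon entropy of overlapping n-symbol words and k the alphabet size, is assumed as one common convention rather than taken from the record.

```python
from collections import Counter
from math import log2

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def to_binary(seq):
    """Map purines (A, G) to '1' and pyrimidines (C, T) to '0'."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("1")
        elif base in PYRIMIDINES:
            out.append("0")
        else:
            raise ValueError(f"unexpected symbol: {base!r}")
    return "".join(out)


def block_entropy(symbols, n):
    """Plug-in Shannon entropy (in bits) of overlapping n-symbol words."""
    if n < 1 or len(symbols) < n:
        raise ValueError("sequence is shorter than the word length n")
    counts = Counter(symbols[i:i + n] for i in range(len(symbols) - n + 1))
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def redundancy(symbols, n, alphabet_size=2):
    """Assumed definition: 1 - H(n) / (n * log2(alphabet_size))."""
    return 1.0 - block_entropy(symbols, n) / (n * log2(alphabet_size))


if __name__ == "__main__":
    dna = "ATGGCGTACGTTAGCCGATAAGCTTGCA"  # made-up example sequence
    binary = to_binary(dna)
    for n in (1, 2, 3):
        print(n, round(redundancy(binary, n), 4))
```

Because A and G map to the same symbol, as do C and T, the binary sequence carries no information about how much of the sequence is C or G versus A or T, which is the sense in which this mapping is insensitive to CG concentration. The plug-in entropy estimate is biased downward for short sequences, so a comparison between coding and noncoding regions would need sequences that are long relative to 2^n.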
Pages: 1-18
Page count: 18
Related papers
50 in total
[11]   RANDOM 2 COMPONENT 1 DIMENSIONAL ISING-MODEL FOR HETEROPOLYMER MELTING [J].
AZBEL, MY .
PHYSICAL REVIEW LETTERS, 1973, 31 (09) :589-592
[12]   GLOBAL FRACTAL DIMENSION OF HUMAN DNA-SEQUENCES TREATED AS PSEUDORANDOM WALKS [J].
BERTHELSEN, CL ;
GLAZIER, JA ;
SKOLNICK, MH .
PHYSICAL REVIEW A, 1992, 45 (12) :8902-8913
[13]   No signs of hidden language in noncoding DNA [J].
Bonhoeffer, S ;
Herz, AVM ;
Boerlijst, MC ;
Nee, S ;
Nowak, MA ;
May, RM .
PHYSICAL REVIEW LETTERS, 1996, 76 (11) :1977-1977
[14]  
BRILLOUIN L, 1956, SCI INFORMATION THEO
[15]   FRACTAL LANDSCAPES AND MOLECULAR EVOLUTION - MODELING THE MYOSIN HEAVY-CHAIN GENE FAMILY [J].
BULDYREV, SV ;
GOLDBERGER, AL ;
HAVLIN, S ;
PENG, CK ;
STANLEY, HE ;
STANLEY, MHR ;
SIMONS, M .
BIOPHYSICAL JOURNAL, 1993, 65 (06) :2673-2679
[16]   Analysis of DNA sequences using methods of statistical physics [J].
Buldyrev, SV ;
Dokholyan, NV ;
Goldberger, AL ;
Havlin, S ;
Peng, CK ;
Stanley, HE ;
Viswanathan, GM .
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 1998, 249 (1-4) :430-438
[17]   Expansion of tandem repeats and oligomer clustering in coding and noncoding DNA sequences [J].
Buldyrev, SV ;
Dokholyan, NV ;
Havlin, S ;
Stanley, HE ;
Stanley, RHR .
PHYSICA A, 1999, 273 (1-2) :19-32
[18]   LONG-RANGE CORRELATION-PROPERTIES OF CODING AND NONCODING DNA-SEQUENCES - GENBANK ANALYSIS [J].
BULDYREV, SV ;
GOLDBERGER, AL ;
HAVLIN, S ;
MANTEGNA, RN ;
MATSA, ME ;
PENG, CK ;
SIMONS, M ;
STANLEY, HE .
PHYSICAL REVIEW E, 1995, 51 (05) :5084-5091
[19]  
Bunde A., 1991, FRACTALS DISORDERED
[20]   Organization for physiological homeostasis [J].
Cannon, WB .
PHYSIOLOGICAL REVIEWS, 1929, 9 (03) :399-431