Long- and short-range correlations in genome organization

Cited by: 45
Authors
Almirantis, Y [1]
Provata, A [2]
Affiliations
[1] NRCPS Demokritos, Inst Biol, Athens 15310, Greece
[2] NRCPS Demokritos, Inst Phys Chem, Athens 15310, Greece
Keywords
power law distributions; long-range correlations; coding/non-coding DNA sequences; DNA strand partition;
DOI
10.1023/A:1004671119400
Chinese Library Classification
O4 [Physics];
Discipline classification code
0702;
Abstract
We study the size distribution of coding and non-coding regions in DNA sequences. For most organisms we observe that the size distribution $P_c(S)$ of coding regions of size $S$ is short-ranged, whereas the size distribution of the non-coding regions follows a power-law decay $P_{nc}(S) \sim S^{-1-\mu}$, with exponents indicating clear long-range behavior. We argue, using the Generalized Central Limit Theorem, that the long-range distributions observed in the non-coding regions are related to the lower-level clustering of purines and pyrimidines (1d islands), which follow similar long-range laws. We also address the question of clustering of coding segments in the two complementary strands of DNA. We observe a short-range clustering of coding regions in both strands, expressed by an exponential decay in the clustering size distribution. The decay exponent expresses the degree of short-range correlations and the deviation from random clustering.
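As an illustration of the power-law form quoted in the abstract, the following minimal sketch draws synthetic segment sizes with density proportional to S^(-1-mu) and recovers mu with a maximum-likelihood (Hill-type) estimator. It is not the authors' procedure and uses no genomic data. The function names `sample_power_law` and `estimate_mu`, the cutoff `s_min`, and the value mu = 1.5 are all assumptions chosen for the example.

```python
import math
import random

def sample_power_law(n, mu, s_min=1.0, seed=0):
    """Draw n synthetic sizes S >= s_min with density ~ S^(-1-mu) (inverse-CDF sampling)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the negative power never divides by zero.
    return [s_min * (1.0 - rng.random()) ** (-1.0 / mu) for _ in range(n)]

def estimate_mu(sizes, s_min=1.0):
    """Maximum-likelihood (Hill-type) estimate of mu for a continuous power-law tail."""
    tail = [s for s in sizes if s >= s_min]
    return len(tail) / sum(math.log(s / s_min) for s in tail)

# Hypothetical example: mu = 1.5 is an arbitrary choice, not a value from the paper.
sizes = sample_power_law(n=20000, mu=1.5)
mu_hat = estimate_mu(sizes)
print(f"estimated mu = {mu_hat:.2f}")
```

With real annotations, `sizes` would be replaced by the measured lengths of non-coding segments. A short-ranged (for example exponential) distribution would show no stable estimate as `s_min` is varied.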
Pages: 233-262
Page count: 30