GC composition of the human genome: In search of isochores

Cited by: 69
Authors
Cohen, N
Dagan, T
Stone, L
Graur, D [1]
Affiliations
[1] Univ Leeds, Sch Comp, Leeds, W Yorkshire, England
[2] Tel Aviv Univ, George S Wise Fac Life Sci, Dept Zool, Ramat Aviv, Israel
[3] Univ Houston, Dept Biol & Biochem, Houston, TX 77004 USA
Keywords
isochores; GC content; human genome; Jensen-Shannon; entropic divergence; genome organization
DOI
10.1093/molbev/msi115
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
The isochore theory, proposed nearly three decades ago, depicts the mammalian genome as a mosaic of long, fairly homogeneous genomic regions that are characterized by their guanine and cytosine (GC) content. The human genome, for instance, was claimed to consist of five distinct isochore families: L1, L2, H1, H2, and H3, with GC contents of < 37%, 37%-42%, 42%-47%, 47%-52%, and > 52%, respectively. In this paper, we address the question of the validity of the isochore theory through a rigorous sequence-based analysis of the human genome. Toward this end, we adopt a set of six attributes that are generally claimed to characterize isochores and statistically test their veracity against the available draft sequence of the complete human genome. By the selection criteria used in this study (distinctiveness, homogeneity, and a minimal length of 300 kb), we identify 1,857 genomic segments that warrant the label "isochore." These putative isochores are nonuniformly scattered throughout the genome and cover about 41% of the human genome. We found that a four-family model of putative isochores is the most parsimonious multi-Gaussian model that can be fitted to the empirical data. These families, however, are GC poor, with mean GC contents of 35%, 38%, 41%, and 48%, and do not resemble the five isochore families in the literature. Moreover, due to large overlaps among the families, it is impossible to classify genomic segments into isochore families reliably according to compositional properties alone. These findings undermine the utility of the isochore theory and seem to indicate that the theory may have reached the limits of its usefulness as a description of genomic compositional structures.
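As an illustration of the classical five-family scheme quoted in the abstract, the following minimal Python sketch computes the GC content of a sequence and maps it to a family by the stated thresholds. It is not the authors' method (the paper itself relies on a statistical segmentation and a multi-Gaussian fit). The function names `gc_content` and `classical_family` are hypothetical, and the handling of values that fall exactly on a boundary (assigned to the higher family here) is an assumption, since the abstract does not specify it.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


# Exclusive upper bounds for the classical families named in the abstract.
# Assumption: a value exactly on a boundary goes to the higher family.
FAMILY_BOUNDS = [(0.37, "L1"), (0.42, "L2"), (0.47, "H1"), (0.52, "H2")]


def classical_family(gc: float) -> str:
    """Map a GC fraction in [0, 1] to one of L1, L2, H1, H2, H3."""
    for upper, name in FAMILY_BOUNDS:
        if gc < upper:
            return name
    return "H3"


if __name__ == "__main__":
    segment = "ATGCGCATTA"  # toy sequence, not real genomic data
    gc = gc_content(segment)
    print(f"GC = {gc:.2f}, family = {classical_family(gc)}")
```

A threshold lookup of this kind assigns every segment to exactly one family, which is the step the abstract argues is unreliable in practice because the fitted family distributions overlap heavily.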
Pages: 1260-1272
Number of pages: 13