Distribution and intensity of constraint in mammalian genomic sequence

Cited by: 1022
Authors
Cooper, GM
Stone, EA
Asimenos, G
Green, ED
Batzoglou, S
Sidow, A [1]
Affiliations
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Pathol, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[5] NHGRI, Genome Technol Branch, NIH, Bethesda, MD 20892 USA
[6] NHGRI, NISC, NIH, Bethesda, MD 20892 USA
Keywords
DOI
10.1101/gr.3577405
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Comparisons of orthologous genomic DNA sequences can be used to characterize regions that have been subject to purifying selection and are enriched for functional elements. We here present the results of such an analysis on an alignment of sequences from 29 mammalian species. The alignment captures ~3.9 neutral substitutions per site and spans ~1.9 Mbp of the human genome. We identify constrained elements from 3 bp to over 1 kbp in length, covering ~5.5% of the human locus. Our estimate for the total amount of nonexonic constraint experienced by this locus is roughly twice that for exonic constraint. Constrained elements tend to cluster, and we identify large constrained regions that correspond well with known functional elements. While constraint density inversely correlates with mobile element density, we also show the presence of unambiguously constrained elements overlapping mammalian ancestral repeats. In addition, we describe a number of elements in this region that have undergone intense purifying selection throughout mammalian evolution, and we show that these important elements are more numerous than previously thought. These results were obtained with Genomic Evolutionary Rate Profiling (GERP), a statistically rigorous and biologically transparent framework for constrained element identification. GERP identifies regions at high resolution that exhibit nucleotide substitution deficits, and measures these deficits as "rejected substitutions." Rejected substitutions reflect the intensity of past purifying selection and are used to rank and characterize constrained elements. We anticipate that GERP and the types of analyses it facilitates will provide further insights and improved annotation for the human genome as mammalian genome sequence data become richer.
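The idea of a substitution deficit can be illustrated with a minimal sketch. It assumes that a per-column expected (neutral) and observed substitution count are already available, treats "rejected substitutions" as expected minus observed, and groups consecutive deficit columns into candidate elements of at least 3 columns. The function names, the toy numbers, and the simple run-merging rule are illustrative assumptions, not the published GERP implementation, which estimates rates on a phylogenetic tree and assesses significance statistically.

```python
from typing import List, Tuple


def rejected_substitutions(expected: List[float], observed: List[float]) -> List[float]:
    """Per-column deficit: expected neutral substitutions minus observed ones."""
    return [e - o for e, o in zip(expected, observed)]


def candidate_elements(rs: List[float], min_len: int = 3) -> List[Tuple[int, int, float]]:
    """Return (start, end, score) for maximal runs of columns with a positive deficit.

    `end` is exclusive; `score` is the summed deficit over the run.
    Runs shorter than `min_len` columns are discarded (hypothetical threshold).
    """
    elements = []
    start = None
    for i, value in enumerate(rs):
        if value > 0:
            if start is None:
                start = i  # a new run of deficit columns begins here
        else:
            if start is not None and i - start >= min_len:
                elements.append((start, i, sum(rs[start:i])))
            start = None
    # Close a run that extends to the last column.
    if start is not None and len(rs) - start >= min_len:
        elements.append((start, len(rs), sum(rs[start:])))
    return elements


# Toy data (made up): 8 alignment columns, each with ~3.9 expected substitutions.
expected = [3.9] * 8
observed = [4.1, 0.5, 0.2, 1.0, 4.0, 3.5, 3.8, 4.2]

rs = rejected_substitutions(expected, observed)
elements = candidate_elements(rs)
# Rank candidates by summed deficit, largest first.
ranked = sorted(elements, key=lambda e: e[2], reverse=True)
for start, end, score in ranked:
    print(f"columns {start}-{end - 1}: score {score:.1f}")
```

In this toy input, columns 1 to 3 form the only run that is long enough to be kept; the two-column run at columns 5 and 6 falls below the assumed minimum length and is dropped.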
Pages: 901-913
Number of pages: 13
Related papers
64 in total
  • [1] Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes
    Aparicio, S
    Chapman, J
    Stupka, E
    Putnam, N
    Chia, J
    Dehal, P
    Christoffels, A
    Rash, S
    Hoon, S
    Smit, A
    Gelpke, MDS
    Roach, J
    Oh, T
    Ho, IY
    Wong, M
    Detter, C
    Verhoef, F
    Predki, P
    Tay, A
    Lucas, S
    Richardson, P
    Smith, SF
    Clark, MS
    Edwards, YJK
    Doggett, N
    Zharkikh, A
    Tavtigian, SV
    Pruss, D
    Barnstead, M
    Evans, C
    Baden, H
    Powell, J
    Glusman, G
    Rowen, L
    Hood, L
    Tan, YH
    Elgar, G
    Hawkins, T
    Venkatesh, B
    Rokhsar, D
    Brenner, S
    [J]. SCIENCE, 2002, 297 (5585) : 1301 - 1310
  • [2] Arnone MI, 1997, DEVELOPMENT, V124, P1851
  • [3] Ultraconserved elements in the human genome
    Bejerano, G
    Pheasant, M
    Makunin, I
    Stephen, S
    Kent, WJ
    Mattick, JS
    Haussler, D
    [J]. SCIENCE, 2004, 304 (5675) : 1321 - 1325
  • [4] Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome
    Berman, BP
    Nibu, Y
    Pfeiffer, BD
    Tomancak, P
    Celniker, SE
    Levine, M
    Rubin, GM
    Eisen, MB
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (02) : 757 - 762
  • [5] An intermediate grade of finished genomic sequence suitable for comparative analyses
    Blakesley, RW
    Hansen, NF
    Mullikin, JC
    Thomas, PJ
    McDowell, JC
    Maskeri, B
    Young, AC
    Benjamin, B
    Brooks, SY
    Coleman, BI
    Gupta, J
    Ho, SL
    Karlins, EM
    Maduro, QL
    Stantripop, S
    Tsurgeon, C
    Vogt, JL
    Walker, MA
    Masiello, CA
    Guan, XB
    Bouffard, GG
    Green, ED
    [J]. GENOME RESEARCH, 2004, 14 (11) : 2235 - 2244
  • [6] Phylogenetic shadowing of primate sequences to find functional regions of the human genome
    Boffelli, D
    McAuliffe, J
    Ovcharenko, D
    Lewis, KD
    Ovcharenko, I
    Pachter, L
    Rubin, EM
    [J]. SCIENCE, 2003, 299 (5611) : 1391 - 1394
  • [7] Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease
    Botstein, D
    Risch, N
    [J]. NATURE GENETICS, 2003, 33 (Suppl 3) : 228 - 237
  • [8] Mobile elements inserted in the distant past have taken on important functions
    Britten, RJ
    [J]. GENE, 1997, 205 (1-2) : 177 - 182
  • [9] LAGAN and Multi-LAGAN: Efficient tools for large-scale multiple alignment of genomic DNA
    Brudno, M
    Do, CB
    Cooper, GM
    Kim, MF
    Davydov, E
    Green, ED
    Sidow, A
    Batzoglou, S
    [J]. GENOME RESEARCH, 2003, 13 (04) : 721 - 731
  • [10] Automated whole-genome multiple alignment of rat, mouse, and human
    Brudno, M
    Poliakov, A
    Salamov, A
    Cooper, GM
    Sidow, A
    Rubin, EM
    Solovyev, V
    Batzoglou, S
    Dubchak, I
    [J]. GENOME RESEARCH, 2004, 14 (04) : 685 - 692