Position specific variation in the rate of evolution in transcription factor binding sites

Cited by: 121
Authors
Moses, AM
Chiang, DY
Kellis, M
Lander, ES
Eisen, MB [1]
Affiliations
[1] Univ Calif Berkeley, Grad Grp Biophys, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Mol & Cell Biol, Berkeley, CA 94720 USA
[3] MIT, Dept Comp Sci, Cambridge, MA 02139 USA
[4] MIT, Dept Biol, Cambridge, MA 02139 USA
[5] Whitehead MIT Ctr Genome Res, Cambridge, MA 02139 USA
[6] Ernest Orlando Lawrence Berkeley Natl Lab Berkele, Div Life Sci, Dept Genome Sci, Berkeley, CA 94720 USA
DOI
10.1186/1471-2148-3-19
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution.
Results: Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms.
Conclusion: As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA.
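The relationship the abstract describes, between degeneracy at a motif position and how fast that position evolves, can be illustrated with a small sketch. The code below is not the authors' method: it assumes a uniform background base frequency, uses per-column information content as an inverse measure of degeneracy, and uses the fraction of mismatching aligned pairs as a crude stand-in for an evolutionary rate. All sequences and function names are hypothetical toy values chosen for illustration.

```python
import math
from collections import Counter


def information_content(sites, background=None):
    """Per-column information content (bits) of aligned, equal-length sites.

    Low values mean a degenerate position; high values mean a specific one.
    The uniform background is an assumption of this sketch.
    """
    background = background or {base: 0.25 for base in "ACGT"}
    columns = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        ic = 0.0
        for base, n in counts.items():
            freq = n / total
            ic += freq * math.log2(freq / background[base])
        columns.append(ic)
    return columns


def substitution_fraction(ortholog_pairs):
    """Fraction of aligned site pairs that differ at each position.

    A naive proxy for a position-specific rate; it ignores phylogeny
    and multiple substitutions at the same position.
    """
    length = len(ortholog_pairs[0][0])
    diffs = [0] * length
    for site_a, site_b in ortholog_pairs:
        for i in range(length):
            if site_a[i] != site_b[i]:
                diffs[i] += 1
    return [d / len(ortholog_pairs) for d in diffs]


# Hypothetical toy data: four sites within one species, and four
# aligned site pairs between two species.
toy_sites = ["ACGT", "ACGA", "ACGC", "ACGG"]
toy_pairs = [("ACGT", "ACGA"), ("ACGA", "ACGA"),
             ("ACGC", "ACGG"), ("ACGG", "ACGG")]

ic = information_content(toy_sites)
rate = substitution_fraction(toy_pairs)
for pos, (bits, frac) in enumerate(zip(ic, rate)):
    print(f"position {pos}: information = {bits:.2f} bits, "
          f"differing pairs = {frac:.2f}")
```

In this toy data the first three positions are invariant and fully specific, while the last position is fully degenerate and is the only one that differs between species, which is the direction of the correlation the abstract reports.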
Pages: 13
Related papers
48 items in total
  • [1] DNA-binding specificity of Mcm1: Operator mutations that alter DNA-bending and transcriptional activities by a MADS box protein
    Acton, TB
    Zhong, HL
    Vershon, AK
    [J]. MOLECULAR AND CELLULAR BIOLOGY, 1997, 17 (04) : 1881 - 1889
  • [2] Is there a code for protein-DNA recognition? Probab(ilistical)ly ...
    Benos, PV
    Lapedes, AS
    Stormo, GD
    [J]. BIOESSAYS, 2002, 24 (05) : 466 - 475
  • [3] The Protein Data Bank
    Berman, HM
    Westbrook, J
    Feng, Z
    Gilliland, G
    Bhat, TN
    Weissig, H
    Shindyalov, IN
    Bourne, PE
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 235 - 242
  • [4] Algorithms for phylogenetic footprinting
    Blanchette, M
    Schwikowski, B
    Tompa, M
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2002, 9 (02) : 211 - 223
  • [5] Surveying Saccharomyces genomes to identify functional elements by comparative DNA sequence analysis
    Cliften, PF
    Hillier, LW
    Fulton, L
    Graves, T
    Miner, T
    Gish, WR
    Waterston, RH
    Johnston, M
    [J]. GENOME RESEARCH, 2001, 11 (07) : 1175 - 1186
  • [6] Evolution of transcription factor binding sites in mammalian gene regulatory regions: Conservation and turnover
    Dermitzakis, ET
    Clark, AG
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2002, 19 (07) : 1114 - 1121
  • [7] Durbin R., 1998, BIOL SEQUENCE ANAL P
  • [8] Cluster analysis and display of genome-wide expression patterns
    Eisen, MB
    Spellman, PT
    Brown, PO
    Botstein, D
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) : 14863 - 14868
  • [9] Distinguishing regulatory DNA from neutral sites
    Elnitski, L
    Hardison, RC
    Li, J
    Yang, S
    Kolbe, D
    Eswara, P
    O'Connor, MJ
    Schwartz, S
    Miller, W
    Chiaromonte, F
    [J]. GENOME RESEARCH, 2003, 13 (01) : 64 - 72
  • [10] Eskin Eleazar, 2002, Bioinformatics, V18 Suppl 1, pS354