De novo cis-regulatory module elicitation for eukaryotic genomes

Cited by: 92
Authors
Gupta, M [1]
Liu, JS [2]
Affiliations
[1] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[2] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
Keywords
evolutionary Monte Carlo; gene regulation; hidden Markov models; transcription factor binding sites
DOI
10.1073/pnas.0408743102
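Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract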
Transcription regulation is controlled by coordinated binding of one or more transcription factors in the promoter regions of genes. In many species, especially higher eukaryotes, transcription factor binding sites tend to occur as homotypic or heterotypic clusters, also known as cis-regulatory modules. The number of sites and distances between the sites, however, vary greatly in a module. We propose a statistical model to describe the underlying cluster structure as well as individual motif conservation and develop a Monte Carlo motif screening strategy for predicting novel regulatory modules in upstream sequences of coregulated genes. We demonstrate the power of the method with examples ranging from bacterial to insect and human genomes.
Pages: 7079-7084
Number of pages: 6
Related papers
30 in total
  • [1] Bailey T., 1994, P 2 INT C INT SYST M, P28
  • [2] Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome
    Berman, BP
    Nibu, Y
    Pfeiffer, BD
    Tomancak, P
    Celniker, SE
    Levine, M
    Rubin, GM
    Eisen, MB
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (02) : 757 - 762
  • [3] Building a dictionary for genomes: Identification of presumptive regulatory sites by statistical analysis
    Bussemaker, HJ
    Li, H
    Siggia, ED
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (18) : 10096 - 10100
  • [4] Regulatory element detection using correlation with expression
    Bussemaker, HJ
    Li, H
    Siggia, ED
    [J]. NATURE GENETICS, 2001, 27 (02) : 167 - 171
  • [5] Integrating regulatory motif discovery and genome-wide expression analysis
    Conlon, EM
    Liu, XS
    Lieb, JD
    Liu, JS
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (06) : 3339 - 3344
  • [6] Davidson E. H., 2001, Genomic regulatory systems: development and evolution
  • [7] Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences
    Frith, MC
    Spouge, JL
    Hansen, U
    Weng, ZP
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (14) : 3214 - 3224
  • [8] Discovery of conserved sequence patterns using a stochastic dictionary model
    Gupta, M
    Liu, JS
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2003, 98 (461) : 55 - 66
  • [9] Databases on transcriptional regulation: TRANSFAC, TRRD and COMPEL
    Heinemeyer, T
    Wingender, E
    Reuter, I
    Hermjakob, H
    Kel, AE
    Kel, OV
    Ignatieva, EV
    Ananko, EA
    Podkolodnaya, OA
    Kolpakov, FA
    Podkolodny, NL
    Kolchanov, NA
    [J]. NUCLEIC ACIDS RESEARCH, 1998, 26 (01) : 362 - 367
  • [10] COMPILATION AND ANALYSIS OF BACILLUS-SUBTILIS SIGMA(A)-DEPENDENT PROMOTER SEQUENCES - EVIDENCE FOR EXTENDED CONTACT BETWEEN RNA-POLYMERASE AND UPSTREAM PROMOTER DNA
    HELMANN, JD
    [J]. NUCLEIC ACIDS RESEARCH, 1995, 23 (13) : 2351 - 2360