Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences

Cited by: 88
Authors
Frith, MC
Spouge, JL
Hansen, U
Weng, ZP
Affiliations
[1] Boston Univ, Bioinformat Program, Boston, MA 02215 USA
[2] Boston Univ, Dept Biomed Engn, Boston, MA 02215 USA
[3] Boston Univ, Dept Biol, Boston, MA 02215 USA
[4] Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20894 USA
DOI
10.1093/nar/gkf438
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The human genome encodes the transcriptional control of its genes in clusters of cis-elements that constitute enhancers, silencers and promoter signals. The sequence motifs of individual cis-elements are usually too short and degenerate for confident detection. In most cases, the requirements for organization of cis-elements within these clusters are poorly understood. Therefore, we have developed a general method to detect local concentrations of cis-element motifs, using predetermined matrix representations of the cis-elements, and calculate the statistical significance of these motif clusters. The statistical significance calculation is highly accurate not only for idealized, pseudo-random DNA, but also for real human DNA. We use our method 'cluster of motifs E-value tool' (COMET) to make novel predictions concerning the regulation of genes by transcription factors associated with muscle. COMET performs comparably with two alternative state-of-the-art techniques, which are more complex and lack E-value calculations. Our statistical method enables us to clarify the major bottleneck in the hard problem of detecting cis-regulatory regions, which is that many known enhancers do not contain very significant clusters of the motif types that we search for. Thus, discovery of additional signals that belong to these regulatory regions will be the key to future progress.
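The general idea in the abstract, scoring a sequence against a predetermined motif matrix and then looking for a local concentration of hits, can be illustrated with a minimal sketch. This is not COMET's algorithm and it does not reproduce the E-value calculation; it assumes a simple log-odds matrix with a uniform pseudocount and a maximal-scoring-segment search. The function names (`log_odds_matrix`, `scan`, `best_cluster`) and the parameters `threshold` and `gap_penalty` are hypothetical, chosen only for illustration.

```python
import math

def log_odds_matrix(counts, background, pseudocount=1.0):
    """Turn per-position base counts into log-odds scores (illustrative only)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(background)
        matrix.append({
            base: math.log(((column.get(base, 0) + pseudocount) / total)
                           / background[base])
            for base in background
        })
    return matrix

def scan(sequence, matrix):
    """Score every window of the motif's width along one strand."""
    width = len(matrix)
    return [
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]

def best_cluster(scores, threshold, gap_penalty):
    """Find the segment maximizing (hit score above threshold) minus gap costs.

    Assumption: each position scoring above `threshold` contributes its excess,
    and every other position costs `gap_penalty`. Returns (score, start, end).
    """
    best = (0.0, None, None)
    running, seg_start = 0.0, 0
    for pos, score in enumerate(scores):
        if running <= 0:
            # Restart the candidate segment when the running total is exhausted.
            running, seg_start = 0.0, pos
        running += (score - threshold) if score > threshold else -gap_penalty
        if running > best[0]:
            best = (running, seg_start, pos)
    return best

# Toy example: a made-up TATA-like motif and a uniform background.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
matrix = log_odds_matrix(counts, background)
scores = scan("GGTATAGGTATAGGGGGGGGGGGG", matrix)
cluster = best_cluster(scores, threshold=3.0, gap_penalty=0.1)
```

In this toy run the two motif hits are close enough that the gap cost between them is smaller than the second hit's contribution, so they are reported as one segment. Assigning a statistical significance to such a segment is the part the paper addresses and is not attempted here.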
Pages: 3214-3224
Number of pages: 11